Data Types Reference
Detailed reference information for the Data Types API.
Field Validation
Type Identifier
| Field | Constraints |
|---|---|
| Length | 1-100 characters |
| Format | Uppercase letters, numbers, underscores only |
| Uniqueness | Must be unique across all data types |
| Examples | ROUTING_NUMBER, API_KEY, CUSTOM_DATE_OF_BIRTH |
Name
| Field | Constraints |
|---|---|
| Length | 1-100 characters |
| Format | Any printable characters |
Description
| Field | Constraints |
|---|---|
| Length | 0-128 characters |
| Format | Any printable characters |
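The constraints above can be checked on the client before a request is sent. The following is a minimal sketch, not Shield's own validation logic: the function name `validate_fields` is hypothetical, "printable" is assumed to mean what Python's `str.isprintable()` accepts, and uniqueness cannot be verified locally because it depends on the data types already stored on the server.

```python
import re

# Assumed reading of the Type Identifier format: A-Z, 0-9, underscore, 1-100 chars.
TYPE_ID_PATTERN = re.compile(r"[A-Z0-9_]{1,100}")


def validate_fields(type_id: str, name: str, description: str = "") -> list[str]:
    """Return a list of constraint violations (empty when all fields pass)."""
    errors = []
    if not TYPE_ID_PATTERN.fullmatch(type_id):
        errors.append("type: 1-100 uppercase letters, numbers, or underscores")
    if not (1 <= len(name) <= 100 and name.isprintable()):
        errors.append("name: 1-100 printable characters")
    if not (len(description) <= 128 and description.isprintable()):
        errors.append("description: 0-128 printable characters")
    return errors


print(validate_fields("ROUTING_NUMBER", "Routing Number"))  # []
print(validate_fields("routing-number", "Routing Number"))
```

The server remains the authority on what is accepted; a local check like this only catches obvious mistakes early.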
Regex Guidelines
When creating custom data types, keep the following regex guidelines in mind.
Best Practices
- Be specific - Avoid overly broad patterns that cause false positives
- Test thoroughly - Use regex testing tools (regex101.com) with sample data
- Use anchors - Consider word boundaries (\b) to avoid partial matches
- Capture groups - Use valueGroupIndex to extract the sensitive portion
Common Patterns
Fixed-length numeric:
Prefix with letters:
Delimited format:
With word boundaries:
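As an illustration of the four categories above, the sketch below runs one hypothetical pattern per category against a sample string. The patterns and the sample text are assumptions chosen for demonstration; they are not Shield's built-in definitions.

```python
import re

# Illustrative patterns only; none are taken from Shield's built-in definitions.
PATTERNS = {
    "fixed_length_numeric": r"\d{9}",
    "prefix_with_letters": r"EMP-\d{6}",
    "delimited_format": r"\d{3}-\d{2}-\d{4}",
    "with_word_boundaries": r"\b\d{9}\b",
}

sample = "Routing 021000021, badge EMP-123456, id 123-45-6789, order 1234567890123"

# Collect every match per category so the results can be compared side by side.
results = {label: re.findall(pattern, sample) for label, pattern in PATTERNS.items()}

for label, matches in results.items():
    print(label, matches)
```

The unanchored nine-digit pattern also matches the first nine digits of the 13-digit order number, while the version with word boundaries matches only the standalone nine-digit value.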
Content-Type Specific Patterns
JSON-Specific Pattern (JSON Query)
Detect SSN fields in JSON using a JSON query definition. The json field uses a custom query language, not JSONPath.
curl -X POST "https://your-shield-host:8080/api/datatypes" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"type": "SSN_JSON",
"name": "SSN in JSON",
"description": "Detect SSN fields in JSON content",
"isGroupDataType": false,
"regexes": [
{
"json": "{\"Search\":{\"Key\":{\"Regex\":\"(?i)ssn|social\"},\"Value\":{\"String\":{\"Regex\":\"\\\\d{3}-\\\\d{2}-\\\\d{4}\"}}}}",
"valueGroupIndex": 0
}
]
}'
import requests
import json
BASE_URL = "https://your-shield-host:8080"
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}
# JSON query definition: search keys like "ssn" or "social" whose values match the SSN format
json_query = {
"Search": {
"Key": {"Regex": "(?i)ssn|social"},
"Value": {"String": {"Regex": r"\d{3}-\d{2}-\d{4}"}}
}
}
datatype = {
"type": "SSN_JSON",
"name": "SSN in JSON",
"description": "Detect SSN fields in JSON content",
"isGroupDataType": False,
"regexes": [
{
"json": json.dumps(json_query),
"valueGroupIndex": 0
}
]
}
response = requests.post(f"{BASE_URL}/api/datatypes", headers=HEADERS, json=datatype)
print(f"Created data type: {response.json()['id']}")
const axios = require('axios');
const BASE_URL = 'https://your-shield-host:8080';
const HEADERS = { 'Authorization': 'Bearer YOUR_API_KEY' };
// JSON query definition: search keys like "ssn" or "social" whose values match the SSN format
const jsonQuery = {
Search: {
Key: { Regex: '(?i)ssn|social' },
Value: { String: { Regex: '\\d{3}-\\d{2}-\\d{4}' } }
}
};
const datatype = {
type: 'SSN_JSON',
name: 'SSN in JSON',
description: 'Detect SSN fields in JSON content',
isGroupDataType: false,
regexes: [
{
json: JSON.stringify(jsonQuery),
valueGroupIndex: 0
}
]
};
// await is only valid inside an async function when using require()
(async () => {
const response = await axios.post(`${BASE_URL}/api/datatypes`, datatype, { headers: HEADERS });
console.log(`Created data type: ${response.data.id}`);
})();
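Because the json field carries a query that is itself JSON, the query is encoded twice: once into a string, and again when the request body is serialized. That is why the regex in the curl example shows four backslashes before each d. The sketch below reproduces the two encoding steps with the standard library only; it does not call the API.

```python
import json

# Nested query object, mirroring the structure used in the curl example.
json_query = {
    "Search": {
        "Key": {"Regex": "(?i)ssn|social"},
        "Value": {"String": {"Regex": r"\d{3}-\d{2}-\d{4}"}},
    }
}

# First encoding: the query becomes a string, so each backslash is doubled.
inner = json.dumps(json_query, separators=(",", ":"))

# Second encoding: serializing the request body escapes that string again.
body = json.dumps({"json": inner, "valueGroupIndex": 0})

print(inner)
print(body)

# Decoding twice recovers the original query object.
assert json.loads(json.loads(body)["json"]) == json_query
```

Building the query as a native object and letting the serializer handle the escaping avoids counting backslashes by hand.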
HTML-Specific Pattern (XPath)
Detect email addresses in HTML anchor href attributes using XPath. The html field is an XPath expression, not regex.
curl -X POST "https://your-shield-host:8080/api/datatypes" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"type": "EMAIL_HTML",
"name": "Email in HTML Links",
"description": "Detect email addresses in href attributes",
"isGroupDataType": false,
"regexes": [
{
"html": "//a/@href",
"valueGroupIndex": 0
}
]
}'
import requests
BASE_URL = "https://your-shield-host:8080"
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}
datatype = {
"type": "EMAIL_HTML",
"name": "Email in HTML Links",
"description": "Detect email addresses in href attributes",
"isGroupDataType": False,
"regexes": [
{
"html": "//a/@href", # XPath: all href attributes in anchor tags
"valueGroupIndex": 0
}
]
}
response = requests.post(f"{BASE_URL}/api/datatypes", headers=HEADERS, json=datatype)
print(f"Created data type: {response.json()['id']}")
const axios = require('axios');
const BASE_URL = 'https://your-shield-host:8080';
const HEADERS = { 'Authorization': 'Bearer YOUR_API_KEY' };
const datatype = {
type: 'EMAIL_HTML',
name: 'Email in HTML Links',
description: 'Detect email addresses in href attributes',
isGroupDataType: false,
regexes: [
{
html: '//a/@href', // XPath: all href attributes in anchor tags
valueGroupIndex: 0
}
]
};
// await is only valid inside an async function when using require()
(async () => {
const response = await axios.post(`${BASE_URL}/api/datatypes`, datatype, { headers: HEADERS });
console.log(`Created data type: ${response.data.id}`);
})();
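To preview which values an expression such as //a/@href selects, the sketch below approximates it with Python's standard HTML parser. This is an assumption-laden stand-in, not Shield's XPath engine: the class and function names are hypothetical, and how Shield processes the selected values afterwards is not shown here.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect href values from <a> tags, approximating the XPath //a/@href."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Only anchor tags count; href attributes on other tags are ignored.
        if tag == "a":
            self.hrefs.extend(value for name, value in attrs if name == "href" and value)


def collect_hrefs(html: str) -> list[str]:
    parser = HrefCollector()
    parser.feed(html)
    parser.close()
    return parser.hrefs


sample_html = """
<p>Contact <a href="mailto:jane@example.com">Jane</a> or see
<a href="https://example.com/help">help</a>. <span href="ignored">x</span></p>
"""
print(collect_hrefs(sample_html))
```

Every anchor href is selected, not only mailto: links, so the expression defines where to look rather than what an email address looks like.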
Built-In Data Types
Shield includes 48 built-in data types that cannot be modified or deleted. These types are automatically available and cover common sensitive data patterns.
Identity & Government IDs
| Type Identifier | Display Name | Description |
|---|---|---|
| US_SSN | US SSN | US Social Security Number (9 digits) |
| CANADIAN_SIN | Canadian SIN | Canadian Social Insurance Number |
| US_ITIN | US ITIN | Individual Taxpayer Identification Number |
| US_ATIN | US ATIN | Adoption Taxpayer Identification Number |
| US_EIN | US EIN | Employer Identification Number |
| US_DRIVERS_LICENSE | US Drivers License | US state driver's license numbers |
| PASSPORT | Passport | US Passport Number (8-9 digits) |
| VIN | VIN | Vehicle Identification Number (ISO 3779:2009) |
Financial
| Type Identifier | Display Name | Description |
|---|---|---|
| CREDIT_CARD | Credit Card | Credit card numbers (Visa, MasterCard, Amex, Discover, etc.) |
| IBAN | IBAN | International Bank Account Number |
| SWIFT_CODE | SWIFT Code | SWIFT/BIC codes for banks |
Contact & Network
| Type Identifier | Display Name | Description |
|---|---|---|
| EMAIL_ADDRESS | Email Address | RFC 5322 compliant email addresses |
| PHONE_NUMBER | Phone Number US | US formatted phone numbers |
| URL | URL | RFC 1630 compliant URLs |
| IP | IP | IPv4 addresses (RFC 3849) |
| MAC_ADDRESS | Mac Address | Media access control addresses |
| DOMAIN | Domain Name | DNS domain names with valid TLDs |
Cloud Credentials
| Type Identifier | Display Name | Description |
|---|---|---|
| AWS_SECRET | Secrets AWS | Amazon Web Services Access Keys and Secrets |
| AZURE_SECRET | Secrets Azure | Azure Access Keys and Secrets |
| GOOGLE_CLOUD_SECRET | Secrets GCP | Google Cloud Platform credentials and secrets |
Latin America and Spain Phone Numbers
| Type Identifier | Display Name | Description |
|---|---|---|
| MEXICO_PHONE_NUMBER | Phone Number Mexico | Mexico phone numbers |
| ARGENTINA_PHONE_NUMBER | Phone Number Argentina | Argentina phone numbers |
| BRAZIL_PHONE_NUMBER | Phone Number Brazil | Brazil phone numbers |
| CHILE_PHONE_NUMBER | Phone Number Chile | Chile phone numbers |
| COLOMBIA_PHONE_NUMBER | Phone Number Colombia | Colombia phone numbers |
| VENEZUELA_PHONE_NUMBER | Phone Number Venezuela | Venezuela phone numbers |
| BOLIVIA_PHONE_NUMBER | Phone Number Bolivia | Bolivia phone numbers |
| ECUADOR_PHONE_NUMBER | Phone Number Ecuador | Ecuador phone numbers |
| PARAGUAY_PHONE_NUMBER | Phone Number Paraguay | Paraguay phone numbers |
| URUGUAY_PHONE_NUMBER | Phone Number Uruguay | Uruguay phone numbers |
| PERU_PHONE_NUMBER | Phone Number Peru | Peru phone numbers |
| SPAIN_PHONE_NUMBER | Phone Number Spain | Spain phone numbers |
CRM-Specific Types
| Type Identifier | Display Name | Description |
|---|---|---|
| HUBSPOT_NAME | Hubspot Name | HubSpot contact names |
| HUBSPOT_ADDRESS | Hubspot Address | HubSpot addresses |
| HUBSPOT_DATE_OF_BIRTH | Hubspot Date of Birth | HubSpot date of birth fields |
| HUBSPOT_EMAIL_MESSAGE | Hubspot Email | HubSpot email message content |
| HUBSPOT_EMAIL_TO | Hubspot Email To | HubSpot email recipients |
| HUBSPOT_EMAIL_CC | Hubspot Email CC | HubSpot email CC recipients |
| HUBSPOT_EMAIL_BCC | Hubspot Email BCC | HubSpot email BCC recipients |
| HUBSPOT_EMAIL_FROM | Hubspot Email From | HubSpot email sender |
| HUBSPOT_EMAIL_SENDER | Hubspot Email Sender | HubSpot email sender info |
| HUBSPOT_EMAIL_BODY | Hubspot Email Body | HubSpot email body content |
| HUBSPOT_EMAIL_SUBJECT | Hubspot Email Subject | HubSpot email subjects |
| HUBSPOT_EMAIL_MESSAGE_ID | Hubspot Email Message ID | HubSpot email message IDs |
| HUBSPOT_OBJECT_TIMESTAMP | Hubspot Object Timestamp | HubSpot object timestamps |
| SALESFORCE_NAME | Salesforce Name | Salesforce contact names |
| SALESFORCE_ADDRESS | Salesforce Address | Salesforce addresses |
| SALESFORCE_DATE_OF_BIRTH | Salesforce Date of Birth | Salesforce date of birth fields |
Note: All built-in data types use the Type Identifier in API requests. For example, use "US_SSN" when referencing the US Social Security Number type.
Group Data Types
Group data types combine multiple types into a single logical unit.
Creating Groups
{
"type": "ALL_PII",
"name": "All PII",
"description": "All personally identifiable information",
"isGroupDataType": true,
"dataTypes": [
"US_SSN",
"CREDIT_CARD",
"PHONE_NUMBER",
"EMAIL_ADDRESS"
]
}
Behavior
- Cascade disable - Disabling a group disables all member types
- Detection - Any member type match triggers the group
- Obfuscation - Can apply different masks to each member type
- Nesting - Groups can contain other groups; avoid circular references
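Since groups can nest, a circular reference can be caught before a group is created. The sketch below is one way to do that locally with a depth-first search, assuming the group definitions are available as a mapping from group identifier to member identifiers. The function name `find_cycle` and the CONTACT_INFO group are hypothetical; how Shield itself handles circular references is not covered here.

```python
from typing import Optional


def find_cycle(groups: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one circular reference path, or None if the groups are acyclic.

    `groups` maps each group identifier to its member identifiers; members
    that are not keys of `groups` are treated as plain (non-group) types.
    """
    visiting: list[str] = []  # current depth-first path
    done = set()              # groups already proven cycle-free

    def visit(group: str) -> Optional[list[str]]:
        if group in done:
            return None
        if group in visiting:
            # The group is already on the current path: report the loop.
            return visiting[visiting.index(group):] + [group]
        visiting.append(group)
        for member in groups[group]:
            if member in groups:
                cycle = visit(member)
                if cycle:
                    return cycle
        visiting.pop()
        done.add(group)
        return None

    for name in groups:
        cycle = visit(name)
        if cycle:
            return cycle
    return None


groups = {
    "ALL_PII": ["US_SSN", "CONTACT_INFO"],
    "CONTACT_INFO": ["EMAIL_ADDRESS", "ALL_PII"],  # points back to ALL_PII
}
print(find_cycle(groups))  # ['ALL_PII', 'CONTACT_INFO', 'ALL_PII']
```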
Value Group Index
The valueGroupIndex field specifies which regex capture group contains the sensitive value:
| Index | Meaning | Example |
|---|---|---|
| 0 | Full match | EMP-123456 (entire match) |
| 1 | First group | EMP-(123456) extracts 123456 |
| 2 | Second group | (EMP)-(\d{6}) extracts digits |
Example Usage
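A minimal sketch of such a case, assuming a hypothetical pattern in which a literal SSN: label precedes the number. The pattern and the helper name `extract_value` are illustrative and not taken from Shield; the capture-group behavior is shown with Python's `re` module.

```python
import re

# Hypothetical pattern: a literal "SSN:" label followed by the number itself.
PATTERN = r"SSN:\s*(\d{3}-\d{2}-\d{4})"
VALUE_GROUP_INDEX = 1  # group 1 holds the number; group 0 would be the full match


def extract_value(text: str, pattern: str = PATTERN, index: int = VALUE_GROUP_INDEX):
    """Return the capture group selected by `index`, or None when nothing matches."""
    match = re.search(pattern, text)
    return match.group(index) if match else None


print(extract_value("Applicant SSN: 123-45-6789"))           # 123-45-6789
print(extract_value("Applicant SSN: 123-45-6789", index=0))  # SSN: 123-45-6789
```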
This extracts only the SSN digits, not the "SSN:" prefix.
Best Practices
- Name clearly - Use descriptive names that indicate the data type
- Document patterns - Explain the format in the description field
- Test with real data - Verify patterns match expected values
- Group related types - Use group data types for easier management
- Version control patterns - Keep regex patterns in version control
- Avoid over-matching - Be as specific as possible to reduce false positives
- Use word boundaries - Prevent matching partial strings
- Escape special characters - Remember to escape regex metacharacters
Regex Testing
Before deploying custom data types:
- Test with regex101.com - Verify patterns match expected formats
- Test with Shield scanning API - Use the Data Scanning API to test detection
- Monitor false positives - Check Activities for unexpected matches
- Iterate and refine - Adjust patterns based on real-world results
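Before a pattern reaches Shield, the first step above can also be scripted so that it is repeatable. The sketch below is a small local harness, assuming hand-written sample strings that should and should not match; the function name `evaluate` and the sample data are hypothetical, and Python's `re` engine may differ in details from the engine Shield uses.

```python
import re


def evaluate(pattern: str, positives: list[str], negatives: list[str]) -> dict:
    """Report samples the pattern misses and samples it wrongly matches."""
    regex = re.compile(pattern)
    return {
        "missed": [s for s in positives if not regex.search(s)],
        "false_positives": [s for s in negatives if regex.search(s)],
    }


positives = ["id EMP-123456", "EMP-000001 joined"]
negatives = ["TEMP-1234567", "EMP-12345", "emp-123456"]

loose = evaluate(r"EMP-\d{6}", positives, negatives)
strict = evaluate(r"\bEMP-\d{6}\b", positives, negatives)

print(loose)
print(strict)
```

With these samples the pattern without word boundaries reports TEMP-1234567 as a false positive, and the bounded version reports none.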
Performance Considerations
- Simple patterns are faster - Avoid complex lookaheads/lookbehinds when possible
- Limit backtracking - Use atomic groups or possessive quantifiers where the regex engine supports them
- Anchor patterns - Use ^, $, or \b to reduce scan scope
- Test at scale - Validate performance with representative data volumes
Related Topics
- Create Data Type - Create custom data types
- Obfuscations API - Configure masking for these types
- Rules API - Apply detection rules
- Data Scanning API - Test data type detection
- Activities API - Query detections by data type