
How to use predefined content detectors

Drive Data Loss Prevention (DLP) rules apply to G Suite Enterprise users only. 

As of Jan 31, 2017, Gmail data loss prevention (DLP) is available only with G Suite Enterprise. Customers who are licensed with G Suite Business on Mar 31, 2017 can continue to use Gmail DLP until Jan 31, 2020, provided they continuously renew their G Suite Business license during that time period.

As an administrator, when you’re setting up a data loss prevention (DLP) rule, you can use content detectors to specify the types of sensitive content to scan for and flag. Some rule templates contain predefined content detectors that automatically scan for and flag this sensitive data. How the content is scanned and flagged depends on what type of content it is.

Detection methods

Predefined content detectors are built using publicly available information. Four principal detection methods are used: pattern match, context, checksum, and word and phrase list. Depending on the detector, content is flagged either when any one of its methods finds sensitive content or only when all of its methods do.

  • Pattern match—A specific alphanumeric pattern (not just string length), including delimiters, valid position, and valid range checks

  • Context—Presence of relevant strings in proximity to a pattern, a checksum matching string, or both 
  • Checksum—Checksum computation and verification with check digit
  • Word and phrase list—Full or partial match to an entry found in a dictionary of words and phrases
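As a rough illustration of how two of these methods might be combined, the sketch below flags text when any method matches or only when all methods match. The function names, the SSN-style pattern, and the keyword list are assumptions made for this example, not the detectors' actual implementation, and the context check here ignores proximity for brevity.

```python
import re
from typing import Callable, Iterable

Method = Callable[[str], bool]

def pattern_match(text: str) -> bool:
    # Hypothetical pattern: 3 digits, 2 digits, 4 digits, separated by hyphens.
    return re.search(r"\b\d{3}-\d{2}-\d{4}\b", text) is not None

def context_match(text: str) -> bool:
    # Hypothetical keyword list; a real detector would also check proximity.
    keywords = ("ssn", "social security")
    lowered = text.lower()
    return any(k in lowered for k in keywords)

def flag(text: str, methods: Iterable[Method], require_all: bool) -> bool:
    # require_all=True means "pattern AND context"; False means "pattern OR context".
    results = [method(text) for method in methods]
    return all(results) if require_all else any(results)
```

With `require_all=True`, a bare number without a nearby keyword is not flagged; with `require_all=False`, the pattern alone is enough.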

Predefined content detectors

United States
Detector Description
Social Security Number

In the United States, a Social Security number (SSN) is a 9-digit number issued to U.S. citizens, permanent residents, and temporary residents. The Social Security number has become the de facto national identification number for taxation and other purposes.

Detection method: Pattern match or 9 digits with context

Context: SSN, Social, Social Security, Taxpayer

 

Driver's License Number

Driver’s license number for the United States. Format can vary depending on the issuing U.S. state.

Detection method: Pattern match and context

Context: Drive, Driving, Learn, Lic, License, Licence, Permit, DL

Drug Enforcement Administration (DEA) Number

A DEA number is assigned to a health care provider by the U.S. Drug Enforcement Administration. It allows the health care provider to write prescriptions for controlled substances. The DEA number is often used as a general "prescriber number" that is a unique identifier for anyone who can prescribe medication.

Detection method: Pattern match and checksum
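The article does not describe the checksum itself. A minimal sketch of the publicly documented DEA check-digit rule follows, assuming a format of two leading characters and seven digits; the function name and the exact pattern are illustrative assumptions.

```python
import re

def dea_checksum_ok(number: str) -> bool:
    # Assumed format: a letter, then a letter or the digit 9, then seven digits.
    m = re.fullmatch(r"[A-Za-z][A-Za-z9]([0-9]{7})", number)
    if m is None:
        return False
    d = [int(c) for c in m.group(1)]
    # Sum of digits 1, 3, 5 plus twice the sum of digits 2, 4, 6.
    total = (d[0] + d[2] + d[4]) + 2 * (d[1] + d[3] + d[5])
    # The last digit of the total must equal the seventh digit.
    return total % 10 == d[6]
```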

ABA Routing Number

The American Bankers Association (ABA) Routing Number (also called the transit number) is a nine-digit code. It identifies the financial institution that is responsible for crediting, or entitled to receive credit for, a check or electronic transaction.

Detection method: Checksum on 9 digits
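For reference, the standard ABA routing number checksum weights the nine digits 3, 7, 1, 3, 7, 1, 3, 7, 1 and requires the total to be a multiple of 10. The sketch below assumes the input has already been reduced to a bare digit string; the function name is hypothetical.

```python
import re

def aba_checksum_ok(number: str) -> bool:
    if re.fullmatch(r"[0-9]{9}", number) is None:
        return False
    d = [int(c) for c in number]
    total = (
        3 * (d[0] + d[3] + d[6])
        + 7 * (d[1] + d[4] + d[7])
        + (d[2] + d[5] + d[8])
    )
    return total % 10 == 0
```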

National Provider Identifier (NPI)

The National Provider Identifier (NPI) is a unique 10-digit identification number issued to health care providers in the United States by the Centers for Medicare and Medicaid Services (CMS). The NPI has replaced the unique provider identification number (UPIN) as the required identifier for Medicare services. It's also used by other payers, including commercial healthcare insurers.

Detection method: Checksum on 10 digits
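The NPI check digit is publicly specified as a Luhn check computed over the 10 digits prefixed with 80840. A minimal sketch under that assumption follows; the helper names are illustrative.

```python
import re

def luhn_ok(digits: str) -> bool:
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, ch in enumerate(reversed(digits)):
        n = int(ch)
        if i % 2 == 1:
            n *= 2
            if n > 9:
                n -= 9
        total += n
    return total % 10 == 0

def npi_checksum_ok(npi: str) -> bool:
    if re.fullmatch(r"[0-9]{10}", npi) is None:
        return False
    # 80840 is the prefix used when an NPI is checked as a card-style identifier.
    return luhn_ok("80840" + npi)
```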

 

CUSIP

A CUSIP number is a 9-character alphanumeric code that identifies a North American financial security.

Detection method: Checksum or context (when check digit not present)

Context: CUSIP

FDA Approved Prescription Drugs

This is any drug on the list of prescription drugs approved by the United States Food and Drug Administration (FDA).

Detection method: Word and phrase list

Passport

This is a detector for a United States passport.

Detection method: Pattern match and context

Context: United States, USA, Passport, Travel Document

United Kingdom
Detector Description
Driver's License Number

Driver’s license number for the United Kingdom of Great Britain and Northern Ireland (UK).

Detection method: Pattern match

National Health Service (NHS) Number

NHS numbers are the unique numbers allocated to registered users of the 3 public health services in England, Wales, and the Isle of Man.

Detection method: Pattern match and checksum
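The NHS number checksum is a publicly documented modulus 11 scheme: the first nine digits are weighted 10 down to 2, and the tenth digit is the check digit. The sketch below assumes spaces or hyphens may appear as delimiters; the function name is hypothetical.

```python
import re

def nhs_checksum_ok(number: str) -> bool:
    digits = re.sub(r"[ -]", "", number)
    if re.fullmatch(r"[0-9]{10}", digits) is None:
        return False
    d = [int(c) for c in digits]
    total = sum(w * n for w, n in zip(range(10, 1, -1), d[:9]))
    check = 11 - total % 11
    if check == 11:
        check = 0
    if check == 10:
        # A computed check value of 10 means the number is never issued.
        return False
    return check == d[9]
```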

 

National Insurance Number (NINO)

The National Insurance number is a number used in the United Kingdom (UK) in the administration of the National Insurance or social security system. It identifies people. It's also used for some purposes in the UK tax system. The number is sometimes referred to as NI No or NINO.

Detection method: Pattern match (with delimiters) or Pattern match and context words

 

Passport

This is a detector for a United Kingdom (UK) passport.

Detection method: Pattern match and context

Context: United Kingdom, Passport, Travel Document

Taxpayer Identification Number

This is a detector for a United Kingdom (UK) Unique Taxpayer Reference (UTR) number. This number, which consists of 10 decimal digits, is an identifier used by the UK government to manage the taxation system. Unlike other identifiers, such as the passport number or social insurance number, the UTR is not listed on official identity cards.

Detection method: Pattern match and context

Context: United Kingdom, Taxpayer, UTR

Australia
Detector Description
Medicare Account Number

A 9-digit Medicare number is issued to permanent residents of Australia (except for Norfolk Island). The primary purpose of this number is to prove Medicare eligibility to receive subsidized care in Australia.

Detection method: Checksum and (pattern match or context)

Context: Medicare, Australia, IRN

Tax File Number (TFN)

A number issued by the Australian Taxation Office for taxpayer identification. Every taxpaying entity, such as an individual or an organization, is assigned a unique number.

Detection method: Checksum and (pattern match or context)

Context: Tax File Number, TFN, Australian Tax Office
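As an illustration of the checksum step, the commonly published algorithm for 9-digit TFNs weights the digits 1, 4, 3, 7, 5, 8, 6, 9, 10 and requires the total to be divisible by 11. The sketch assumes 9-digit numbers only and an illustrative function name; it is not taken from the detector itself.

```python
import re

TFN_WEIGHTS = (1, 4, 3, 7, 5, 8, 6, 9, 10)

def tfn_checksum_ok(tfn: str) -> bool:
    digits = re.sub(r"[ -]", "", tfn)
    # Assumption: only the 9-digit form is handled here.
    if re.fullmatch(r"[0-9]{9}", digits) is None:
        return False
    total = sum(w * int(c) for w, c in zip(TFN_WEIGHTS, digits))
    return total % 11 == 0
```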

Brazil
Detector Description
CPF number

The Cadastro de Pessoas Físicas (CPF), which is Portuguese for "Natural Persons Register," is an 11-digit number used in Brazil for taxpayer identification.

Detection method: Checksum and (pattern match or context)

Context: CPF, Cadastro de Pessoas Físicas, Pessoas Físicas, Tax Number, Taxpayer
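A CPF carries two check digits, each computed modulo 11 over the preceding digits. The following sketch shows that publicly known calculation, assuming the usual `.` and `-` delimiters; the function names are hypothetical.

```python
import re

def _cpf_check_digit(prefix: list) -> int:
    # Weights run from len(prefix) + 1 down to 2.
    weights = range(len(prefix) + 1, 1, -1)
    remainder = sum(w * n for w, n in zip(weights, prefix)) % 11
    return 0 if remainder < 2 else 11 - remainder

def cpf_checksum_ok(cpf: str) -> bool:
    digits = re.sub(r"[.-]", "", cpf)
    if re.fullmatch(r"[0-9]{11}", digits) is None:
        return False
    d = [int(c) for c in digits]
    return _cpf_check_digit(d[:9]) == d[9] and _cpf_check_digit(d[:10]) == d[10]
```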

Canada
Detector Description
Quebec Health Insurance Number (HIN)

The Quebec Health Insurance Number (HIN) is issued to citizens, permanent residents, temporary workers, students and other individuals who are entitled to health care coverage in the Province of Quebec.

Detection method: Pattern match

 

Ontario Health Insurance Plan (OHIP)

The Ontario Health Insurance Plan (OHIP) number is issued to citizens, permanent residents, temporary workers, students, and other individuals who are entitled to health care coverage in the Province of Ontario.

Detection method: Pattern match and checksum

 

British Columbia Personal Health Number (PHN)

The British Columbia Personal Health Number (PHN) is issued to citizens, permanent residents, temporary workers, students, and other individuals who are entitled to health care coverage in the Province of British Columbia.

Detection method: Pattern match or 10 digits with context

Context: BC ID, PHN, British Columbia, Personal Health Number, Services Card, Canadian health insurance number, Canadian health ID

 

Social Insurance Number (SIN)

The Canadian Social Insurance Number (SIN) is the main identifier used in Canada for citizens, permanent residents, and those on work or study visas. With a Canadian SIN and mailing address, one can apply for health care coverage, driver’s licenses, and other important services.

Detection method: Checksum and (pattern match or context)
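The Canadian SIN is validated with the Luhn algorithm over its 9 digits. A minimal sketch follows, assuming spaces or hyphens as optional delimiters; the function name is illustrative.

```python
import re

def sin_checksum_ok(sin: str) -> bool:
    digits = re.sub(r"[ -]", "", sin)
    if re.fullmatch(r"[0-9]{9}", digits) is None:
        return False
    total = 0
    for i, ch in enumerate(reversed(digits)):
        n = int(ch)
        if i % 2 == 1:
            # Double every second digit from the right and fold two-digit results.
            n *= 2
            if n > 9:
                n -= 9
        total += n
    return total % 10 == 0
```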

 

Passport

This is a detector for a Canadian passport.

Detection method: Pattern match and context

Context: Canada, Canadian, Numéro de passeport, Passport, Travel Document, document number

China
Detector Description
Passport

This is a detector for a Chinese passport.

Detection method: Pattern match and context

Context: China, Passport, 中华人民共和国护照, 护照号, Hùzhào hào, 护照

France
Detector Description
National ID Card (CNI)

The Carte Nationale d’Identité Sécurisée (CNI or CNIS) is the French national identity card. It's an official identity document consisting of a 12-digit ID number. This number is commonly used when opening bank accounts and when paying by check. It can sometimes be used instead of a passport or visa within the European Union (EU) and in some other countries.

Detection method: Pattern match and context

Context: CNI, CNIS (carte nationale d'identité securisée), identité, identite

Social Security Number (NIR)

The Numéro d'Inscription au Répertoire (NIR) is a permanent personal ID number that's also known as the French social security number for services including healthcare as well as pensions.

Detection method: Pattern match and checksum
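The NIR ends with a 2-digit control key equal to 97 minus the first 13 digits taken modulo 97. The sketch below shows that calculation for all-digit numbers only; Corsican department codes (2A, 2B) are deliberately not handled, and the function name is an assumption.

```python
import re

def nir_checksum_ok(nir: str) -> bool:
    digits = re.sub(r"\s", "", nir)
    # Assumption: 13 digits of body plus a 2-digit key, digits only.
    if re.fullmatch(r"[0-9]{15}", digits) is None:
        return False
    body, key = int(digits[:13]), int(digits[13:])
    return 97 - body % 97 == key
```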

Passport

This is a detector for a French passport.

Detection method: Pattern match and context

Context: France, Passport, Passeport, REPUBLIC FRANCAIS, Numéro de passeport

Germany
Detector Description
Passport

This is a detector for a German passport. The format of a German passport number is 10 alphanumeric characters, chosen from numerals 0-9 and letters C, F, G, H, J, K, L, M, N, P, R, T, V, W, X, Y, Z.

Detection method: Pattern match and context

Context: GERMANY, REISEPASS, PASSPORT, Europäische Union, Bundesrepublik, Deutschland, reisepassnummer
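A "pattern match and context" check for the format described above could be sketched as follows. The character set and context words come from this entry; the 50-character proximity window, the function name, and the boundary rules are assumptions for illustration only.

```python
import re

# Ten characters drawn from the digits and letters listed in the description.
PASSPORT_RE = re.compile(r"(?<![0-9A-Z])[0-9CFGHJKLMNPRTVWXYZ]{10}(?![0-9A-Z])")
CONTEXT = (
    "germany", "reisepass", "passport", "europäische union",
    "bundesrepublik", "deutschland", "reisepassnummer",
)

def find_german_passports(text: str, window: int = 50) -> list:
    hits = []
    for m in PASSPORT_RE.finditer(text):
        # Look for a context word within `window` characters of the match.
        start = max(0, m.start() - window)
        nearby = text[start:m.end() + window].lower()
        if any(word in nearby for word in CONTEXT):
            hits.append(m.group())
    return hits
```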

India
Detector Description
Personal Permanent Account Number (PAN)

The Personal Permanent Account Number (PAN) is a unique 10-character alphanumeric identifier used for identification of individuals, particularly those who pay income tax. It's issued by the Indian Income Tax Department. The PAN is valid for the lifetime of the holder.

Detection method: Pattern match and context

Context: India, Account Number, PAN, Taxpayer ID

Japan
Detector Description
Passport

This is a detector for a Japanese passport. The passport number consists of 2 alphabetic characters followed by 7 digits.

Detection method: Pattern match and context

Context: Japan, Passport, パスポート, パスポート番号

Korea
Detector Description
Passport

This is a detector for a Korean passport. There are 2 different formats.

Pre-2008 passport numbers consist of 9 characters. The first 2 characters are the issued local code, corresponding to the holder's gu, or district. The remaining 7 digits are the serial number.

Post-2008 passport numbers consist of 9 characters. The first character is either a single letter M, denoting PM passports, or the letter S for PS passports. The remaining 8 digits are the serial number.

Detection method: Pattern match and context

Context: Passport, Korea, 여권, 대한민국

Mexico
Detector Description
National Identification Number (CURP)

This is a detector for the Mexico Clave Única de Registro de Población (CURP) number. In English, this is the Unique Population Registry Code, or Personal ID Code Number. This is an 18-character state-issued identification number assigned by the Mexican government to citizens or residents of Mexico and used for taxpayer identification.

Detection method: Pattern match and context

Context: CURP, Clave Única, Población, Registro, UPRC, Personal ID, Registry Code

Passport

This is a detector for a Mexican passport.

Detection method: Pattern match and context

Context: Mexico, Passport, Pasaporte, México, Mexican

Netherlands
Detector Description
National Identification Number (BSN)

This is a detector for the Netherlands Burgerservicenummer (BSN). It's also known as a Citizen’s Service Number. It's a state-issued identification number that's on driver’s licenses, passports, and international ID cards.

Detection method: Checksum and (pattern match or context)

Context: BSN, Personal Number, Burgerservicenummer, Netherlands, Identification Number, Service Number, sofinummer, sofi, personalnummer
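The BSN checksum is commonly known as the "11-proef": the nine digits are weighted 9, 8, 7, 6, 5, 4, 3, 2, and -1, and the total must be divisible by 11. A minimal sketch of that test, with a hypothetical function name:

```python
import re

BSN_WEIGHTS = (9, 8, 7, 6, 5, 4, 3, 2, -1)

def bsn_checksum_ok(bsn: str) -> bool:
    if re.fullmatch(r"[0-9]{9}", bsn) is None:
        return False
    total = sum(w * int(c) for w, c in zip(BSN_WEIGHTS, bsn))
    # Reject the all-zero string, which would trivially pass the test.
    return total != 0 and total % 11 == 0
```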

Spain
Detector Description
NIF Number

Número de Identificación Fiscal (NIF) numbers are government ID numbers for Spanish citizens. An NIF number is needed for key transactions such as opening a bank account, buying a car, or setting up a mobile phone contract.

Detection method: Checksum and (pattern match or context)

Context: Número de Identificación Fiscal, NIF

 

NIE Number

Número de Identificación de Extranjeros (NIE) numbers are government ID numbers for foreigners living or doing business in Spain. An NIE number is needed for key transactions such as opening a bank account, buying a car, or setting up a mobile phone contract.

Detection method: Checksum and (pattern match or context)

Context: Número de Identificación de Extranjeros, NIE
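For both the NIF and the NIE, the check is a control letter chosen by taking the numeric part modulo 23 and indexing into a fixed 23-letter table. In an NIE, the leading X, Y, or Z stands for 0, 1, or 2. The sketch below shows this publicly known rule; the function names are illustrative.

```python
import re

CONTROL_LETTERS = "TRWAGMYFPDXBNJZSQVHLCKE"

def nif_checksum_ok(nif: str) -> bool:
    m = re.fullmatch(r"([0-9]{8})([A-Z])", nif.upper())
    if m is None:
        return False
    return CONTROL_LETTERS[int(m.group(1)) % 23] == m.group(2)

def nie_checksum_ok(nie: str) -> bool:
    m = re.fullmatch(r"([XYZ])([0-9]{7})([A-Z])", nie.upper())
    if m is None:
        return False
    # Replace the leading letter with its digit, then apply the same rule.
    number = int(str("XYZ".index(m.group(1))) + m.group(2))
    return CONTROL_LETTERS[number % 23] == m.group(3)
```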

 

Passport

There are 4 different types of passports in Spain. This detector is for the Ordinary Passport (Pasaporte Ordinario) type, which is issued for ordinary travel, such as vacations and business trips.

Detection method: Pattern match and context

Context: Passport, Pasaporte, Espana, España, Spain

Global
Detector Description
Credit card number

Credit card numbers are 12 to 19 digits long. They're used for payment transactions globally.

Detection method: Pattern match and checksum
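Payment card numbers use the Luhn checksum. The sketch below pairs it with the 12 to 19 digit length range stated above; the delimiter handling and function name are assumptions, and a real detector would likely apply stricter pattern rules.

```python
import re

def card_number_ok(value: str) -> bool:
    digits = re.sub(r"[ -]", "", value)
    if re.fullmatch(r"[0-9]{12,19}", digits) is None:
        return False
    total = 0
    for i, ch in enumerate(reversed(digits)):
        n = int(ch)
        if i % 2 == 1:
            # Double every second digit from the right and fold two-digit results.
            n *= 2
            if n > 9:
                n -= 9
        total += n
    return total % 10 == 0
```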

 

Bank account number (IBAN)

An International Bank Account Number (IBAN) is an internationally agreed-upon method for identifying bank accounts. It’s defined by the International Organization for Standardization (ISO) 13616:2007 standard. ISO 13616:2007 was created by the European Committee for Banking Standards (ECBS). An IBAN consists of up to 34 alphanumeric characters including elements such as a country code or account number.

Detection method: Pattern match and checksum
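The IBAN checksum defined in ISO 13616 is a mod 97 test: move the first four characters to the end, replace letters with numbers (A = 10 through Z = 35), and check that the resulting integer leaves a remainder of 1. A minimal sketch follows; it does not validate country-specific lengths, and the function name is hypothetical.

```python
import re

def iban_checksum_ok(iban: str) -> bool:
    s = re.sub(r"\s", "", iban).upper()
    # Country code, two check digits, then up to 30 alphanumeric characters.
    if re.fullmatch(r"[A-Z]{2}[0-9]{2}[A-Z0-9]{1,30}", s) is None:
        return False
    rearranged = s[4:] + s[:4]
    # int(ch, 36) maps 0-9 to 0-9 and A-Z to 10-35.
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1
```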

Bank account number (SWIFT)

A SWIFT code is the same as a Bank Identifier Code (BIC). It's a unique identification code for a particular bank. These codes are used when transferring money between banks, particularly for international wire transfers. Banks also use the codes for exchanging other messages.

Detection method: Pattern match and context

Context: SWIFT, ISO 9362, Business Identifier Code, BIC, Business Entity Identifier, BEI, bank, interbank

ICD 9-CM Lexicon

The International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) lexicon is used to assign diagnostic and procedure codes associated with inpatient, outpatient, and physician office use in the United States. It was created by the U.S. National Center for Health Statistics (NCHS). The ICD-9-CM is based on the ICD-9 but provides for additional morbidity detail. It's updated annually on October 1.

Detection method: Word and phrase list

ICD 10-CM Lexicon

Like ICD 9 codes, ICD 10 codes are a series of diagnostic codes published by the World Health Organization (WHO) to describe causes of morbidity and mortality.

Detection method: Word and phrase list
