You are viewing documentation for Immuta version 2022.5.

For the latest version, view our documentation for Immuta SaaS or the latest self-hosted version.

Sensitive Data Discovery

Audience: Data Owners and Data Governors

Content Summary: To help users identify sensitive data and to enhance the power of Global Policies, Immuta offers Sensitive Data Discovery (SDD), which automatically identifies and tags columns that contain sensitive data.

Overview

When enabled on the App Settings page, SDD automatically identifies and tags columns that contain sensitive data (such as PII or PHI) when a data source is created. Detection is based on a small sample of the underlying data, which remains in the users' network.

Once sensitive data is identified, SDD applies relevant “Discovered” tags to those columns.
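The sample-then-tag flow described above can be illustrated with a minimal sketch. This is not Immuta's detection logic, which this page does not describe. The regular expressions, the `threshold` parameter, the `discover_tags` function, and the dot-separated tag names are all assumptions made for illustration.

```python
import re

# Hypothetical patterns keyed by tag name. The tag naming scheme and the
# patterns themselves are assumptions, not Immuta's actual classifiers.
PATTERNS = {
    "Discovered.Entity.Social Security Number": re.compile(r"\d{3}-\d{2}-\d{4}"),
    "Discovered.Entity.Electronic Mail Address": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
}


def discover_tags(sample_rows, threshold=0.8):
    """Return {column: [tags]} for columns whose sampled values mostly match a pattern.

    sample_rows is a small list of dicts (one dict per sampled row).
    threshold is the assumed fraction of non-null values that must match.
    """
    tags = {}
    if not sample_rows:
        return tags
    for column in sample_rows[0]:
        # Only non-null values from the sample are inspected.
        values = [str(row[column]) for row in sample_rows if row.get(column) is not None]
        if not values:
            continue
        for tag, pattern in PATTERNS.items():
            hits = sum(1 for value in values if pattern.fullmatch(value))
            if hits / len(values) >= threshold:
                tags.setdefault(column, []).append(tag)
    return tags


sample = [
    {"id": 1, "ssn": "123-45-6789", "contact": "a@example.com"},
    {"id": 2, "ssn": "987-65-4321", "contact": "b@example.org"},
]
print(discover_tags(sample))
```

Only the sampled rows are passed to the function, which mirrors the point that detection works from a small sample rather than from the full table.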

Discovered tags

Discovered

  • Country

    • Argentina
    • Australia
    • Belgium
    • Brazil
    • Canada
    • Chile
    • China
    • Colombia
    • Denmark
    • Finland
    • France
    • Germany
    • Hong Kong
    • India
    • Indonesia
    • Japan
    • Korea
    • Mexico
    • Netherlands
    • Norway
    • Paraguay
    • Peru
    • Poland
    • Singapore
    • Spain
    • Sweden
    • Taiwan
    • Thailand
    • Turkey
    • UK
    • Uruguay
    • US
    • Venezuela
  • Entity

    • Aadhaar Individual
    • Adoption Taxpayer ID Number
    • Age
    • Bank Account
    • Bankers CUSIP ID
    • Bank Routing MICR
    • British Columbia Health Network Number
    • BSN Number
    • CDC Number
    • CDI Number
    • CIC Number
    • CNI Number
    • CPF Number
    • CPR Number
    • Credit Card Number
    • CURP Number
    • Date
    • Date of Birth
    • DEA Number
    • DNI Number
    • Domain Name
    • Drivers License Number
    • Electronic Mail Address
    • Employer ID Number
    • Ethnic Group
    • FDA Code
    • Gender
    • GST Individual
    • Healthcare NPI
    • IBAN Code
    • ICD10 Code
    • ICD9 Code
    • Identity Card Number
    • ID Number
    • IMEI
    • Individual Number
    • Individual Taxpayer ID Number
    • IP Address
    • Location
    • MAC Address
    • MAC Address Local
    • Medicare Number
    • National Health Service Number
    • National ID Card Number
    • National ID Number
    • National Insurance Number
    • National Registration ID Number
    • NIE Number
    • NIF Number
    • NIK Number
    • NI Number
    • NIR
    • Ontario Health Insurance Number
    • PAN Individual
    • Passport
    • Person Name
    • PESEL Number
    • Postal Code
    • Preparer Taxpayer ID Number
    • Quebec Health Insurance Number
    • Resident ID Number
    • RRN
    • Social Insurance Number
    • Social Security Number
    • State
    • Swift Code
    • Tax File Number
    • Taxpayer ID Number
    • Taxpayer Reference
    • Telephone Number
    • Tollfree Telephone Number
    • URL
    • Vehicle Identifier or Serial Number
  • Identifier Direct

  • Identifier Indirect

  • Identifier Undetermined

  • PCI

  • PHI

  • PII

Immuta is pre-configured with a set of these tags so that they can be used to write Global Policies before data sources even exist. Consequently, sensitive data is tagged and appropriate policies are enforced immediately upon data source creation.
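To illustrate why pre-configured tags let policies take effect as soon as a data source is created, here is a rough sketch. The `GLOBAL_POLICIES` mapping, the `"mask"` action, the dot-separated tag names, and the `actions_for_new_source` function are hypothetical and do not represent Immuta's policy engine or API.

```python
# Hypothetical global policies, written against tag names before any
# data source exists. Tag names and the "mask" action are assumptions.
GLOBAL_POLICIES = {
    "Discovered.PII": "mask",
    "Discovered.PHI": "mask",
}


def actions_for_new_source(column_tags):
    """Return {column: action} for a newly created data source.

    column_tags maps each column name to the list of tags applied to it
    at creation time. The first tag with a matching policy decides the action.
    """
    actions = {}
    for column, tags in column_tags.items():
        for tag in tags:
            if tag in GLOBAL_POLICIES:
                actions[column] = GLOBAL_POLICIES[tag]
                break
    return actions


new_source = {
    "ssn": ["Discovered.PII"],
    "city": ["Discovered.Entity.Location"],
    "note": [],
}
print(actions_for_new_source(new_source))
```

Because the policy refers only to tag names, it needs no changes when a new data source appears: tagging the columns is enough to bring them under the policy.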

Only Application Admins can enable Sensitive Data Discovery on the App Settings page. However, users can disable auto-tagging for individual data sources, and Governors can disable any unwanted “Discovered” tags in the Immuta application to prevent them from being used and auto-detected in the future.

Configure Sensitive Data Discovery (Public Preview)

Users can configure SDD to customize how sensitive data is detected and what tags are applied to that data. For details, navigate to the Configure Sensitive Data Discovery page.

Considerations

  • SDD does not run on data sources with more than 1,600 columns.
  • Deleting the built-in Discovered tags is not recommended. If you delete a built-in Discovered tag and SDD later detects the corresponding identifier, the column will not be tagged. Instead, disable tags on a column-by-column basis from the data dictionary, or turn SDD off for an individual data source when creating it.
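The column limit and the per-data-source opt-out can be expressed as a simple pre-check. This is a sketch only: `MAX_SDD_COLUMNS`, `sdd_will_run`, and its parameters are hypothetical names, and only the 1,600-column figure comes from this page.

```python
# Limit taken from the considerations above; the constant name is hypothetical.
MAX_SDD_COLUMNS = 1600


def sdd_will_run(column_count, sdd_enabled_for_source=True):
    """Return True if SDD would be expected to run on a data source.

    SDD is skipped when it was turned off for the data source or when the
    data source has more than MAX_SDD_COLUMNS columns.
    """
    return sdd_enabled_for_source and column_count <= MAX_SDD_COLUMNS


print(sdd_will_run(1600))
print(sdd_will_run(1601))
print(sdd_will_run(50, sdd_enabled_for_source=False))
```

A check like this could be useful when planning very wide data sources, since columns in a source above the limit would need to be tagged manually.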