You are viewing documentation for Immuta version 2022.5.

Templated Policies in Immuta

When activated by Data Governors, templated policies are automatically enforced on data sources to which users or Sensitive Data Discovery have applied the relevant tags.

This page outlines the types of templated policies users can manage in Immuta. To learn how to activate a templated policy, see the Activate a Templated Policy section.

HIPAA De-identification Policy

HIPAA De-identification requires that

  • 18 direct identifiers are removed from data sources.
  • Data Owners do not have actual knowledge that Data Users could re-identify individuals.

The HIPAA De-identification policy is a Global Policy included in Immuta by default. When combined with Sensitive Data Discovery, this policy automatically applies to relevant data sources. However, to fully comply with HIPAA Safe Harbor, Data Owners will need to certify that tags on data sources are accurate; after the policy is applied, multiple warnings indicate that certification is required, including a "Policy Certification Required" label on the data source and on the policy. Additionally, owners will receive a notification to certify the policy.

Note: The HIPAA De-identification Policy is staged by default and cannot be edited by any user. However, Governors can clone this policy and then edit the clone.

HIPAA De-identification Policy Certification

The Data Owner and Data User certifications serve as official acknowledgements that the users and data comply with HIPAA Safe Harbor:

  • Data Owner Certification: Data Owners certify that all 18 identifiers have been correctly tagged and that they have no knowledge that the information in the data sources could be used by Data Users to identify individuals.
  • Data User Certification: Data Users agree to use the data only for the stated purpose of the project; refrain from sharing that data outside the project; not re-identify or take any steps to re-identify individuals' health information; notify the Project Owner or Governance team in the event that individuals have been identified or could be identified; and refrain from contacting individuals who might be identified.

HIPAA De-identification Policy Expert Determination (Public Preview)

HIPAA Expert Determination allows data scientists to increase the utility of datasets while still complying with strict HIPAA regulations that require a "very low" re-identification risk. Within a project, the project owner can adjust the k-anonymization noise across multiple columns to gain utility. To learn more about Expert Determination, see Policy Adjustments and HIPAA Expert Determination in Immuta.

California Consumer Privacy Act (CCPA) Policy

The CCPA policy is a Global Policy included in Immuta by default. When combined with Sensitive Data Discovery, this policy automatically applies to relevant data sources.

CCPA sets forth two routes to achieve compliance:

  • businesses processing consumer personal information abide by all applicable restrictions (e.g., purpose restrictions or consumer rights), and/or
  • businesses transform consumer personal information into de-identified or aggregate data so that restrictions, such as consumer rights, become inapplicable.

Under CCPA, de-identification is successfully performed if data “cannot reasonably identify, relate to, describe, be capable of being associated with, or be linked, directly or indirectly, to a particular consumer,” provided that an organization that uses de-identified information

  • implements technical safeguards that prohibit re-identification of the consumer to whom the information may pertain,
  • implements business processes that specifically prohibit re-identification of the information,
  • implements business processes to prevent inadvertent release of de-identified information, and
  • makes no attempt to re-identify the information.

Immuta’s CCPA de-identification policy was created to comply with this definition and consists of four main components, each of which addresses at least one prong of CCPA's de-identification test:

  • a self-executing data policy that applies a de-identification technique that serves as a technical safeguard to prohibit re-identification of the consumer.
  • certifications by the Data Owner. These serve as an official acknowledgement that the covered business has appropriately labeled consumer information at the outset and is not aware, before the data is re-used, that the Data User is in a position to re-identify consumers. This component is crucial to prevent inadvertent release of de-identified information.
  • certifications by the Data User. These serve as official acknowledgements that the Data User is subject to business processes that prohibit re-identification and inadvertent release of de-identified information to third parties.
  • functionalities to enable real-time monitoring and auditing of query-based access to data. These aim to deter and detect attempts to re-identify.

Note: The language used in certifications can be customized to meet specific needs of customers, such as when customers want to use specific language found in data-sharing agreements.

CCPA Policy Conditions

The data policy consists of four rules:

The first rule ensures that access to data can only happen for two types of use cases: those that require access to de-identified data (Re-identification Prohibited.CCPA) and those that require access to identifying data (Use Case Outside De-identification). Data Users are then strictly segmented by use case through attribute-based access control and purpose acknowledgement.

The second rule nulls direct identifiers and undetermined identifiers for Data Users with access to de-identified data.

The third rule generalizes indirect identifiers with k-anonymization so that the re-identifiability probability is always equal to or below 5% for Data Users with access to de-identified data. Note: Immuta has analyzed industry standards and thresholds recommended by statistical methods experts and selected the most restrictive value of 5% for the maximum re-identifiability probability.

The fourth rule applies the first three rules to all data sources containing columns tagged Discovered.Identifier Direct, Discovered.Identifier Indirect, or Discovered.Identifier Undetermined.
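
As a rough illustration of how the first, second, and fourth rules fit together, the following Python sketch decides from column tags whether the policy targets a data source, and nulls direct and undetermined identifiers for a user working under the de-identification purpose. This is not Immuta's implementation or API: the function names, the dictionary layout, and the in-memory rows are hypothetical. Only the tag and purpose names are taken from this page.

```python
# Hypothetical sketch of the rule logic; not Immuta's implementation or API.
DIRECT = "Discovered.Identifier Direct"
INDIRECT = "Discovered.Identifier Indirect"
UNDETERMINED = "Discovered.Identifier Undetermined"
DEIDENTIFIED_PURPOSE = "Re-identification Prohibited.CCPA"
IDENTIFYING_PURPOSE = "Use Case Outside De-identification"


def policy_applies(column_tags):
    """Rule 4: the policy targets any data source with a tagged identifier column."""
    targeted = {DIRECT, INDIRECT, UNDETERMINED}
    return any(targeted & set(tags) for tags in column_tags.values())


def null_identifiers(rows, column_tags, purpose):
    """Rules 1 and 2: gate access by purpose, then null direct and
    undetermined identifiers for de-identified access."""
    if purpose == IDENTIFYING_PURPOSE:
        return rows  # identifying use case: no masking in this sketch
    if purpose != DEIDENTIFIED_PURPOSE:
        raise PermissionError("no access outside the two allowed use cases")
    to_null = {
        col for col, tags in column_tags.items()
        if DIRECT in tags or UNDETERMINED in tags
    }
    return [
        {col: (None if col in to_null else val) for col, val in row.items()}
        for row in rows
    ]


# Made-up example data source with one column per tag category.
column_tags = {
    "email": [DIRECT],
    "zip_code": [INDIRECT],
    "notes": [UNDETERMINED],
    "purchase_total": [],
}
rows = [
    {"email": "a@example.com", "zip_code": "94107",
     "notes": "call back", "purchase_total": 42.5},
]

masked = null_identifiers(rows, column_tags, DEIDENTIFIED_PURPOSE)
print(masked)
```

In this sketch the indirect identifier (`zip_code`) passes through untouched; in the policy described above it would additionally be generalized by the third rule.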

Immuta's CCPA policy addresses both direct and indirect identifiers because robust de-identification requires considering all types of identifying attributes, and the identifiers are masked differently to maximize utility. With this combination of masking techniques, the data re-identification risk (the amount of re-identification possible for each data source) meets CCPA’s de-identification criteria.
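
One common way to reason about a 5% threshold is the worst-case model used with k-anonymity: a record's re-identification probability is taken to be 1/k, where k is the size of the smallest group of rows that share the same indirect-identifier values. Under that model, 5% corresponds to k of at least 20. The Python sketch below computes this quantity for a small in-memory table. It is an assumption-laden illustration: this page does not describe how Immuta calculates risk, and the function name, column names, and sample rows are hypothetical.

```python
from collections import Counter

MAX_RISK = 0.05  # maximum re-identifiability probability stated on this page


def max_reidentification_risk(rows, quasi_identifiers):
    """Return 1/k, where k is the size of the smallest group of rows
    sharing the same values for the given indirect identifiers."""
    if not rows:
        raise ValueError("rows must not be empty")
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    k = min(groups.values())
    return 1.0 / k


# Hypothetical generalized data: 20 rows share one (age band, ZIP prefix)
# pair and 25 rows share another, so the smallest group has k = 20.
rows = (
    [{"age_band": "30-39", "zip3": "941"}] * 20
    + [{"age_band": "40-49", "zip3": "941"}] * 25
)

risk = max_reidentification_risk(rows, ["age_band", "zip3"])
print(risk, risk <= MAX_RISK)
```

Adding a single row with a unique combination of values would drop k to 1 and push the risk to 100%, which is why generalization has to cover every indirect identifier together rather than one column at a time.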

Note: The CCPA policy is staged by default and cannot be edited by any user. However, Governors can clone this policy and then edit the clone. After customizing the clone, customers must verify that the overall re-identification risk is still acceptable.

New Column Added

When paired with Schema Monitoring, this policy masks columns that are newly added to data sources until Data Owners review and approve these changes from the Requests tab of their profile page.
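
This behavior can be pictured as a default-deny rule on columns: any column that a Data Owner has not yet approved is masked. The Python sketch below is a minimal illustration under stated assumptions, not Immuta's implementation. The names `approved_columns` and `mask_unapproved` are hypothetical, and nulling is assumed as the masking type.

```python
def mask_unapproved(row, approved_columns):
    """Null every column that a Data Owner has not yet approved."""
    return {
        col: (val if col in approved_columns else None)
        for col, val in row.items()
    }


approved_columns = {"id", "state"}
# "ssn" stands in for a column that was just added to the data source.
row = {"id": 7, "state": "CA", "ssn": "000-00-0000"}

masked_row = mask_unapproved(row, approved_columns)
print(masked_row)  # {'id': 7, 'state': 'CA', 'ssn': None}
```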

Activate a Templated Policy

  1. Click the Policies icon in the left sidebar and navigate to the Data Policies tab.
  2. Click the dropdown menu in the Actions column of one of the templated policies and select Activate. Note: If Data Governors decide to stage an active policy, they select Stage from this dropdown menu.

The templated policy is now applied to all relevant data sources.