Sensitive Data Discovery (SDD) API Reference Guide

Workflow

Create a custom classifier.
Create a template containing one or more classifiers.
Search for classifiers or templates.
Apply templates to one or more data sources.
Run SDD on one or more data sources; tags are applied to columns where classifiers were detected.
Update classifiers or templates.
Delete classifiers or templates.
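
As a rough map of the workflow above onto the endpoints documented in this guide, the following Python sketch lists the method and path used at each step. The names `WORKFLOW` and `describe` are illustrative and not part of the SDD API, and the delete step is omitted here.

```python
# Illustrative mapping of workflow steps to the endpoints in this guide.
# WORKFLOW and describe() are hypothetical names, not part of the SDD API.
WORKFLOW = [
    ("Create a custom classifier", "POST", "sdd/classifier"),
    ("Create a template", "POST", "sdd/template"),
    ("Search for classifiers", "GET", "sdd/classifier"),
    ("Search for templates", "GET", "sdd/template"),
    ("Apply a template to data sources", "PUT", "sdd/template/apply"),
    ("Run SDD on data sources", "POST", "sdd/run"),
    ("Update a classifier", "PUT", "sdd/classifier/{classifierName}"),
    ("Update a template", "PUT", "sdd/template/{templateName}"),
]


def describe(base_url):
    """Return one 'METHOD url  # step' line per workflow step."""
    base = base_url.rstrip("/")
    return [f"{method} {base}/{path}  # {step}" for step, method, path in WORKFLOW]
```

Calling `describe("https://your-immuta-url.immuta.com")` returns the full URL for each step, which can serve as a checklist while working through the sections below.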

Create a custom classifier

Endpoint

Method	Path	Purpose
POST	`sdd/classifier`	Create a classifier.

Query Parameters

None.

Payload Parameters

Attribute	Description	Required
name	`string` Unique, request-friendly classifier name.	Yes
displayName	`string` Unique, human-readable classifier name.	Yes
description	`string` The classifier description.	Yes
type	`string` The type of classifier: `regex`, `dictionary`, `columnNameRegex`, or `builtIn`.	Yes
config	`object` May include `config.minConfidence`, `config.values`, `config.caseSensitive`, `config.regex`, `config.columnNameRegex`, and `config.tags`. *See descriptions below.	Yes
minConfidence*	`number` When the detection confidence is at least this percentage, tags are applied.	Yes
tags*	`array[string]` The name of the tags to apply to the data source.	Yes
regex*	`string` A case-insensitive regular expression to match against column values.	No
columnNameRegex*	`string` A case-insensitive regular expression to match against column names.	No
values*	`array[string]` The list of words to include in the dictionary.	No
caseSensitive*	`boolean` Indicates whether or not `values` are case sensitive. Defaults to `false`.	No
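
The required attributes in the table above can be checked on the client before a request is sent. The sketch below is one possible way to do that in Python; the function name `missing_fields` is hypothetical, and the idea that each classifier type needs its own config key (for example `config.regex` for type `regex`) is an assumption inferred from the payload examples in this guide rather than a documented rule. It does not check `minConfidence`, which the column name regex example omits.

```python
# Client-side check of a classifier payload before sending it.
# The per-type required key (e.g. "regex" for type "regex") is an assumption
# inferred from the payload examples, not a documented rule.
REQUIRED_TOP_LEVEL = ("name", "displayName", "description", "type", "config")
TYPE_SPECIFIC_KEY = {
    "regex": "regex",
    "dictionary": "values",
    "columnNameRegex": "columnNameRegex",
}


def missing_fields(payload):
    """Return a list of missing attribute names (empty if none are missing)."""
    missing = [key for key in REQUIRED_TOP_LEVEL if key not in payload]
    config = payload.get("config", {})
    if "tags" not in config:
        missing.append("config.tags")
    type_key = TYPE_SPECIFIC_KEY.get(payload.get("type"))
    if type_key and type_key not in config:
        missing.append(f"config.{type_key}")
    return missing
```

An empty list means the payload has every attribute this sketch looks for; the server may still apply additional validation.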

Response Parameters

Attribute	Description
createdBy	`object` Includes details about the user who created the classifier, such as their profile `id`, `name`, and `email`.
name	`string` Unique, request-friendly classifier name.
displayName	`string` Unique, human-readable classifier name.
description	`string` The classifier description.
type	`string` The type of classifier: `regex`, `dictionary`, `columnNameRegex`, or `builtIn`.
config	`object` May include `config.minConfidence`, `config.values`, `config.caseSensitive`, `config.regex`, `config.columnNameRegex`, and `config.tags`. *See descriptions below.
minConfidence*	`number` When the detection confidence is at least this percentage, tags are applied.
tags*	`array[string]` The name of the tags to apply to the data source.
columnNameRegex*	`string` A case-insensitive regular expression to optionally match against column names.
regex*	`string` A case-insensitive regular expression to match against column values.
values*	`array[string]` The list of words included in the dictionary.
caseSensitive*	`boolean` Indicates whether or not `values` are case sensitive. Defaults to `false`.
createdAt	`date` When the classifier was created.
updatedAt	`date` When the classifier was last updated.

Request example

The following request creates a custom classifier; the request payload is saved in example-payload.json.

curl \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/classifier
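
For readers scripting this call rather than using curl, the same request might be built in Python roughly as follows. This is a minimal sketch using only the standard library; the function name `build_create_classifier_request` is hypothetical, and the function builds the request without sending it.

```python
import json
import urllib.request


def build_create_classifier_request(base_url, token, payload):
    """Build (but do not send) a POST request to sdd/classifier."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/sdd/classifier",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


# To send it: urllib.request.urlopen(request) returns the HTTP response,
# whose body can be decoded with json.load(response).
```

Keeping the build step separate from the send step makes it easy to inspect the URL, headers, and body before anything reaches the server.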

Payload examples

Regex classifier payload

{
  "name": "MY_REGEX_CLASSIFIER",
  "displayName": "My Regex Classifier",
  "description": "A classifier using regex",
  "type": "regex",
  "config": {
    "regex": "^[A-Z][a-z]+",
    "minConfidence": 0.5,
    "tags": ["Discovered.regex-example"]
  }
}

Dictionary classifier payload

{
  "name": "MY_DICTIONARY_CLASSIFIER",
  "displayName": "My Dictionary Classifier",
  "description": "A classifier using dictionary",
  "type": "dictionary",
  "config": {
    "values": ["Bob", "Eve"],
    "caseSensitive": true,
    "minConfidence": 0.6,
    "tags": ["Discovered.dictionary-example", "Discovered.dictionary-identifier-example"]
  }
}

Column name regex classifier payload

{
  "name": "MY_COLUMN_NAME_REGEX_CLASSIFIER",
  "displayName": "My Column Name Regex Classifier",
  "description": "A classifier using column name regex",
  "type": "columnNameRegex",
  "config": {
    "columnNameRegex": "ssn|social ?security",
    "tags": ["Discovered.column-name-regex"]
  }
}

Response example

{
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "name": "MY_REGEX_CLASSIFIER",
  "displayName": "My Regex Classifier",
  "description": "A classifier using regex",
  "type": "regex",
  "config": {
    "tags": [
      "Discovered.regex-example"
    ],
    "regex": "^[A-Z][a-z]+",
    "minConfidence": 0.5
  },
  "id": 67,
  "createdAt": "2021-10-14T18:48:56.289Z",
  "updatedAt": "2021-10-14T18:48:56.289Z"
}
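
The `createdAt` and `updatedAt` values in the response are timestamp strings. A small Python sketch for turning them into `datetime` objects is shown below; it assumes every timestamp has the same shape as the ones in the example (fractional seconds and a trailing `Z` for UTC), and `parse_timestamp` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def parse_timestamp(value):
    """Parse a timestamp such as '2021-10-14T18:48:56.289Z' as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def was_modified(classifier):
    """Return True when updatedAt is later than createdAt."""
    created = parse_timestamp(classifier["createdAt"])
    updated = parse_timestamp(classifier["updatedAt"])
    return updated > created
```

For the response above, where both timestamps are identical, `was_modified` returns `False`.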

Create a template

Endpoint

Method	Path	Purpose
POST	`sdd/template`	Create a template.

Query Parameters

None.

Payload Parameters

Attribute	Description	Required
name	`string` Unique, request-friendly template name.	Yes
displayName	`string` Unique, human-readable template name.	Yes
description	`string` The template description.	Yes
classifiers	`array` Includes each classifier's `name` and `overrides` for `minConfidence` and `tags`.	Yes
sampleSize	`integer` Override for how many records to sample from the data source.	No
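
A template payload with per-classifier overrides could be assembled as in the Python sketch below. The helper name `template_payload` is hypothetical, and placing overrides under an `overrides` key on each classifier entry is an assumption based on the shape of the response examples in this guide.

```python
def template_payload(name, display_name, description, classifiers, sample_size=None):
    """Build a template payload.

    `classifiers` maps each classifier name to its overrides dict, for
    example {"MY_REGEX_CLASSIFIER": {"minConfidence": 0.8}}; use {} for none.
    """
    payload = {
        "name": name,
        "displayName": display_name,
        "description": description,
        "classifiers": [],
    }
    for classifier_name, overrides in classifiers.items():
        entry = {"name": classifier_name}
        if overrides:  # only send overrides that were actually given
            entry["overrides"] = dict(overrides)
        payload["classifiers"].append(entry)
    if sample_size is not None:  # sampleSize is optional
        payload["sampleSize"] = sample_size
    return payload
```

Leaving `sample_size` unset omits `sampleSize` from the payload entirely, which matches its optional status in the table above.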

Response Parameters

Attribute	Description
id	`integer` The unique ID of the template.
createdBy	`object` Includes details about the user who created the template, such as their profile `id`, `name`, and `email`.
name	`string` Unique, request-friendly template name.
displayName	`string` Unique, human-readable template name.
description	`string` The template description.
classifiers	`array` Includes details about the classifiers within the template, such as the `name` and `overrides`.
sampleSize	`integer` Optional override of how many records to sample from the data source.
createdAt	`date` When the template was created.
updatedAt	`date` When the template was last updated.

Request example

The following request creates a template that contains 2 classifiers; the request payload is saved in example-payload.json.

curl \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/template

Payload example

{
  "name": "MY_FIRST_TEMPLATE",
  "displayName": "My First Template",
  "description": "This is the first template I've created.",
  "classifiers": [
    {
      "name": "MY_COLUMN_NAME_REGEX_CLASSIFIER"
    },
    {
      "name": "MY_REGEX_CLASSIFIER"
    }
  ],
  "sampleSize": 100
}

Response example

{
  "name": "MY_FIRST_TEMPLATE",
  "displayName": "My First Template",
  "description": "This is the first template I've created.",
  "sampleSize": 100,
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "id": 1,
  "createdAt": "2021-10-14T19:12:22.092Z",
  "updatedAt": "2021-10-14T19:12:22.092Z",
  "classifiers": [
    {
      "name": "MY_COLUMN_NAME_REGEX_CLASSIFIER",
      "overrides": {}
    },
    {
      "name": "MY_REGEX_CLASSIFIER",
      "overrides": {}
    }
  ]
}

Search for classifiers or templates

Method	Path	Purpose
GET	`sdd/classifier`	List or search classifiers.
GET	`sdd/template`	List or search templates.
GET	`sdd/classifier/{classifierName}`	View a specific classifier by name.
GET	`sdd/template/{templateName}`	View a specific template by name.
GET	`sdd/template/global`	View the current global SDD template.

List or search for classifiers

Endpoint

Method	Path	Purpose
GET	`sdd/classifier`	List or search classifiers.

Query Parameters

Attribute	Description	Required
sortField	`string` The field by which to sort the search results: `id`, `name`, `displayName`, `type`, `createdAt`, or `updatedAt`.	No
sortOrder	`string` Denotes whether to sort the results in ascending (`asc`) or descending (`desc`) order. Default is `asc`.	No
offSet	`integer` Use in combination with `limit` to fetch pages.	No
limit	`integer` Limits the number of results displayed per page.	No
type	`array[string]` Searches for classifiers based on classifier type: `regex`, `dictionary`, `builtIn`, or `columnNameRegex`.	No
searchText	`string` A partial, case-insensitive search on name.	No
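
The query parameters above are appended to the endpoint URL as a query string. The Python sketch below shows one way to build that URL with the standard library; `classifier_search_url` is a hypothetical helper, and encoding an array parameter such as `type` as repeated keys is an assumption about what the server accepts.

```python
from urllib.parse import urlencode


def classifier_search_url(base_url, **params):
    """Build the GET sdd/classifier URL, dropping parameters set to None."""
    query = {key: value for key, value in params.items() if value is not None}
    # doseq=True repeats list values (type=regex&type=dictionary); how the
    # server expects array parameters to be encoded is an assumption here.
    encoded = urlencode(query, doseq=True)
    url = f"{base_url.rstrip('/')}/sdd/classifier"
    return f"{url}?{encoded}" if encoded else url
```

Building the query string with `urlencode` also percent-encodes any special characters in `searchText`.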

Response Parameters

Attribute	Description
count	`integer` The number of classifiers found matching the search criteria.
createdBy	`object` Includes details about the user who created the classifier, such as their profile `id`, `name`, and `email`.
name	`string` Unique, request-friendly classifier name.
displayName	`string` Unique, human-readable classifier name.
description	`string` The classifier description.
type	`string` The type of classifier: `regex`, `dictionary`, `columnNameRegex`, or `builtIn`.
config	`object` May include `config.minConfidence`, `config.values`, `config.caseSensitive`, `config.regex`, `config.columnNameRegex`, and `config.tags`. *See descriptions below.
minConfidence*	`number` When the detection confidence is at least this percentage, tags are applied.
tags*	`array[string]` The name of the tags to apply to the data source.
columnNameRegex*	`string` A case-insensitive regular expression to optionally match against column names.
regex*	`string` A case-insensitive regular expression to match against column values.
values*	`array[string]` The list of words included in the dictionary.
caseSensitive*	`boolean` Indicates whether or not `values` are case sensitive. Defaults to `false`.
createdAt	`date` When the classifier was created.
updatedAt	`date` When the classifier was last updated.

Request example

The following request lists 5 classifiers.

curl \
    --request GET \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    "https://your-immuta-url.immuta.com/sdd/classifier?sortField=name&sortOrder=asc&limit=5"

Response example

{
  "count": 67,
  "hits": [
    {
      "createdBy": {
        "id": 21,
        "name": "Immuta System Account",
        "email": "immuta_system@immuta.com"
      },
      "name": "AGE",
      "displayName": "Age",
      "description": "Detects numeric strings between 10 and 199, provided the column header contains text such as `age`, `year`, `years`, `yr`, or `yrs`.",
      "type": "builtIn",
      "config": {
        "minConfidence": 0.7,
        "tags": [
          "Discovered.PII",
          "Discovered.Identifier Indirect",
          "Discovered.PHI",
          "Discovered.Entity.Age"
        ],
        "conditionalTags": {}
      },
      "id": 3,
      "createdAt": "2021-10-28T07:34:58.761Z",
      "updatedAt": "2021-10-28T07:34:58.761Z"
    },
    {
      "createdBy": {
        "id": 21,
        "name": "Immuta System Account",
        "email": "immuta_system@immuta.com"
      },
      "name": "ARGENTINA_DNI_NUMBER",
      "displayName": "Argentina DNI Number",
      "description": "Detects strings consistent with Argentina National Identity (DNI) Number.  Requires an eight digit number with optional periods between the second and third and fifth and sixth digit.",
      "type": "builtIn",
      "config": {
        "minConfidence": 0.7,
        "tags": [
          "Discovered.PII",
          "Discovered.Identifier Direct",
          "Discovered.Country.Argentina",
          "Discovered.PHI",
          "Discovered.Entity.DNI Number"
        ],
        "conditionalTags": {}
      },
      "id": 4,
      "createdAt": "2021-10-28T07:34:58.769Z",
      "updatedAt": "2021-10-28T07:34:58.769Z"
    },
    {
      "createdBy": {
        "id": 21,
        "name": "Immuta System Account",
        "email": "immuta_system@immuta.com"
      },
      "name": "AUSTRALIA_MEDICARE_NUMBER",
      "displayName": "Australia Medicare Number",
      "description": "Detects numeric strings consistent with Australian Medicare Number.  Requires a ten or eleven digit number.  The starting digit must be between 2 and 6, inclusive.  Optional spaces can be placed between the fourth and fifth and ninth and tenth digit.  Optional 11th separated by a `/` can be present.  A checksum is required.",
      "type": "builtIn",
      "config": {
        "minConfidence": 0.7,
        "tags": [
          "Discovered.PII",
          "Discovered.Identifier Direct",
          "Discovered.Country.Australia",
          "Discovered.PHI",
          "Discovered.Entity.Medicare Number"
        ],
        "conditionalTags": {}
      },
      "id": 5,
      "createdAt": "2021-10-28T07:34:58.779Z",
      "updatedAt": "2021-10-28T07:34:58.779Z"
    },
    {
      "createdBy": {
        "id": 21,
        "name": "Immuta System Account",
        "email": "immuta_system@immuta.com"
      },
      "name": "AUSTRALIA_PASSPORT",
      "displayName": "Australia Passport",
      "description": "Detects strings consistent with Australian Passport number.  A 8 or 9 character string is required, with a starting upper case character (N, E, D, F, A, C, U, X) or a two character starting character (P followed by A, B, C, D, E, F, U, W, X, or Z) followed by seven digits",
      "type": "builtIn",
      "config": {
        "minConfidence": 0.7,
        "tags": [
          "Discovered.PII",
          "Discovered.Identifier Direct",
          "Discovered.Country.Australia",
          "Discovered.PHI",
          "Discovered.Entity.Passport"
        ],
        "conditionalTags": {}
      },
      "id": 26,
      "createdAt": "2021-10-28T07:34:59.010Z",
      "updatedAt": "2021-10-28T07:34:59.010Z"
    },
    {
      "createdBy": {
        "id": 21,
        "name": "Immuta System Account",
        "email": "immuta_system@immuta.com"
      },
      "name": "AUSTRALIA_TAX_FILE_NUMBER",
      "displayName": "Australia Tax File Number",
      "description": "Detects strings consistent with Australia Tax File Number.  Requires a nine digit number with optional spaces between the third and fourth and sixth and seventh digits.  A checksum is also required",
      "type": "builtIn",
      "config": {
        "minConfidence": 0.7,
        "tags": [
          "Discovered.PII",
          "Discovered.Identifier Direct",
          "Discovered.Country.Australia",
          "Discovered.PHI",
          "Discovered.Entity.Tax File Number"
        ],
        "conditionalTags": {}
      },
      "id": 6,
      "createdAt": "2021-10-28T07:34:58.789Z",
      "updatedAt": "2021-10-28T07:34:58.789Z"
    }
  ]
}
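
Because `count` in the response above (67) is larger than the number of `hits` returned (5), retrieving every classifier takes several requests. The following Python sketch shows one possible paging loop. It assumes the offset parameter means "number of results to skip", and `fetch_page` is a hypothetical caller-supplied function that performs the GET request and returns the decoded JSON.

```python
def iter_hits(fetch_page, limit=50):
    """Yield every hit by requesting pages until `count` results are seen.

    `fetch_page(offset, limit)` is a caller-supplied function (hypothetical)
    that performs the GET request and returns the decoded JSON response,
    i.e. a dict with "count" and "hits" as in the example above.
    """
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        hits = page.get("hits", [])
        yield from hits
        offset += len(hits)
        # Stop when the reported total is reached or a page comes back empty.
        if not hits or offset >= page.get("count", 0):
            break
```

Passing the fetch function in as an argument keeps the loop independent of any particular HTTP client and makes it straightforward to exercise with canned responses.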

List or search for templates

Endpoint

Method	Path	Purpose
GET	`sdd/template`	List or search templates.

Query Parameters

Attribute	Description	Required
sortField	`string` The field by which to sort the search results: `id`, `name`, `displayName`, `type`, `createdAt`, or `updatedAt`.	No
sortOrder	`string` Denotes whether to sort the results in ascending (`asc`) or descending (`desc`) order. Default is `asc`.	No
offSet	`integer` Use in combination with `limit` to fetch pages.	No
limit	`integer` Limits the number of results displayed per page.	No
classifiers	`array[string]` Filters template results to those containing the specified classifiers.	No
searchText	`string` A partial, case-insensitive search on the template name.	No

Response Parameters

Attribute	Description
count	`integer` The number of templates found matching the search criteria.
id	`integer` The unique ID of the template.
createdBy	`object` Includes details about the user who created the template, such as their profile `id`, `name`, and `email`.
name	`string` Unique, request-friendly template name.
displayName	`string` Unique, human-readable template name.
description	`string` The template description.
classifiers	`array` Includes details about the classifiers within the template, such as the `name` and `overrides`.
sampleSize	`integer` Optional override of how many records to sample from the data source.
createdAt	`date` When the template was created.
updatedAt	`date` When the template was last updated.

Request example

The following request lists all custom templates.

curl \
    --request GET \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    https://your-immuta-url.immuta.com/sdd/template

Response example

{
  "count": 1,
  "hits": [
    {
      "name": "MY_FIRST_TEMPLATE",
      "displayName": "My First Template",
      "description": "This is the first template I've created.",
      "sampleSize": 100,
      "createdBy": {
        "id": 1,
        "name": "John",
        "email": "john@example.com"
      },
      "id": 1,
      "createdAt": "2021-10-14T19:12:22.092Z",
      "updatedAt": "2021-10-14T19:12:22.092Z",
      "classifiers": [
        {
          "name": "MY_COLUMN_NAME_REGEX_CLASSIFIER",
          "overrides": {}
        },
        {
          "name": "MY_REGEX_CLASSIFIER",
          "overrides": {}
        }
      ]
    }
  ]
}

View a classifier by name

Endpoint

Method	Path	Purpose
GET	`sdd/classifier/{classifierName}`	Get a classifier by name.

Query Parameters

Attribute	Description	Required
classifierName	`string` The name of the classifier, passed as part of the request path.	Yes

Response Parameters

Attribute	Description
id	`integer` The unique ID of the classifier.
createdBy	`object` Includes details about the user who created the classifier, such as their profile `id`, `name`, and `email`.
name	`string` Unique, request-friendly classifier name.
displayName	`string` Unique, human-readable classifier name.
description	`string` The classifier description.
type	`string` The type of classifier: `regex`, `dictionary`, `columnNameRegex`, or `builtIn`.
config	`object` May include `config.minConfidence`, `config.values`, `config.caseSensitive`, `config.regex`, `config.columnNameRegex`, and `config.tags`. *See descriptions below.
minConfidence*	`number` When the detection confidence is at least this percentage, tags are applied.
tags*	`array[string]` The name of the tags to apply to the data source.
columnNameRegex*	`string` A case-insensitive regular expression to optionally match against column names.
regex*	`string` A case-insensitive regular expression to match against column values.
values*	`array[string]` The list of words included in the dictionary.
caseSensitive*	`boolean` Indicates whether or not `values` are case sensitive. Defaults to `false`.
createdAt	`date` When the classifier was created.
updatedAt	`date` When the classifier was last updated.

Request example

This request gets the classifier named MY_REGEX_CLASSIFIER.

curl \
    --request GET \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    https://your-immuta-url.immuta.com/sdd/classifier/MY_REGEX_CLASSIFIER
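
When the classifier name is supplied at run time, it is safer to percent-encode it before placing it in the path. The Python sketch below illustrates this with a hypothetical helper, `classifier_url`; whether the server requires or accepts encoded names is an assumption.

```python
from urllib.parse import quote


def classifier_url(base_url, classifier_name):
    """Build the URL for GET sdd/classifier/{classifierName}."""
    # safe="" also escapes "/" so the name stays a single path segment.
    return f"{base_url.rstrip('/')}/sdd/classifier/{quote(classifier_name, safe='')}"
```

Names made up of letters, digits, and underscores, such as MY_REGEX_CLASSIFIER, pass through unchanged.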

Response example

{
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "name": "MY_REGEX_CLASSIFIER",
  "displayName": "My Regex Classifier",
  "description": "A classifier using regex",
  "type": "regex",
  "config": {
    "tags": [
      "Discovered.regex-example"
    ],
    "regex": "^[A-Z][a-z]+",
    "minConfidence": 0.5
  },
  "id": 67,
  "createdAt": "2021-10-18T16:48:18.819Z",
  "updatedAt": "2021-10-18T16:48:18.819Z"
}

View a template by name

Endpoint

Method	Path	Purpose
GET	`sdd/template/{templateName}`	Get a template by name.

Query Parameters

Attribute	Description	Required
templateName	`string` The name of the template, passed as part of the request path.	Yes

Response Parameters

Attribute	Description
id	`integer` The unique ID of the template.
createdBy	`object` Includes details about the user who created the template, such as their profile `id`, `name`, and `email`.
name	`string` Unique, request-friendly template name.
displayName	`string` Unique, human-readable template name.
description	`string` The template description.
classifiers	`array` Includes details about the classifiers within the template, such as the `name` and `overrides`.
sampleSize	`integer` Optional override of how many records to sample from the data source.
createdAt	`date` When the template was created.
updatedAt	`date` When the template was last updated.

Request example

This request gets the template named MY_FIRST_TEMPLATE.

curl \
    --request GET \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    https://your-immuta-url.immuta.com/sdd/template/MY_FIRST_TEMPLATE

Response example

{
  "name": "MY_FIRST_TEMPLATE",
  "displayName": "My First Template",
  "description": "This is the first template I've created.",
  "sampleSize": 100,
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@immuta.com"
  },
  "id": 1,
  "createdAt": "2021-10-18T16:54:24.920Z",
  "updatedAt": "2021-10-18T16:54:24.920Z",
  "classifiers": [
    {
      "name": "MY_DICTIONARY_CLASSIFIER",
      "overrides": {}
    },
    {
      "name": "MY_REGEX_CLASSIFIER",
      "overrides": {}
    }
  ]
}

View the current global template

Endpoint

Method	Path	Purpose
GET	`sdd/template/global`	View the current global SDD template.

Query Parameters

None.

Response Parameters

Attribute	Description
id	`integer` The unique ID of the template.
name	`string` Unique, request-friendly template name.
displayName	`string` Unique, human-readable template name.
description	`string` The template description.
classifiers	`array` Includes details about the classifiers within the template, such as the `name` and `overrides`.
sampleSize	`integer` Optional override of how many records to sample from the data source.
createdBy	`object` Includes details about the user who created the template, such as their profile `id`, `name`, and `email`.
createdAt	`date` When the template was created.
updatedAt	`date` When the template was last updated.

Request example

This request gets the current global SDD template information.

curl -X 'GET' \
  'https://demo.immuta.com/sdd/template/global' \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer 9ba76f3c64c345ad817fa467d7110556'

Response example

{
  "name": "MY_FIRST_TEMPLATE",
  "displayName": "My First Template",
  "description": "This is the first template I've created.",
  "sampleSize": 100,
  "createdBy": {
    "id": 2,
    "name": "Jane Doe",
    "email": "jane.doe@immuta.com"
  },
  "id": 1,
  "createdAt": "2022-08-10T20:35:43.252Z",
  "updatedAt": "2022-08-10T20:35:43.252Z",
  "classifiers": [
    {
      "name": "AGE",
      "overrides": {}
    },
    {
      "name": "ETHNIC_GROUP",
      "overrides": {}
    }
  ]
}

Apply templates to data sources

Endpoint

Method	Path	Purpose
PUT	`sdd/template/apply`	Apply a template to a set of data sources.

Query Parameters

None.

Payload Parameters

Attribute	Description	Required
template	`string` The name of the template to apply to the data sources; `null` to clear current template.	Yes
sources	`array[string]` The names of the data sources to apply the template to.	Yes

Response Parameters

Attribute	Description
success	`boolean` When `true`, the request was successful.

Request example

This request applies the MY_FIRST_TEMPLATE template to the Public Case data source.

curl \
    --request PUT \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/template/apply

Payload example

{
  "template": "MY_FIRST_TEMPLATE",
  "sources": [
    "Public Case"
  ]
}
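
The payload parameters note that `template` may be `null` to clear the current template. In Python that corresponds to `None`, as the sketch below shows; `apply_template_body` is a hypothetical helper that only serializes the payload and does not send it.

```python
import json


def apply_template_body(template, sources):
    """Serialize the sdd/template/apply payload.

    Pass template=None to clear the current template; json.dumps writes
    Python None as JSON null.
    """
    if isinstance(sources, str):
        sources = [sources]  # accept a single data source name
    return json.dumps({"template": template, "sources": list(sources)})
```

For example, `apply_template_body(None, ["Public Case"])` produces a body whose `template` value is `null`.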

Response example

{
  "success": true
}

Run SDD on data sources

Endpoint

Method	Path	Purpose
POST	`sdd/run`	Run SDD on specified data sources.

Query Parameters

None.

Payload Parameters

Attribute	Description	Required
sources	`array[string]` The names of the data sources to run SDD on.	Yes
all	`boolean` If `true`, SDD will run on all Immuta data sources.	No
wait	`integer` The number of seconds to wait for the SDD jobs to finish. The value `-1` will wait until the jobs complete. Default is `-1`.	No
dryRun	`boolean` When `true`, SDD will not update the tags on the data source(s) and will just return what tags would have been applied or removed. Default is `false`.	No
template	`string` If passed, Immuta will run SDD with this template instead of the applied template on the data source(s). Passing `template` when `dryRun` is `false` will cause an error.	No
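
The rule that `template` may only be passed together with `dryRun` can be caught before the request is sent. The Python sketch below builds the payload and raises an error for that combination. The helper name `run_payload` is hypothetical, and the second check (requiring either data sources or `all`) is a client-side assumption rather than documented server behavior.

```python
def run_payload(sources=None, run_all=False, wait=None, dry_run=False, template=None):
    """Build the sdd/run payload, rejecting combinations documented as errors."""
    if template is not None and not dry_run:
        # Documented: passing `template` when `dryRun` is false causes an error.
        raise ValueError("template can only be passed when dry_run is True")
    if not sources and not run_all:
        # Assumption: a run needs either named data sources or all=true.
        raise ValueError("give at least one data source or set run_all=True")
    payload = {}
    if sources:
        payload["sources"] = list(sources)
    if run_all:
        payload["all"] = True
    if wait is not None:
        payload["wait"] = wait
    if dry_run:
        payload["dryRun"] = True
    if template is not None:
        payload["template"] = template
    return payload
```

Optional attributes are left out of the payload unless they are set, so the server-side defaults described in the table still apply.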

Response Parameters

Attribute	Description
id	`string` A job universally unique identifier.
state	`string` The job state: `created`, `retry`, `active`, `completed`, `expired`, `cancelled`, or `failed`.
output	`object` Information about the tags applied on the data source, including `diff` (`addedTags` and `removedTags`) and `sddTagResult`, the current state of the tags on all columns in the data source.

Request example: Run SDD on a single data source

This request runs SDD on the data source Insurance Data.

curl \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/run

Payload example

{
  "sources": [
    "Insurance Data"
  ]
}

Response example

{
  "Insurance Data": {
    "id": "d2edc1d0-328c-11ec-9d5a-6793988ccf95",
    "state": "completed",
    "output": {
      "diff": {
        "addedTags": {
          "ssn": [
            "Discovered.PII"
          ],
          "email": [
            "Discovered.PII"
          ]
        },
        "removedTags": {
          "ssn": [
            "Discovered.Country.US"
          ]
        }
      },
      "sddTagResult": {
        "ssn": [
          "Discovered.Entity.Social Security Number",
          "Discovered.Identifier Direct",
          "Discovered.PHI",
          "Discovered.PII"
        ],
        "email": [
          "Discovered.Entity.Electronic Mail Address",
          "Discovered.Identifier Direct",
          "Discovered.PHI",
          "Discovered.PII"
        ]
      }
    }
  }
}
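
A response shaped like the example above can be reduced to a short per-data-source summary. The Python sketch below counts added and removed tags from `output.diff`; `summarize_run` is a hypothetical helper, and it assumes the key names shown in the example (`addedTags`, `removedTags`).

```python
def summarize_run(response):
    """Return {data source: (state, added count, removed count)}."""
    summary = {}
    for source, job in response.items():
        # A job that has not finished may have no output yet.
        diff = (job.get("output") or {}).get("diff", {})
        added = sum(len(tags) for tags in diff.get("addedTags", {}).values())
        removed = sum(len(tags) for tags in diff.get("removedTags", {}).values())
        summary[source] = (job.get("state"), added, removed)
    return summary
```

Applied to the example response, the summary for Insurance Data is the state `completed` with 2 tags added and 1 removed.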

Request example: Run SDD on all data sources

This request runs SDD on all your data sources.

curl \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/run

Payload example

{
  "all": true
}

Response example

{
  "Insurance Data": {
    "id": "d2edc1d0-328c-11ec-9d5a-6793988ccf95",
    "state": "completed",
    "output": {
      "diff": {
        "addedTags": {
          "ssn": [
            "Discovered.PII"
          ],
          "email": [
            "Discovered.PII"
          ]
        },
        "removedTags": {
          "ssn": [
            "Discovered.Country.US"
          ]
        }
      },
      "sddTagResult": {
        "ssn": [
          "Discovered.Entity.Social Security Number",
          "Discovered.Identifier Direct",
          "Discovered.PHI",
          "Discovered.PII"
        ],
        "email": [
          "Discovered.Entity.Electronic Mail Address",
          "Discovered.Identifier Direct",
          "Discovered.PHI",
          "Discovered.PII"
        ]
      }
    }
  },
  "Finance Data": {
    "id": "d2edc1d0-328c-11ec-9d5a-695e896d59s",
    "state": "completed",
    "output": {
      "diff": {
        "addedTags": {
          "ssn": [
            "Discovered.PII"
          ],
          "email": [
            "Discovered.PII"
          ]
        },
        "removedTags": {
          "ssn": [
            "Discovered.Country.US"
          ]
        }
      },
      "sddTagResult": {
        "ssn": [
          "Discovered.Entity.Social Security Number",
          "Discovered.Identifier Direct",
          "Discovered.PHI",
          "Discovered.PII"
        ],
        "email": [
          "Discovered.Entity.Electronic Mail Address",
          "Discovered.Identifier Direct",
          "Discovered.PHI",
          "Discovered.PII"
        ]
      }
    }
  }
}

Update classifiers or templates

Method	Path	Purpose
PUT	`sdd/classifier/{classifierName}`	Update a classifier. Partial updates are not supported.
POST	`sdd/template/{templateName}/clone`	Clone a template.
PUT	`sdd/template/{templateName}`	Update a template.

Update a classifier

Endpoint

Method	Path	Purpose
PUT	`sdd/classifier/{classifierName}`	Update a classifier. Partial updates are not supported.

Query Parameters

Attribute	Description	Required
classifierName	`string` The name of the classifier to update, passed as part of the request path.	Yes

Payload Parameters

Attribute	Description	Required
name	`string` Unique, request-friendly classifier name.	Yes
displayName	`string` Unique, human-readable classifier name.	Yes
description	`string` The classifier description.	Yes
type	`string` The type of classifier: `regex`, `dictionary`, `columnNameRegex`, or `builtIn`.	Yes
config	`object` May include `config.minConfidence`, `config.values`, `config.caseSensitive`, `config.regex`, `config.columnNameRegex`, and `config.tags`. *See descriptions below.	Yes
minConfidence*	`number` When the detection confidence is at least this percentage, tags are applied.	Yes
tags*	`array[string]` The name of the tags to apply to the data source.	Yes
regex*	`string` A case-insensitive regular expression to match against column values.	No
columnNameRegex*	`string` A case-insensitive regular expression to match against column names.	No
values*	`array[string]` The list of words to include in the dictionary.	No
caseSensitive*	`boolean` Indicates whether or not `values` are case sensitive. Defaults to `false`.	No
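
Because partial updates are not supported, a common pattern is to fetch the current classifier, change the attributes of interest, and send the complete object back. The Python sketch below shows the merge step only. The helper name `full_update_payload` is hypothetical, and dropping `id`, `createdBy`, `createdAt`, and `updatedAt` from the payload is an assumption based on those attributes appearing only in the response tables.

```python
import copy

# Fields that appear in responses but not in the payload table; treating them
# as read-only is an assumption.
RESPONSE_ONLY_FIELDS = ("id", "createdBy", "createdAt", "updatedAt")


def full_update_payload(current, changes):
    """Merge `changes` into a fetched classifier to form a complete PUT payload."""
    payload = {key: copy.deepcopy(value) for key, value in current.items()
               if key not in RESPONSE_ONLY_FIELDS}
    for key, value in changes.items():
        if key == "config":
            # Merge config keys so untouched settings are kept.
            payload.setdefault("config", {}).update(value)
        else:
            payload[key] = value
    return payload
```

The fetched classifier is deep-copied, so the original response object is left unchanged and can still be compared against the result.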

Response Parameters

Attribute	Description
createdBy	`object` Includes details about the user who created the classifier, such as their profile `id`, `name`, and `email`.
name	`string` Unique, request-friendly classifier name.
displayName	`string` Unique, human-readable classifier name.
description	`string` The classifier description.
type	`string` The type of classifier: `regex`, `dictionary`, `columnNameRegex`, or `builtIn`.
config	`object` May include `config.minConfidence`, `config.values`, `config.caseSensitive`, `config.regex`, `config.columnNameRegex`, and `config.tags`. *See descriptions below.
minConfidence*	`number` When the detection confidence is at least this percentage, tags are applied.
tags*	`array[string]` The name of the tags to apply to the data source.
columnNameRegex*	`string` A case-insensitive regular expression to optionally match against column names.
regex*	`string` A case-insensitive regular expression to match against column values.
values*	`array[string]` The list of words included in the dictionary.
caseSensitive*	`boolean` Indicates whether or not `values` are case sensitive. Defaults to `false`.
createdAt	`date` When the classifier was created.
updatedAt	`date` When the classifier was last updated.

Request example

The following request updates the name, display name, and description of the MY_REGEX_CLASSIFIER classifier.

curl \
    --request PUT \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/classifier/MY_REGEX_CLASSIFIER

Payload example

{
  "name": "REGULAR_EXPRESSIONS",
  "displayName": "Regular Expressions",
  "description": "This classifier uses regular expressions",
  "type": "regex",
  "config": {
    "regex": "^[A-Z][a-z]+",
    "minConfidence": 0.5,
    "tags": ["Discovered.regex-example"]
  }
}

Response example

{
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "name": "REGULAR_EXPRESSIONS",
  "displayName": "Regular Expressions",
  "description": "This classifier uses regular expressions",
  "type": "regex",
  "config": {
    "tags": [
      "Discovered.regex-example"
    ],
    "regex": "^[A-Z][a-z]+",
    "minConfidence": 0.5
  },
  "id": 67,
  "createdAt": "2021-10-14T18:48:56.289Z",
  "updatedAt": "2021-10-19T12:48:56.289Z"
}
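
As a rough Python counterpart to the curl call above, the following sketch builds the same PUT request with the standard library and stops short of sending it. The helper name `build_update_classifier_request` is hypothetical, and the base URL and token are the placeholder values from the example, so treat it as one possible way to assemble the request under those assumptions.

```python
import json
import urllib.parse
import urllib.request

# Placeholder values taken from the curl example; substitute your own.
BASE_URL = "https://your-immuta-url.immuta.com"
TOKEN = "dea464c07bd07300095caa8"


def build_update_classifier_request(classifier_name, payload):
    """Build (without sending) a PUT request for sdd/classifier/{classifierName}.

    Hypothetical helper for illustration only.
    """
    # The classifier's current name goes in the path; percent-encode it defensively.
    path = "sdd/classifier/" + urllib.parse.quote(classifier_name, safe="")
    return urllib.request.Request(
        url=f"{BASE_URL}/{path}",
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {TOKEN}",
        },
    )


# Same payload as the example above.
payload = {
    "name": "REGULAR_EXPRESSIONS",
    "displayName": "Regular Expressions",
    "description": "This classifier uses regular expressions",
    "type": "regex",
    "config": {
        "regex": "^[A-Z][a-z]+",
        "minConfidence": 0.5,
        "tags": ["Discovered.regex-example"],
    },
}

request = build_update_classifier_request("MY_REGEX_CLASSIFIER", payload)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

The path carries the classifier's current name while the payload carries the new one, which is why the two names differ in this example.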

Clone a template

Endpoint

Method	Path	Purpose
POST	`sdd/template/{templateName}/clone`	Clone a template.

Query Parameters

Attribute	Description	Required
templateName	`string` The name of the template to clone.	Yes

Payload Parameters

Attribute	Description	Required
name	`string` Unique, request-friendly template name for the cloned template.	Yes
displayName	`string` Unique, human-readable template name for the cloned template.	Yes
description	`string` The cloned template description.	No

Response Parameters

Attribute	Description
id	`integer` The unique ID of the template.
createdBy	`object` Includes details about the user who created the template, such as their profile `id`, `name`, and `email`.
name	`string` Unique, request-friendly template name.
displayName	`string` Unique, human-readable template name.
description	`string` The template description.
classifiers	`array` Includes details about the classifiers within the template, such as the `name` and `overrides`.
sampleSize	`integer` Optional override of how many records to sample from the data source.
createdAt	`date` When the template was created.
updatedAt	`date` When the template was last updated.

Request example

The following request clones the MY_FIRST_TEMPLATE template.

curl \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/template/MY_FIRST_TEMPLATE/clone

Payload example

{
  "name": "CLONE_OF_FIRST_TEMPLATE",
  "displayName": "Clone of My First Template",
  "description": "This is a clone of my first template."
}

Response example

{
  "name": "CLONE_OF_FIRST_TEMPLATE",
  "displayName": "Clone of My First Template",
  "description": "This is a clone of my first template.",
  "sampleSize": 100,
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "id": 4,
  "createdAt": "2021-10-19T16:21:17.660Z",
  "updatedAt": "2021-10-19T16:21:17.660Z",
  "classifiers": [
    {
      "name": "MY_COLUMN_NAME_REGEX_CLASSIFIER",
      "overrides": {}
    },
    {
      "name": "MY_REGEX_CLASSIFIER",
      "overrides": {}
    }
  ]
}
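
The clone call can be sketched in Python along the same lines. The helper `build_clone_template_request` is a hypothetical name, and the base URL and token are the placeholders from the curl example. Because the payload table marks `description` as optional, the sketch only includes it when a value is supplied; the request is built but not sent.

```python
import json
import urllib.parse
import urllib.request

# Placeholder values taken from the curl example; substitute your own.
BASE_URL = "https://your-immuta-url.immuta.com"
TOKEN = "dea464c07bd07300095caa8"


def build_clone_template_request(template_name, name, display_name, description=None):
    """Build (without sending) a POST request for sdd/template/{templateName}/clone.

    Hypothetical helper for illustration only.
    """
    payload = {"name": name, "displayName": display_name}
    if description is not None:
        # description is optional for this endpoint, so omit it when not given.
        payload["description"] = description
    source = urllib.parse.quote(template_name, safe="")
    return urllib.request.Request(
        url=f"{BASE_URL}/sdd/template/{source}/clone",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {TOKEN}",
        },
    )


request = build_clone_template_request(
    "MY_FIRST_TEMPLATE",
    name="CLONE_OF_FIRST_TEMPLATE",
    display_name="Clone of My First Template",
    description="This is a clone of my first template.",
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```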

Update a template

Endpoint

Method	Path	Purpose
PUT	`sdd/template/{templateName}`	Update a template.

Query Parameters

Attribute	Description	Required
templateName	`string` The name of the template to update.	Yes

Payload Parameters

Attribute	Description	Required
name	`string` Unique, request-friendly template name.	Yes
displayName	`string` Unique, human-readable template name.	Yes
description	`string` The template description.	Yes
classifiers	`array` Includes each classifier's `name` and `overrides` for `minConfidence` and `tags`.	Yes
sampleSize	`integer` Override for how many records to sample from the data source.	No
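
To illustrate how the `classifiers` array might carry per-template overrides, here is a minimal Python sketch. It assumes that overrides are nested under an `overrides` object with `minConfidence` and `tags` keys, which is inferred from the table description and the `overrides` field in the response examples rather than confirmed by a documented payload. The helper `classifier_entry`, the value `0.8`, and the tag `Discovered.health-example` are all hypothetical.

```python
import json


def classifier_entry(name, min_confidence=None, tags=None):
    """Return one item for the `classifiers` array (hypothetical helper).

    Only the overrides that are actually supplied are included.
    """
    entry = {"name": name}
    overrides = {}
    if min_confidence is not None:
        overrides["minConfidence"] = min_confidence
    if tags is not None:
        overrides["tags"] = list(tags)
    if overrides:
        # Assumed shape: overrides nested under an "overrides" object.
        entry["overrides"] = overrides
    return entry


payload = {
    "name": "HEALTH_DATA",
    "displayName": "Health Data",
    "description": "This template uses the column regex and regex classifiers.",
    "classifiers": [
        # No overrides: the classifier's own minConfidence and tags apply.
        classifier_entry("MY_COLUMN_NAME_REGEX_CLASSIFIER"),
        # Hypothetical overrides for this template only.
        classifier_entry(
            "REGULAR_EXPRESSION",
            min_confidence=0.8,
            tags=["Discovered.health-example"],
        ),
    ],
    "sampleSize": 100,
}

print(json.dumps(payload, indent=2))
```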

Response Parameters

Attribute	Description
id	`integer` The unique ID of the template.
createdBy	`object` Includes details about the user who created the template, such as their profile `id`, `name`, and `email`.
name	`string` Unique, request-friendly template name.
displayName	`string` Unique, human-readable template name.
description	`string` The template description.
classifiers	`array` Includes details about the classifiers within the template, such as the `name` and `overrides`.
sampleSize	`integer` Optional override of how many records to sample from the data source.
createdAt	`date` When the template was created.
updatedAt	`date` When the template was last updated.

Request example

The following request updates the name, description, and classifiers of the MY_FIRST_TEMPLATE template.

curl \
    --request PUT \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/template/MY_FIRST_TEMPLATE

Payload example

{
  "name": "HEALTH_DATA",
  "displayName": "Health Data",
  "description": "This template uses the column regex and regex classifiers.",
  "classifiers": [
    {
      "name": "MY_COLUMN_NAME_REGEX_CLASSIFIER"
    },
    {
      "name": "REGULAR_EXPRESSION"
    }
  ],
  "sampleSize": 100
}

Response example

{
  "name": "HEALTH_DATA",
  "displayName": "Health Data",
  "description": "This template uses the column regex and regex classifiers.",
  "sampleSize": 100,
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "id": 1,
  "createdAt": "2021-10-14T19:12:22.092Z",
  "updatedAt": "2021-10-20T19:12:22.092Z",
  "classifiers": [
    {
      "name": "MY_COLUMN_NAME_REGEX_CLASSIFIER",
      "overrides": {}
    },
    {
      "name": "REGULAR_EXPRESSION",
      "overrides": {}
    }
  ]
}
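
One way to confirm that an update took effect is to compare the response with the payload that was sent. The sketch below does this for the example above; `unapplied_fields` is a hypothetical helper, and comparing classifiers by name only (ignoring order and overrides) is a simplifying assumption.

```python
def unapplied_fields(payload, response):
    """Return the payload fields that the response does not echo back.

    Hypothetical helper: classifiers are compared by name only.
    """
    mismatched = [
        key
        for key in ("name", "displayName", "description", "sampleSize")
        if key in payload and response.get(key) != payload[key]
    ]
    requested = sorted(c["name"] for c in payload.get("classifiers", []))
    returned = sorted(c["name"] for c in response.get("classifiers", []))
    if requested != returned:
        mismatched.append("classifiers")
    return mismatched


# Payload and (abridged) response from the examples above.
payload = {
    "name": "HEALTH_DATA",
    "displayName": "Health Data",
    "description": "This template uses the column regex and regex classifiers.",
    "classifiers": [
        {"name": "MY_COLUMN_NAME_REGEX_CLASSIFIER"},
        {"name": "REGULAR_EXPRESSION"},
    ],
    "sampleSize": 100,
}
response = {
    "name": "HEALTH_DATA",
    "displayName": "Health Data",
    "description": "This template uses the column regex and regex classifiers.",
    "sampleSize": 100,
    "id": 1,
    "classifiers": [
        {"name": "MY_COLUMN_NAME_REGEX_CLASSIFIER", "overrides": {}},
        {"name": "REGULAR_EXPRESSION", "overrides": {}},
    ],
}

print(unapplied_fields(payload, response))
```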

Delete classifiers or templates

Method	Path	Purpose
DELETE	`sdd/classifier/{classifierName}`	Delete a classifier.
DELETE	`sdd/template/{templateName}`	Delete a template.

Delete a classifier

Endpoint

Method	Path	Purpose
DELETE	`sdd/classifier/{classifierName}`	Delete a classifier.

Query Parameters

Attribute	Description	Required
classifierName	`string` The name of the classifier to delete.	Yes

Response Parameters

Attribute	Description
createdBy	`object` Includes details about the user who created the classifier, such as their profile `id`, `name`, and `email`.
name	`string` Unique, request-friendly classifier name.
displayName	`string` Unique, human-readable classifier name.
description	`string` The classifier description.
type	`string` The type of classifier: `regex`, `dictionary`, `columnNameRegex`, or `builtIn`.
config	`object` May include `config.minConfidence`, `config.values`, `config.caseSensitive`, `config.regex`, `config.columnNameRegex`, and `config.tags`. *See descriptions below.
minConfidence*	`number` When the detection confidence is at least this percentage, tags are applied.
tags*	`array[string]` The names of the tags to apply to the data source.
columnNameRegex*	`string` A case-insensitive regular expression to optionally match against column names.
regex*	`string` A case-insensitive regular expression to match against column values.
values*	`array[string]` The list of words included in the dictionary.
caseSensitive*	`boolean` Indicates whether or not `values` are case sensitive. Defaults to `false`.
createdAt	`date` When the classifier was created.
updatedAt	`date` When the classifier was last updated.

Request example

The following request deletes the REGULAR_EXPRESSION classifier.

curl \
    --request DELETE \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    https://your-immuta-url.immuta.com/sdd/classifier/REGULAR_EXPRESSION

Response example

{
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "name": "REGULAR_EXPRESSION",
  "displayName": "Regular Expression",
  "description": "This classifier uses a regular expression",
  "type": "regex",
  "config": {
    "tags": [
      "Discovered.regex-example"
    ],
    "regex": "^[A-Z][a-z]+",
    "minConfidence": 0.5
  },
  "id": 67,
  "createdAt": "2021-10-19T15:54:28.695Z",
  "updatedAt": "2021-10-19T16:00:02.329Z"
}
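
The `createdAt` and `updatedAt` values in the response can be turned into timezone-aware datetimes for logging or auditing. This Python sketch assumes the timestamps always follow the UTC format seen in the examples (trailing `Z`, fractional seconds); the table only states the type as `date`, so the format string is an assumption, and `parse_timestamp` is a hypothetical helper.

```python
from datetime import datetime, timezone

# Assumed format, inferred from the example responses.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_timestamp(value):
    """Parse a timestamp such as 2021-10-19T15:54:28.695Z as UTC."""
    return datetime.strptime(value, TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)


# Fields taken from the delete response example above.
deleted = {
    "name": "REGULAR_EXPRESSION",
    "createdAt": "2021-10-19T15:54:28.695Z",
    "updatedAt": "2021-10-19T16:00:02.329Z",
}

created = parse_timestamp(deleted["createdAt"])
updated = parse_timestamp(deleted["updatedAt"])
print(f"{deleted['name']}: last updated {updated - created} after creation")
```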

Delete a template

Endpoint

Method	Path	Purpose
DELETE	`sdd/template/{templateName}`	Delete a template.

Query Parameters

Attribute	Description	Required
templateName	`string` The name of the template to delete.	Yes

Response Parameters

Attribute	Description
id	`integer` The unique ID of the template.
createdBy	`object` Includes details about the user who created the template, such as their profile `id`, `name`, and `email`.
name	`string` Unique, request-friendly template name.
displayName	`string` Unique, human-readable template name.
description	`string` The template description.
classifiers	`array` Includes details about the classifiers within the template, such as the `name` and `overrides`.
sampleSize	`integer` Optional override of how many records to sample from the data source.
createdAt	`date` When the template was created.
updatedAt	`date` When the template was last updated.

Request example

The following request deletes the HEALTH_DATA template.

curl \
    --request DELETE \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    https://your-immuta-url.immuta.com/sdd/template/HEALTH_DATA

Response example

{
  "name": "HEALTH_DATA",
  "displayName": "Health Data",
  "description": "This is a template for health data.",
  "sampleSize": 100,
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "id": 1,
  "createdAt": "2021-10-19T16:07:39.356Z",
  "updatedAt": "2021-10-19T16:07:39.356Z",
  "classifiers": [
    {
      "name": "MY_COLUMN_NAME_REGEX_CLASSIFIER",
      "overrides": {}
    }
  ]
}
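
Since the two delete endpoints differ only in their path, a single helper can cover both. The following Python sketch builds the DELETE request without sending it; `build_delete_request` and the `DELETE_PATHS` mapping are hypothetical names, and the base URL and token are the placeholders from the curl examples.

```python
import urllib.parse
import urllib.request

# Placeholder values taken from the curl examples; substitute your own.
BASE_URL = "https://your-immuta-url.immuta.com"
TOKEN = "dea464c07bd07300095caa8"

# Paths from the endpoint tables above.
DELETE_PATHS = {
    "classifier": "sdd/classifier/{name}",
    "template": "sdd/template/{name}",
}


def build_delete_request(kind, name):
    """Build (without sending) a DELETE request for a classifier or template.

    Hypothetical helper; raises KeyError for an unknown kind.
    """
    path = DELETE_PATHS[kind].format(name=urllib.parse.quote(name, safe=""))
    return urllib.request.Request(
        url=f"{BASE_URL}/{path}",
        method="DELETE",  # no payload is sent with a delete
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {TOKEN}",
        },
    )


request = build_delete_request("template", "HEALTH_DATA")
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

As the response examples show, each delete call returns the object that was removed, so the response body can be kept as a record of what was deleted.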