Azure Blob Storage Data Source API Reference Guide
The azureblob endpoint allows you to connect and manage Azure Blob Storage data sources in Immuta.
Note
Additional fields may be included in some responses you receive; however, these attributes are for internal purposes and are therefore undocumented.
Azure Blob workflow
Create a data source
Endpoint
Method | Path | Purpose |
---|---|---|
POST | /azureblob/handler | Save the provided connection for an Azure Blob Storage data source. |
Query Parameters
None.
Payload Parameters
Attribute | Description | Required |
---|---|---|
private | boolean When false, the data source will be publicly available in the Immuta UI. | Yes |
blobHandler | array[object] A list of full URLs providing the locations of all blob store handlers to use with this data source. | Yes |
blobHandlerType | string Describes the type of underlying blob handler that will be used with this data source (e.g., Azure Blob Storage). | Yes |
recordFormat | string The data format of blobs in the data source, such as json, xml, html, or jpeg. | Yes |
type | string The type of data source: ingested (metadata will exist in Immuta) or queryable (metadata is dynamically queried). | Yes |
name | string The name of the data source. It must be unique within the Immuta instance. | Yes |
sqlTableName | string A string that represents this data source's table in the Query Engine. | Yes |
organization | string The organization that owns the data source. | Yes |
category | string The category of the data source. | No |
description | string The description of the data source. | No |
owner | array[object] Users and groups that should be added as owners to this data source. Profiles must be a list of profile IDs and groups must be a list of group IDs: { "profiles": [3, 5], "groups": [4, 1999] }. | No |
expert | array[object] Users and groups that should be added as expert users to this data source. Profiles must be a list of profile IDs and groups must be a list of group IDs: { "profiles": [87, 199], "groups": [324] }. | No |
ingest | array[object] Users and groups that should be added as ingest users to this data source. Profiles must be a list of profile IDs and groups must be a list of group IDs: { "profiles": [34, 23], "groups": [32] }. | No |
hasExamples | boolean When true, the data source contains examples. | No |
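As a quick pre-flight check, the required attributes in the table above can be verified on the client before the request is sent. The following is a minimal sketch, not part of the Immuta API: `missing_required` and `REQUIRED_ATTRIBUTES` are hypothetical names, and the sketch assumes the required attributes sit at the top level of the `dataSource` object, as in the request payload example in this section. The server may apply defaults that make some of these fields optional in practice.

```python
# Attributes marked "Yes" in the Required column of the payload table.
REQUIRED_ATTRIBUTES = (
    "private",
    "blobHandler",
    "blobHandlerType",
    "recordFormat",
    "type",
    "name",
    "sqlTableName",
    "organization",
)


def missing_required(data_source: dict) -> list[str]:
    """Return the required attribute names absent from data_source (hypothetical helper)."""
    return [name for name in REQUIRED_ATTRIBUTES if name not in data_source]


# Illustrative values only; "organization" is left out on purpose.
data_source = {
    "private": True,
    "blobHandler": {"scheme": "https", "url": ""},
    "blobHandlerType": "Azure Blob Storage",
    "recordFormat": "json",
    "type": "ingested",
    "name": "dev",
    "sqlTableName": "dev",
}

print(missing_required(data_source))  # prints ['organization']
```

The check only looks for the presence of each key; it does not validate types or values.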
Response Parameters
Attribute | Description |
---|---|
id | integer The handler ID. |
dataSourceId | integer The ID of the data source. |
warnings | string This message describes issues with the created data source, such as the data source being unhealthy. |
connectionString | string The connection string used to connect the data source to Immuta. |
Request example
The following request saves the provided connection information (in example-payload.json) as a data source.
curl \
--request POST \
--header "Content-Type: application/json" \
--header "Authorization: Bearer dea464c07bd07300095caa8" \
--data @example-payload.json \
https://your-immuta-url.com/azureblob/handler
Request payload example
{
  "handler": {
    "metadata": {
      "tagAttributes": [],
      "eventTimeAttribute": "",
      "useDirectoryForTags": false,
      "sasToken": "?sv=your=sas?token",
      "sasTokenUrl": "https://your.blob.example.windows.net/sastoken-url",
      "container": "demodata"
    }
  },
  "dataSource": {
    "blobHandler": {
      "scheme": "https",
      "url": ""
    },
    "blobHandlerType": "Azure Blob Storage",
    "recordFormat": "",
    "type": "ingested",
    "name": "dev",
    "sqlTableName": "dev"
  }
}
Response example
{
  "id": 18,
  "dataSourceId": 18
}
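For clients that do not use curl, the same request can be assembled with Python's standard library. This is a sketch under stated assumptions: `build_create_request` and `parse_create_response` are hypothetical helper names, the base URL and token are the placeholders from the curl example, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_create_request(base_url: str, token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST /azureblob/handler request."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/azureblob/handler",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def parse_create_response(body: str) -> dict:
    """Pick out the documented response fields; warnings may be absent."""
    data = json.loads(body)
    return {
        "id": data["id"],
        "dataSourceId": data["dataSourceId"],
        "warnings": data.get("warnings"),
    }


# To send the request, pass it to urllib.request.urlopen(request)
# and hand the decoded response body to parse_create_response.
```

Reading `warnings` with `get` reflects the response example above, which contains only `id` and `dataSourceId`.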
Get information about a data source
Endpoint
Method | Path | Purpose |
---|---|---|
GET | /azureblob/handler/{handlerId} | Return the handler metadata associated with the provided handler ID. |
Query Parameters
Attribute | Description | Required |
---|---|---|
handlerId | integer The specific handler ID. | Yes |
skipCache | boolean If true, the handler cache will be skipped when retrieving the handler data. | No |
Response Parameters
Attribute | Description |
---|---|
dataSourceId | integer The data source ID. |
metadata | array Details regarding the handler, including container, accountName, sasTokenUrl, ingestUserId, tagAttributes, dataSourceName, refreshInterval, eventTimeAttribute, and useDirectoryForTags. |
Request example
The following request returns the handler metadata associated with the provided handler ID.
curl \
--request GET \
--header "Content-Type: application/json" \
--header "Authorization: Bearer dea464c07bd07300095caa8" \
https://your-immuta-url.com/azureblob/handler/67
Response example
{
  "dataSourceId": 427,
  "metadata": {
    "container": "integration",
    "accountName": "integration-tests",
    "sasTokenUrl": "https://your.blob.example.windows.net/",
    "ingestUserId": "azure blob storage_indexer_example",
    "tagAttributes": [],
    "dataSourceName": "Test",
    "refreshInterval": 0,
    "eventTimeAttribute": "",
    "useDirectoryForTags": false
  },
  "type": "azureBlobStorageHandler",
  "connectionString": "integration-tests/integration",
  "remoteTableDescription": null,
  "id": 427,
  "createdAt": "2021-09-22T18:45:47.744Z",
  "updatedAt": "2021-09-22T18:45:47.969Z"
}
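The handler ID appears in the request path, while skipCache is an optional flag. The sketch below shows one way to build the URL for this endpoint. It assumes that skipCache is passed in the URL query string with the lowercase value `true`, which this guide does not state explicitly; `handler_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def handler_url(base_url: str, handler_id: int, skip_cache: bool = False) -> str:
    """Build the GET /azureblob/handler/{handlerId} URL (hypothetical helper)."""
    url = f"{base_url.rstrip('/')}/azureblob/handler/{int(handler_id)}"
    if skip_cache:
        # Assumption: the flag is sent as ?skipCache=true
        url += "?" + urlencode({"skipCache": "true"})
    return url


print(handler_url("https://your-immuta-url.com", 67))
print(handler_url("https://your-immuta-url.com", 67, skip_cache=True))
```

Converting the ID with `int` rejects non-numeric values before a request is made.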
Manage data sources
Method | Path | Purpose |
---|---|---|
PUT | /azureblob/handler/{handlerId} | Update the provided information for an Azure Blob Storage data source. |
PUT | /azureblob/bulk | Update the handler metadata associated with the provided connection string. |
PUT | /azureblob/handler/{handlerId}/crawl | Re-crawl the data source and update the metadata. |
Update a specific data source
Endpoint
Method | Path | Purpose |
---|---|---|
PUT | /azureblob/handler/{handlerId} | Update the provided information for an Azure Blob Storage data source. |
Query Parameters
Attribute | Description | Required |
---|---|---|
handlerId | integer The specific handler ID. | Yes |
skipCache | boolean When true, the handler cache will be skipped when retrieving metadata. | No |
Response Parameters
Attribute | Description |
---|---|
id | integer The ID of the handler. |
dataSourceId | integer The data source ID. |
metadata | array Details regarding the updated information. |
Request example
The following request with the payload below updates the metadata for the data source with the handler ID 18.
curl \
--request PUT \
--header "Content-Type: application/json" \
--header "Authorization: Bearer dea464c07bd07300095caa8" \
--data @example-payload.json \
https://your-immuta-url.com/azureblob/handler/18
Payload example
{
  "dataSourceId": 18,
  "metadata": {
    "container": "testdata",
    "accountName": "integration-tests",
    "sasTokenUrl": "https://your.blob.example.windows.net/",
    "ingestUserId": "azure blob storage_indexer_example",
    "tagAttributes": [],
    "dataSourceName": "dev",
    "refreshInterval": 0,
    "eventTimeAttribute": "",
    "useDirectoryForTags": false
  },
  "type": "azureBlobStorageHandler",
  "connectionString": "your/testdata",
  "remoteTableDescription": null,
  "id": 18,
  "createdAt": "2021-09-23T18:47:52.976Z",
  "updatedAt": "2021-09-23T18:47:53.194Z"
}
Response example
{
  "id": 18,
  "dataSourceId": 18,
  "metadata": {
    "sasToken": "2:your?sastoken==",
    "container": "testdata",
    "accountName": "your-account-name",
    "sasTokenUrl": "2:your?sastokenurlTS",
    "ingestAPIKey": "996samplee89c1apia7ckey9",
    "ingestUserId": "azure blob storage_indexer_example",
    "tagAttributes": [],
    "dataSourceName": "dev",
    "refreshInterval": 0,
    "eventTimeAttribute": "",
    "useDirectoryForTags": false
  }
}
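The update payload example has the same shape as the response of GET /azureblob/handler/{handlerId}, so one possible workflow is to fetch the handler, change only the metadata keys of interest, and send the result back with PUT. The sketch below illustrates the middle step. `with_metadata` is a hypothetical helper, the handler dictionary is a shortened illustration, and whether the server requires or ignores fields such as createdAt is not stated in this guide.

```python
import copy


def with_metadata(handler: dict, **changes) -> dict:
    """Return a copy of a handler document with selected metadata keys replaced."""
    updated = copy.deepcopy(handler)
    updated.setdefault("metadata", {}).update(changes)
    return updated


# Shortened stand-in for a handler document returned by the GET endpoint.
current = {
    "dataSourceId": 18,
    "metadata": {"container": "demodata", "dataSourceName": "dev", "refreshInterval": 0},
    "type": "azureBlobStorageHandler",
    "id": 18,
}

payload = with_metadata(current, container="testdata")
print(payload["metadata"]["container"])  # prints testdata
```

Working on a deep copy keeps the fetched document unchanged, which makes it easy to compare the two versions before sending the update.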
Update multiple data sources
Endpoint
Method | Path | Purpose |
---|---|---|
PUT | /azureblob/bulk | Update the data source metadata associated with the provided connection string. |
Query Parameters
None.
Payload Parameters
Attribute | Description | Required |
---|---|---|
handler | metadata Includes metadata about the handler, such as ssl, port, database, hostname, username, and password. | Yes |
connectionString | string The connection string used to connect to the data sources. | Yes |
Response Parameters
Attribute | Description |
---|---|
bulkId | string The ID of the bulk data source update. |
connectionString | string The connection string shared by the data sources bulk updated. |
jobsCreated | integer The number of jobs that ran to update the data sources; this number corresponds to the number of data sources updated. |
Request example
The following request updates the autoIngest value to true for data sources with the connection string specified in the payload below.
curl \
--request PUT \
--header "Content-Type: application/json" \
--header "Authorization: Bearer dea464c07bd07300095caa8" \
--data @example-payload.json \
https://your-immuta-url.com/azureblob/bulk
Payload example
{
  "ids": [
    5, 6
  ],
  "connectionString": "integration-tests/integration",
  "handler": {
    "metadata": {
      "autoIngest": true
    }
  }
}
Response example
{
  "bulkId": "bulk_ds_update_dd2600809bf8418dbea2706d6f456636",
  "connectionString": "integration-tests/integration",
  "jobsCreated": 0
}
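The bulk payload and its response can be handled with a pair of small helpers. This is a sketch only: `build_bulk_payload` and `updated_count` are hypothetical names, and the `ids` key is treated as optional because it appears in the payload example but not in the payload parameters table.

```python
import json


def build_bulk_payload(connection_string: str, metadata: dict, ids=None) -> dict:
    """Assemble a PUT /azureblob/bulk payload (hypothetical helper)."""
    payload = {
        "connectionString": connection_string,
        "handler": {"metadata": dict(metadata)},
    }
    if ids is not None:
        # Assumption: "ids" limits the update to specific data sources.
        payload["ids"] = list(ids)
    return payload


def updated_count(response_body: str) -> int:
    """Read jobsCreated, which corresponds to the number of data sources updated."""
    return int(json.loads(response_body)["jobsCreated"])


payload = build_bulk_payload(
    "integration-tests/integration", {"autoIngest": True}, ids=[5, 6]
)
print(json.dumps(payload, indent=2))
```

A `jobsCreated` value of 0, as in the response example above, indicates that no update jobs ran for the request.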
Re-crawl the data source
Endpoint
Method | Path | Purpose |
---|---|---|
PUT | /azureblob/handler/{handlerId}/crawl | Re-crawl the data source and update the metadata. |
Query Parameters
Attribute | Description | Required |
---|---|---|
handlerId | integer The specific handler ID. | Yes |
Response Parameters
The response returns a string of characters that identify the job run.
Request example
The following request re-crawls the data source.
curl \
--request PUT \
--header "Content-Type: application/json" \
--header "Authorization: Bearer dea464c07bd07300095caa8" \
https://your-immuta-url.com/azureblob/handler/427/crawl
Response example
a4de5af0-1be1-11ec-8131-6fe77107bfa9
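A client may want to normalize the returned job identifier before logging or storing it. The sketch below assumes the identifier is UUID-formatted, as in the response example; the guide itself only describes it as a string of characters, so the validation step is an assumption and can be dropped. `parse_job_id` is a hypothetical helper name.

```python
import uuid


def parse_job_id(body: str) -> str:
    """Normalize the crawl response body into a job ID string."""
    # Remove surrounding whitespace and any JSON-style quotes.
    text = body.strip().strip('"')
    # Assumption: job IDs are UUIDs; raises ValueError otherwise.
    uuid.UUID(text)
    return text


print(parse_job_id("a4de5af0-1be1-11ec-8131-6fe77107bfa9\n"))
```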