Redshift Pre-Configuration Details
Audience: System Administrators
Content Summary: This page describes the Redshift integration, configuration options, and features.
For a tutorial to enable this integration, see the installation guide.
- For automated installations, the credentials provided must belong to a superuser or to a user who can create databases and users and modify grants.
- Redshift Serverless.
- Redshift Spectrum. For configuration and data source registration instructions, see the configuration page.
The Redshift integration supports the following authentication methods to install the integration and create data sources:
- Username and Password: Users can authenticate with their Redshift username and password.
- AWS Access Key: Users can authenticate with an AWS access key.
- Okta: Users can authenticate with their Okta credentials when installing the integration with the manual configuration.
Immuta cannot ingest tags from Redshift, but you can connect any supported external catalog to work with your integration.
Required Redshift Privileges
OWNERSHIP ON GROUP IMMUTA_IMPERSONATOR_ROLE
Immuta System Account:
GRANT EXECUTE ON PROCEDURE grant_impersonation
GRANT EXECUTE ON PROCEDURE revoke_impersonation
Impersonation allows users to query data as another Immuta user in Redshift. To enable user impersonation, see the User Impersonation page.
Users can enable multiple Redshift integrations with a single Immuta instance.
- The host of the data source must match the host of the native connection for the native view to be created.
- When using multiple Redshift integrations, each user must have the same user account on all hosts.
- Registering Redshift datashares as Immuta data sources is unsupported.
Python UDF Specific Limitations
For most policy types in Redshift, Immuta uses SQL clauses to implement enforcement logic; however, it uses Python UDFs in the Redshift integration to implement the following masking policies:
- Masking using a regular expression
- Reversible masking
- Format-preserving masking
- Randomized response
The number of Python UDFs that can run concurrently per Redshift cluster is limited to one-fourth of the total concurrency level for the cluster. For example, if the Redshift cluster is configured with a concurrency of 15, a maximum of three Python UDFs can run concurrently. After the limit is reached, Python UDFs are queued for execution within workload management queues.
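The one-fourth rule above can be sketched in a few lines of Python. This is only an illustration: the function name `max_concurrent_python_udfs` is hypothetical, and rounding down is an assumption inferred from the example in the text (a concurrency of 15 allows three UDFs).

```python
def max_concurrent_python_udfs(cluster_concurrency: int) -> int:
    """Estimate how many Python UDFs can run at once on a cluster.

    Assumption: the one-fourth limit is rounded down, which is
    consistent with the documented example (15 -> 3).
    """
    if cluster_concurrency < 0:
        raise ValueError("cluster_concurrency must be non-negative")
    # Integer division applies the one-fourth limit and rounds down.
    return cluster_concurrency // 4


# Print the estimated limit for a few illustrative concurrency levels.
for level in (15, 20, 50):
    print(f"concurrency {level}: up to {max_concurrent_python_udfs(level)} Python UDFs")
```

Running the estimate before applying several UDF-based masking policies gives a rough idea of whether queries are likely to queue.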
The SVL_QUERY_QUEUE_INFO view in Redshift, which is visible to a Redshift superuser, summarizes details for queries that spent time in a workload management (WLM) query queue. Queries must be completed in order to appear as results in this view.
If you find that queries on Immuta-built views are spending time in the WLM query queue, either edit your Redshift cluster configuration to increase concurrency or use fewer of the masking policies that rely on Python UDFs. For more information on increasing concurrency, see the Redshift documentation on implementing workload management.
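One way to check whether queries are waiting in a WLM queue is sketched below. This is not an Immuta-provided tool. The query text, the helper `queued_too_long`, the five-second threshold, and the sample rows are all hypothetical. The column names (`query`, `service_class`, `queue_elapsed`, `exec_elapsed`) are assumed from the SVL_QUERY_QUEUE_INFO view and should be verified against the Redshift documentation before use.

```python
# Hypothetical query a Redshift superuser might run against the view.
# Column names are assumptions; verify them in the Redshift docs.
QUEUE_INFO_SQL = """
SELECT query, service_class, queue_elapsed, exec_elapsed
FROM svl_query_queue_info
WHERE queue_elapsed > 0
ORDER BY queue_elapsed DESC
LIMIT 20;
"""


def queued_too_long(rows, threshold_seconds=5):
    """Return the rows whose queue wait exceeds the threshold.

    Each row is a dict keyed by column name; queue_elapsed is
    assumed to be expressed in seconds.
    """
    return [row for row in rows if row["queue_elapsed"] > threshold_seconds]


# Made-up rows standing in for results fetched with a database client.
sample_rows = [
    {"query": 101, "service_class": 6, "queue_elapsed": 0, "exec_elapsed": 12},
    {"query": 102, "service_class": 6, "queue_elapsed": 42, "exec_elapsed": 7},
    {"query": 103, "service_class": 6, "queue_elapsed": 3, "exec_elapsed": 9},
]

for row in queued_too_long(sample_rows):
    print(f"query {row['query']} waited {row['queue_elapsed']}s in the queue")
```

If long waits show up repeatedly for queries on Immuta-built views, that suggests the Python UDF concurrency limit is being reached.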