You are viewing documentation for Immuta version 2022.5.

Querying Databricks Data

Audience: Data Users

Content Summary: This page offers a tutorial on how to query data within the Databricks integration.

Prerequisites:

Query Data with Python

  1. Create a new workspace.
  2. Query the Immuta-protected data, which takes the form of database.table_name:
    1. Database: The database that houses the backing tables of your Immuta data sources.
    2. Table Name: The name of the table backing your Immuta data sources.
  3. Run your query. It should look something like this:

    df = spark.sql('select * from database.table_name')
    df.show()
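
When the database and table names come from variables rather than being typed into the query, the steps above can be sketched as follows. This is a minimal illustration, not part of the Immuta integration: the helper names `build_query` and `preview` and the optional `limit` parameter are hypothetical, and the identifier check is an assumption that only accepts plain names made of letters, digits, and underscores.

```python
import re

# Assumption: only plain SQL identifiers are accepted
# (letters, digits, underscores, not starting with a digit).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_query(database, table_name, limit=None):
    """Return a select statement for database.table_name, optionally limited."""
    for part in (database, table_name):
        if not _IDENTIFIER.fullmatch(part):
            raise ValueError(f"not a plain SQL identifier: {part!r}")
    query = f"select * from {database}.{table_name}"
    if limit is not None:
        # int() rejects anything that is not a whole number of rows.
        query += f" limit {int(limit)}"
    return query


def preview(spark, database, table_name, limit=10):
    """Run the query on an existing SparkSession and print the returned rows."""
    df = spark.sql(build_query(database, table_name, limit))
    df.show()
    return df
```

In a notebook cell, this could be called as `preview(spark, "database", "table_name")`, where `spark` is the same session object used in the query above.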
    

Query Data with SQL

  1. Create a new workspace.
  2. Query the Immuta-protected data, which takes the form of database.table_name:
    1. Database: The database that houses the backing tables of your Immuta data sources.
    2. Table Name: The name of the table backing your Immuta data sources.
  3. Run your query. It should look something like this:

    select * from database.table_name;
    

Query Data with SparkR

Establish the User's Identity

  1. Create a new workspace.
  2. Run:

    library(SparkR)
    

Run a Query

  1. In the same workspace, but in a different cell, query the Immuta-protected data, which takes the form of database.table_name:
    1. Database: The database that houses the backing tables of your Immuta data sources.
    2. Table Name: The name of the table backing your Immuta data sources.
  2. Run your query. It should look something like this:

    df <- SparkR::sql("select * from database.table_name")
    SparkR::head(df)
    

Query Data with Scala

  1. Query the Immuta-protected data, which takes the form of database.table_name:
    1. Database: The database that houses the backing tables of your Immuta data sources.
    2. Table Name: The name of the table backing your Immuta data sources.
  2. Run your query. It should look something like this:

    val sqlDF = spark.sql("select * from database.table_name")
    sqlDF.show()
    
