The Databricks integration is in Beta and still under development, so it may have limitations. To get early access, contact your Sana counterpart.
Introduction
The Sana Agents Databricks integration enables live, real-time querying of your Databricks data warehouse. Databricks data is not indexed. Instead, users connect directly via secure authentication, select which datasets or views to expose, and can query up-to-date data at any time.
Integration Capabilities
Live connection: Query Databricks data in real time—no pre-indexing required.
Customizable access: Select and describe which datasets or views are available for querying. Descriptions help improve query translation and accuracy.
Private integration: Permissions are managed via OAuth.
Task library: Create and share pre-defined queries (tasks) to standardize analytics workflows across your organization.
Type of Integration
Private: Set up with an OAuth service account and available for use by the authenticated user.
Availability
This integration is currently in development for the Enterprise tier.
Integration setup
Navigate to Integrations:
Go to the Integrations section in Sana Agents and select Databricks.
Authenticate with OAuth Service Account:
Follow the prompts to authenticate using your Databricks OAuth service account. This ensures secure, managed access without the need for manual credential sharing.
Define tables/views:
Enter a name and description for each table or view you want to connect.
Specify the relevant identifiers as they appear in Databricks.
Tip: Both the name and description are used by the agent to construct accurate SQL queries, so use clear, natural language.
Fetch and describe columns:
After defining a view, click “Fetch and describe columns.”
You’ll see all columns returned by the view.
For each column, you can (optionally, but recommended) add a description in natural language.
Tip: The more descriptive you are, the better the agent will be at constructing relevant queries.
Set Integration name, description, and access:
In the final step, set the name and description for the entire integration (also used by the agent for query construction).
Save configuration:
Finalize setup by saving your selections.
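When preparing for the setup steps above, it can help to draft the names, identifiers, and descriptions offline first. The following is a minimal Python sketch of such a draft. It is not part of Sana Agents or of any Databricks API: the `ViewSpec` class, its fields, and the example view `main.sales.orders_daily` are hypothetical, and the three-part `catalog.schema.table` identifier shape is an assumption that holds for Unity Catalog workspaces but may not match yours. The sketch flags views and columns that still lack a description.

```python
from dataclasses import dataclass, field


@dataclass
class ViewSpec:
    """Hypothetical draft of one table or view entry (not a Sana or Databricks type)."""
    name: str          # natural-language name shown to the agent
    identifier: str    # identifier as it appears in Databricks
    description: str = ""
    columns: dict[str, str] = field(default_factory=dict)  # column name -> description


def missing_descriptions(spec: ViewSpec) -> list[str]:
    """Return the items in a draft that still have an empty description."""
    missing = []
    if not spec.description.strip():
        missing.append(f"view:{spec.name}")
    for column, text in spec.columns.items():
        if not text.strip():
            missing.append(f"column:{column}")
    return missing


def has_three_part_identifier(spec: ViewSpec) -> bool:
    """Check for a catalog.schema.table shape.

    Assumption: a three-level namespace. This simple split does not
    handle backtick-quoted names that themselves contain dots.
    """
    parts = spec.identifier.split(".")
    return len(parts) == 3 and all(part.strip() for part in parts)


# Made-up example data, for illustration only.
orders = ViewSpec(
    name="Daily orders",
    identifier="main.sales.orders_daily",
    description="One row per order per day, including cancelled orders.",
    columns={
        "order_id": "Unique identifier of the order.",
        "net_amount": "Order value in EUR after discounts, excluding tax.",
        "status": "",  # still needs a description
    },
)

print(missing_descriptions(orders))       # ['column:status']
print(has_three_part_identifier(orders))  # True
```

Running a check like this before opening the setup flow makes it easier to enter complete, consistent descriptions in one pass.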
Editing integrations
To edit views, columns, or descriptions after setup:
Find the integration in the “Connected integrations” table.
Click on the integration, then the settings cog to edit.
Make your changes, then save.
Using the integration
Querying data
You can simply ask your question in chat. The agent automatically uses all the metadata—names and descriptions of integrations, views, and columns—that you’ve provided to construct accurate SQL queries.
If you have multiple data sources connected (such as Google Drive, meetings, or other databases), including keywords from the integration name, description, or specific view or column names in your question can help the agent focus and improve accuracy. This is especially useful if there are similarly named datasets across sources.
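To find out where disambiguating keywords are most likely needed, you could list the dataset names that appear in more than one connected source. The sketch below is a user-side helper under stated assumptions, not a description of how the agent resolves sources: the `shared_names` function and the inventory contents are hypothetical, and names are compared case-insensitively as a simplification.

```python
from collections import defaultdict


def shared_names(sources: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each dataset name that occurs in more than one source to those sources."""
    seen = defaultdict(set)
    for source, names in sources.items():
        for name in names:
            # Normalize so that "Customers" and "customers" count as the same name.
            seen[name.strip().lower()].add(source)
    return {name: sorted(found) for name, found in seen.items() if len(found) > 1}


# Made-up inventory, for illustration only.
inventory = {
    "Databricks": ["Customers", "Daily orders"],
    "Google Drive": ["customers", "Q3 planning"],
}

print(shared_names(inventory))  # {'customers': ['Databricks', 'Google Drive']}
```

For any name this reports, questions are less ambiguous when they also mention the integration name or a distinctive view or column name.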
Focusing on Databricks
In the chat interface, you can select “Databricks” in the sources section to restrict your query to Databricks data only. Use this when results must come exclusively from your Databricks integration.
Task library
For recurring analytics, you can create and share pre-defined queries (“tasks”) to streamline common workflows.
Data Handling and Privacy
Sana Agents is fully committed to data security and privacy. All data accessed by Sana Agents is encrypted both in transit and at rest. Sana does not train any underlying language models on your data, ensuring the privacy of your information. Sana Agents is ISO 27001 certified, SOC 2 and GDPR compliant, and adheres to the highest standards of data security.
For further information about Sana Agents or the Databricks integration, please contact [email protected] via email.