
Databricks

How to connect Databricks to Count.



Click "Connect a database"

Enter your connection details

You'll need:

  • Hostname and path of your Databricks cluster. Use this guide to obtain your hostname and path.

You may also need to ensure that your firewall accepts requests from the following IP addresses:

34.107.75.117
35.198.189.90
35.234.110.177
104.155.13.208

Authentication options

Access token

To connect using an access token, select this option from the Authentication options dropdown, enter the required values, and click Connect. Use this guide to obtain an access token.

OAuth

To connect using OAuth, you must first create a custom OAuth application in your Databricks account console:

  1. Give the connection a sensible name (e.g. Count-OAuth) and select "All APIs" for the Access scopes.

  2. Set the Redirect URL to https://count.co/connection-oauth. By default, the Refresh token TTL is set to 7 days, meaning that every 7 days Count's connection will need to be re-authorized (see below) to re-enable the connection to Databricks. Count recommends a value between 1 week (10080 minutes) and 90 days (129600 minutes) to avoid re-authorizing too often.

  3. Once the application is created, make a note of the Client ID and the Client secret, as these will be required to create the connection in Count.

Then create the connection in Count:

  4. Navigate to your Count workspace, click Connect a database, and choose Databricks.

  5. Select OAuth in the Authentication options dropdown, and enter your Host and the Client ID obtained in step 3.

  6. When you click Authorize, you will be redirected to your usual Databricks login flow (including any MFA required by your account).

  7. Once you have successfully authorized on the Databricks side, you will be automatically redirected back to the Connect to Databricks page in Count. You'll now be asked for the Client secret obtained in step 3, your Database name, and any other optional properties you wish to configure, including Refresh token validity (see below). The initial code retrieved from Databricks has a very short lifetime (1 minute), so make sure you complete this step within that time.

  8. Click Connect to complete the connection.

Set the value of Refresh token validity in your Count Databricks connection to 60 × the Refresh token TTL (i.e. the TTL expressed in seconds) to receive email reminders when re-authorization is due.
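For example (a worked illustration rather than something from this guide): a Refresh token TTL of 10080 minutes corresponds to a Refresh token validity of 10080 × 60 = 604800 seconds.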

If an error is encountered during step 7, you will need to re-authorize once again (steps 5 - 7).

Re-authorization

The refresh token obtained by Count from Databricks during step 7 enables Count to obtain a short-lived (60 minutes by default) session token from Databricks to, amongst other things, run queries against your database. The refresh token itself expires after the Refresh token TTL set above, so the connection will need to be re-authorized whenever it is due to expire.

Navigate to your Databricks connection settings page, and click Re-authorize to generate a new refresh token (steps 5 - 7 above).

OAuth Service Principal

To connect using an OAuth Service Principal:

  1. Create a Service Principal in the Databricks account console.

  2. Assign workspace level permissions to that Service Principal.

  3. Create an OAuth secret for that Service Principal, making a note of the Client ID and Client Secret.

  4. In Count, enter your Host, Path, Client ID, and Client Secret (and any other optional parameters), then click Connect.

Connect

Once you hit Connect, you will see your connection details, including a full list of the tables accessible with this connection.
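If tables you expect to see are missing, it can help to compare against what the warehouse itself reports. A minimal sketch you could run in the Databricks SQL editor (my_schema is a hypothetical name, and SHOW TABLES is standard Databricks SQL rather than anything Count-specific):

SHOW TABLES IN my_schema -- lists the tables visible to the authenticated principal in that schema

If a table appears here but not in Count, the difference may come down to the permissions granted to the credentials used in the connection.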

Troubleshooting

The most common issue when connecting to a Databricks cluster is that the warehouse is unresponsive because it has stopped due to inactivity. Count's connecting and querying logic will retry every 10 seconds for 3 minutes. If Count is still unable to connect to your cluster after this time, a 503 error will be returned. Ensure that your warehouse is running and available to accept connections from the Count IP addresses (see above) and retry the connection in Count. If you are still having trouble connecting, reach out to us to schedule a support session.
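If the warehouse has auto-stopped, one way to wake it before retrying in Count (a minimal sketch, not a Count feature; run it from the Databricks SQL editor) is to submit any cheap query and wait for it to return:

SELECT 1 AS warehouse_check -- submitting a query starts a stopped warehouse; once it returns, retry the connection in Count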

Limitations

BigInt

The BigInt data type in Databricks can accommodate integers in the range -2^63 to 2^63 - 1. Due to a limitation in the Databricks client, Count can accommodate a reduced range of -2^53 to 2^53 - 1 for this data type. Values outside of this reduced range will be displayed as null values.
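If you need to work with values outside that reduced range in Count, one possible workaround (our suggestion rather than something prescribed by this guide; my_table and big_id are hypothetical names) is to cast the column in your SQL cell so the exact 64-bit value survives the client as text:

SELECT CAST(big_id AS STRING) AS big_id_str -- the exact value is preserved as text
FROM my_table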

VALUES clause

Count uses a reduced parse to determine which cells are referenced in your SQL statements. Databricks' SELECT statement and querying syntax is more flexible than that of other databases. As a result, there is an unresolvable ambiguity in Count's reduced parse for statements of the form

SELECT * FROM VALUES 1, 2, 3

Running a SQL statement similar to this will result in an error such as "Unexpected number", "Unexpected literal", etc. The workaround for this issue is to parenthesise the row values in the VALUES clause, e.g.

SELECT * FROM VALUES (1), (2), (3)
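The same workaround extends to multi-column inline tables; a small sketch (the alias t(id, label) is just an illustrative name, not from this guide):

SELECT * FROM VALUES (1, 'a'), (2, 'b') AS t(id, label) -- each row wrapped in parentheses, matching the workaround above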
