Local DuckDB cells

Query data directly in your browser

In addition to running queries on your data warehouse, Count also ships with a local database built on DuckDB. If you choose to set a cell's data source to Local, the queries for that cell will be run in your browser.

The diagram below shows how the different cell types are related:

In summary:

  • Cells running on a data warehouse can query tables in that database, and other cells referencing the same database.

  • Local cells can query the results of any other cell, as well as any CSV files that have been uploaded to the canvas.

When to use local cells

Local cells are ideal for certain situations, including:

  • Joining multiple data sources - for example joining a CSV file to a database table, or joining results between databases.

  • Reducing load on your database - queries that run locally will not run on your database, and therefore not impact database query quotas or cost.

  • Performance - when performing an analysis or building a visual with many data points, it can be faster to first prototype locally then revert back to live data on your database for the full results.

Create a local cell

First, ensure that the Local database is selected in the Sources pane in the data bar:

Secondly, create a cell using the control bar or keyboard shortcut as usual - you can confirm that the cell is local by checking the color of the cell name, the name of the source in the cell footer, or from the information in the right-hand sidebar.

Querying other cells

With a local cell selected, the options in the Data Bar will update to indicate which cells are available for querying. To query a cell just refer to it by name in the FROM clause of your select statement as usual.

The syntax for DuckDB is designed to be compatible with Postgres syntax.

Under Tables you'll see:

  • CSV files - indicated with a different icon

  • Cells and visuals from your data warehouse - annotated with relevant color code

Under Cells you'll see:

  • All local cells

Once you have referred to a cell, you'll notice that the DAG connector line changes color to match the source cell. This is intended as a quick visual indicator that the referenced cell has potentially come from a different database.

Move a cell/visual to the local database

The cell controls in the right-hand sidebar show the database for the currently selected cell or visual. A cell can be made local by selecting Local from this menu. Note that the SQL syntax supported by DuckDB may be different from your data warehouse, so you may need to update the contents of the cell.

If you move a cell into the local database, any cells that reference it will return errors.

Query result limits

One concept to be aware of when using local cells is the risk of working with incomplete results. For example, suppose a canvas contained two cells:

  • Cell a - select price from transactions - this cell returns a 10,000 row preview of a result set that actually contains 100,000 rows.

  • Local cell b - select sum(price) from a - here the aggregation will not be correct, as the results from cell a are incomplete.

By default, Count applies a 10,000 row limit to all queries executed on your database. If the results of a query exceed this limit, a notification will be displayed in the table footer:

To increase this limit, you may click on the notification to attempt to download the full results set. An additional limit of 100MB for the total size of results also applies - if this limit is breached, then the notification becomes a warning:

To remove this warning, you can try:

  • Selecting fewer columns

  • Adding an explicit limit to the query

  • Performing an aggregation or filter so that fewer rows are returned

Any local cells that rely on incomplete results will also show a warning, even if those cells don't exceed any limits on their own:

If you have removed the default row limit for a cell and would like to re-apply it, select the option from the right-hand sidebar:

How does 'Copy compiled SQL' work for local cells?

When compiling SQL, the behaviour for local cells is:

  • All parent local cells will have their queries compiled into CTEs as usual.

  • All parent cells from data warehouses will be referred to as if they were database tables - their queries will not be included as CTEs.

All compiled SQL will be created in the DuckDB syntax.

What are the memory limitations of local cells?

Unlike queries on your data warehouse, Count does not limit the result size of local queries.

However, the total memory usage of the local database is limited to approximately 3.4GB. During query execution Count will load the results of the cells being queried into the database and perform the query. Therefore any query has to obey the memory cap that includes:

  • The results of the query

  • The results of any parent cell

  • Any working memory the local database needs to perform the query

If you encounter any out-of-memory errors, try:

  • Deleting cells that are no longer required for local queries

  • Performing more aggregation/filtering in data warehouse queries first before referencing those results in local cells

  • Avoid cross-joins and other constructs that produce a large number of rows in your local cells

You can check your memory usage by executing the following query in a DuckDB cell:

SELECT memory_usage, memory_limit from pragma_database_size()

Troubleshooting

Maximum call stack size exceeded

If you see this error, it's probably because you're using a version of the Safari browser that applies memory limitations to background tasks in web pages (see more technical detail here). DuckDB runs in the background because otherwise long-running queries would cause the application to freeze.

To fix this error, you can try:

  • Reducing the complexity of the query - for example, relying on fewer upstream cells or using fewer compound query features such as CTEs or UNIONs. This may help DuckDB to use less memory when executing your query.

  • Switching to a different browser - some other browsers have higher memory limits for background tasks, so may succeed in executing your query.

Last updated