Local DuckDB cells
Query data directly in your browser
Last updated
Query data directly in your browser
Last updated
In addition to running queries on your data warehouse, Count also ships with a local database built on DuckDB. If you choose to set a cell's data source to Local, the queries for that cell will be run in your browser.
The diagram below shows how the different cell types are related:
In summary:
Cells running on a data warehouse can query tables in that database, and other cells referencing the same database.
Local cells can query the results of any other cell, as well as any CSV files that have been uploaded to the canvas.
Local cells are ideal for certain situations, including:
Joining multiple data sources - for example joining a CSV file to a database table, or joining results between databases.
Reducing load on your database - queries that run locally will not run on your database, and therefore not impact database query quotas or cost.
Performance - when performing an analysis or building a visual with many data points, it can be faster to first prototype locally then revert back to live data on your database for the full results.
First, ensure that the Local database is selected in the Sources pane in the data bar:
Secondly, create a cell using the control bar or keyboard shortcut as usual - you can confirm that the cell is local by checking the color of the cell name, the name of the source in the cell footer, or from the information in the right-hand sidebar.
With a local cell selected, the options in the Data Bar will update to indicate which cells are available for querying. To query a cell just refer to it by name in the FROM
clause of your select statement as usual.
The syntax for DuckDB is designed to be compatible with Postgres syntax.
Under Tables you'll see:
CSV files - indicated with a different icon
Cells and visuals from your data warehouse - annotated with relevant color code
Under Cells you'll see:
All local cells
Once you have referred to a cell, you'll notice that the DAG connector line changes color to match the source cell. This is intended as a quick visual indicator that the referenced cell has potentially come from a different database.
The cell controls in the right-hand sidebar show the database for the currently selected cell or visual. A cell can be made local by selecting Local from this menu. Note that the SQL syntax supported by DuckDB may be different from your data warehouse, so you may need to update the contents of the cell.
If you move a cell into the local database, any cells that reference it will return errors.
One concept to be aware of when using local cells is the risk of working with incomplete results. For example, suppose a canvas contained two cells:
Cell a - select price from transactions
- this cell returns a 10,000 row preview of a result set that actually contains 100,000 rows.
Local cell b - select sum(price) from a
- here the aggregation will not be correct, as the results from cell a are incomplete.
By default, Count applies a 10,000 row limit to all queries executed on your database. If the results of a query exceed this limit, a notification will be displayed in the table footer:
To increase this limit, you may click on the notification to attempt to download the full results set. An additional limit of 100MB for the total size of results also applies - if this limit is breached, then the notification becomes a warning:
To remove this warning, you can try:
Selecting fewer columns
Adding an explicit limit to the query
Performing an aggregation or filter so that fewer rows are returned
Any local cells that rely on incomplete results will also show a warning, even if those cells don't exceed any limits on their own:
If you have removed the default row limit for a cell and would like to re-apply it, select the option from the right-hand sidebar:
When compiling SQL, the behaviour for local cells is:
All parent local cells will have their queries compiled into CTEs as usual.
All parent cells from data warehouses will be referred to as if they were database tables - their queries will not be included as CTEs.
All compiled SQL will be created in the DuckDB syntax.
Unlike queries on your data warehouse, Count does not limit the result size of local queries.
However, the total memory usage of the local database is limited to approximately 3.4GB. During query execution Count will load the results of the cells being queried into the database and perform the query. Therefore any query has to obey the memory cap that includes:
The results of the query
The results of any parent cell
Any working memory the local database needs to perform the query
If you encounter any out-of-memory errors, try:
Deleting cells that are no longer required for local queries
Performing more aggregation/filtering in data warehouse queries first before referencing those results in local cells
Avoid cross-joins and other constructs that produce a large number of rows in your local cells
You can check your memory usage by executing the following query in a DuckDB cell:
If you see this error, it's probably because you're using a version of the Safari browser that applies memory limitations to background tasks in web pages (see more technical detail here). DuckDB runs in the background because otherwise long-running queries would cause the application to freeze.
To fix this error, you can try:
Reducing the complexity of the query - for example, relying on fewer upstream cells or using fewer compound query features such as CTEs or UNIONs. This may help DuckDB to use less memory when executing your query.
Switching to a different browser - some other browsers have higher memory limits for background tasks, so may succeed in executing your query.