Troubleshooting

A guide to resolving issues with queries

SQL query results appear different between Count and another tool

In performing a consistency check between Count and another tool, you may try running the same SQL query in both tools. Infrequently you may find the results of both queries appear different. The inconsistency is almost always the result of a subtlety of the SQL language or a difference in connection details. Below is a list of things to check before contacting Count support.

Differences in client connection details

If the user credentials attached to the Count connection and the other tool are different, please check that:

  • Both point to the same database/scope; modern data stacks often reproduce production schemas in a staging environment, so you may be accidentally querying different tables.

  • Both user credentials have the same level permissions for the tables being queried; this is particularly important if your database supports row level permissions.

As Count supports querying of multiple database connections within the same canvas, please check that the cell refers to the same database instance being queried in the other tool.

Differences due to execution time of the query

The time at which the query is executed can lead to different results between Count and the other tool in two ways:

  • At high load times your query may fail (in either tool), leading to an error for one query but not the other. Please interpret any errors returned and try performing both queries at a similar time.

  • For transactional databases, tables may be appended to at any time. As a result, both query results may not exactly match. Consider creating a snapshot table within the database and performing consistency checks on that instead.

Differences due to result representation

Count currently doesn't support the representation of object and nested datatypes within the Count UI. Columns containing this type of data will appear as a JSON-stringified representation of the data.

Differences due to non-deterministic queries

A deterministic query returns the same results each time it's called. A non-deterministic query may return different results each time it's called. Some queries are obviously non-deterministic e.g. SELECT CURRENT_TIME, but others may be non-deterministic due to a subtlety of the implementation of the SQL language by your database. Please check the query executed, including any SQL used to generate views referenced in the query. Below is a list of common causes of non-deterministic queries.

Non-deterministic functions

  • Time/date related functions may return different values in successive calls, and if used as a filter in your query, a different number of rows for your query. These functions include (but not limited to):

    • CURRENT_TIME, CURRENT_DATE, CURRENT_DATETIME, CURRENT_TIMESTAMP, or equivalent.

  • Some functions by definition return different values in successive calls, or are dependent on configurations in your database environment. These functions include (but not limited to):

    • GENERATE_UUID, SESSION_USER, RAND, or equivalent.

  • Some window functions are non-deterministic if they are navigational in nature when used without an ORDER BY clause within the OVER() clause. Some databases error when these functions are used without an ORDER BY clause, but may not check that the chosen ordering is appropriate. These functions include (but not limited to):

    • FIRST_VALUE, LAG, LAST_VALUE, LEAD, NTH_VALUE, PERCENTILE_CONT, PERCENTILE_DIST, CUME_DIST, RANK, NTILE, or equivalent.

Non-determinism due to floating point accuracy

Not all number types in databases can be stored with exact precision. This can lead to non-determinism when:

  • The number exceeds the maximum supported value of that data type.

  • Calculations using that magnify any floating point round-off errors, e.g. by subtracting two very similar floating point numbers.

Please consult your database documentation to find the bounds and precision of the data types used.

Non-determinism due to ordering

The SQL standard does not order query results for SQL statements without an ORDER BY clause. This detail commonly manifests itself in two ways:

  1. Executing the same SELECT * or SELECT column_name, ... query multiple times against a table may return results in a different order. This is to be expected.

  2. All database queries in Count are by default limited to the first batch of 10,000 rows. Successive SELECT * or SELECT column_name, ... queries may return different batches of data. This is particularly important if these cells are referenced within local (DuckDB or Python) cells or visuals.

When a Count cell is only showing a subset of the query result data, a label is shown in the bottom-right of the cell (or upper-right of a visual). Read more about how to handle these cases here.

To work within this restriction and obtain the quantity of data you require, consider performing any ordering operations, aggregations and filters on the original database query before referencing in local cells.

Please do contact Count support if this limit is still restricting your analysis.

If you are having trouble interpreting the difference between two sets of query results, reach out to us to schedule a support session

Last updated