💡Tips for using dbt in Count

Introduction

The canvas is a great space to understand and iterate on your dbt models, but it's different to the code editor you're probably used to working in. This page provides some advice for getting the most out of your dbt connection in Count, and for making the canvas the best dbt IDE around.

View model lineage

The dbt metadata pane for a model lists the immediate upstream dependencies. Click on these dependencies to navigate upstream through the model DAG.

Another option is to add a model to the canvas with a selection of its upstream and downstream neighbours:

If your workspace is on an Enterprise plan, you may want to disable automatic cell execution before adding lots of models just to view their lineage.

Debug a model

When a model is not producing the expected results, or a test is failing, a canvas is a helpful space to debug that model with your team. If the model is complex and contains lots of CTEs, a good first step is to add the model to the canvas and explode it:

In the following example, the model was returning an error and causing a test to fail. After exploding the model and adding it to the canvas, it's clear that the error lies in one of the model's CTEs:

Once the error is fixed and verified, the full model code can be copied by exporting the Jinja-SQL, or even committed back to GitHub.
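
To make this concrete, here's a minimal sketch of the kind of model this works well for; the staging models stg_orders and stg_payments are hypothetical. When exploded, each CTE becomes its own cell, so its results can be inspected in isolation:

-- Hypothetical model: each CTE becomes a separate cell when exploded
with orders as (
  select * from {{ ref('stg_orders') }}
),
payments as (
  select * from {{ ref('stg_payments') }}
),
order_totals as (
  -- if the bug is in this CTE, its cell will surface the error directly
  select o.order_id, sum(p.amount) as total_amount
  from orders o
  left join payments p on p.order_id = o.order_id
  group by 1
)
select * from order_totals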

Test a model in different environments

While building a model, at some point you'll probably want to preview what your model will look like in your production environment.

The canvas is a great place to make these comparisons, and the setup is simple:

  1. Add connections to your production and staging environments

  2. Add dbt integrations to your connections

  3. In a new canvas, write your new model in a cell that uses your staging connection (a sketch follows this list)

  4. Duplicate the cell, and set the source to your production connection
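
As a rough sketch of step 3, the staging cell might contain something like the following (the model logic and the fct_orders model it references are hypothetical):

-- Hypothetical in-progress model, written in a cell that uses the staging connection
select
  customer_id,
  count(*) as order_count,
  sum(amount) as lifetime_value
from {{ ref('fct_orders') }}
group by 1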

You now have a live preview of your model running in both environments, where you can, for example:

  • Compare returned rows

  • Compute sense-check derived metrics like counts, totals and averages in other cells (see the sketch below)

  • Visualise column distributions
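
For instance, if the two duplicated cells are named staging_preview and production_preview (hypothetical names), a local DuckDB cell can put their row counts side by side:

-- Hypothetical cell names; a local cell can query other cells' results by name
select 'staging' as environment, count(*) as row_count from staging_preview
union all
select 'production' as environment, count(*) as row_count from production_preview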

Compare live outputs of models

You may already be using tools like DataDiff to compare the outputs of your models during your continuous integration jobs. This is good practice, and it's even more helpful to extend this diffing process to give live results while developing your models.

One way of achieving this in the canvas is to add a local cell that continuously checks your in-progress model results against a known-good result set.

Continuing with the example above, suppose you're trying to remove a hard-coded value from a filter. In the example below, a DuckDB EXCEPT operator finds any rows in one table that aren't present in another; as soon as that condition is breached, a big and obvious warning is displayed:
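
Here's a minimal sketch of such a check, assuming the in-progress model lives in a cell named new_model and the known-good results in a cell named known_good:

-- Compare the in-progress results against the known-good set (cell names are hypothetical)
with new_rows as (
  select * from new_model
  except
  select * from known_good
)
select if(
  count(*) > 0,
  '❌ Unexpected rows found, check that filter!',
  '✅ Output matches the known-good results'
) as status
from new_rows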

As the canvas is such a flexible space, you're free to hide this logic in a frame, add any additional formatting, and so on.

The following macro is an example of how you might want to generalise this approach:

{% macro except_diff(table_1, table_2, error, success) %}
  {# Returns one status row: the error message if table_1 has rows missing from table_2, otherwise the success message #}
  with new_rows as (select * from {{table_1}} except select * from {{table_2}})
  select if(count(*) > 0, '❌ {{error}}', '✅ {{success}}') as status
  from new_rows
{% endmacro %}
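
With that macro available in your dbt project, the check cell above reduces to a single call (cell names again hypothetical):

{{ except_diff('new_model', 'known_good', 'Unexpected rows found', 'Outputs match') }}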
