Creating datasets

Creating your first Count Metrics dataset.

Datasets can either be auto-generated as part of the 'generate view from table' workflow, or they can be created from scratch in the catalog YAML editor.

Creating a dataset

To create a dataset, open the catalog YAML editor by clicking Edit catalog.

Then click the + next to Dataset.

A new dataset starts as a blank template, and you'll need to define its views, joins, and attributes before it becomes functional. By default the dataset file is named 'Untitled'. To rename it, double-click the file in the Datasets file directory.

Schema


name (optional): The name of the dataset. This must be unique within the catalog. Defaults to the file name.

label (optional): A user-friendly label for the dataset. This is what is displayed in the UI, and defaults to the name.

description (optional): A description of the dataset.

from: The name of the base view of the dataset.

join (optional): A list of joins that are applied to the base view.

join[*].view: The name of the view that is being joined to the base view.

join[*].constraint: The condition that the join is made on. This is defined in the dialect detailed here.

join[*].relationship: The relationship between the base view and the joined view. This can be one of one_to_one, one_to_many, many_to_one, or many_to_many.

join[*].type (optional): The type of join. This can be one of inner, left, right, or full.
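
As a rough illustration of how these keys fit together, a hand-written dataset file might look like the sketch below. The view names orders and customers, the label, the description, and the column customer_id are hypothetical and chosen only for illustration; only the key names and the allowed values for relationship and type come from the schema.

```yaml
# Hypothetical hand-written dataset (view and column names are illustrative only)
name: orders                 # optional; must be unique within the catalog
label: Orders                # optional; shown in the UI
description: One row per order, with customer details joined on.  # optional
from: orders                 # base view of the dataset
join:
  - view: customers          # view joined to the base view
    constraint: orders.customer_id = customers.customer_id
    relationship: many_to_one   # many orders (base view) per customer (joined view)
    type: left               # optional; one of inner, left, right, full
```

The relationship here is read from the base view to the joined view, which matches how the schema describes it.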

Auto-generated Dataset YAML

If you generate views directly from tables, datasets are created for you automatically. Each auto-generated dataset YAML starts with a comment identifying it as auto-generated. By default, the name of the dataset is the same as the name of its base view. A user-friendly label is also generated; this is how the dataset appears to users in their projects.

# Auto-generated by Count
name: spotify_artists
label: Artists

A dataset can reference one or more views. The from attribute contains the base view of the dataset; in SQL terms, this is the first view in the FROM clause. Datasets containing only one view have only the from attribute.

from: spotify_artists

When multiple views share fields with the same name and datatype, they are combined into a single dataset with default joins on those fields. Otherwise, one dataset is created per view. Each additional view in the dataset is listed under join. For each join, view contains the name of the view to be joined, constraint contains the join condition, and relationship contains the cardinality between the base view and the joined view.

join:
  - view: spotify_genres
    constraint: spotify_genres.artist_id = spotify_artists.artist_id
    relationship: one_to_many
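
Putting the fragments from this section together, an auto-generated dataset file would look roughly like the sketch below. This assumes the keys appear in the file in the same order they are introduced here. Apart from the first line, the comments are added for explanation and are not part of the generated output.

```yaml
# Auto-generated by Count
name: spotify_artists        # defaults to the name of the base view
label: Artists               # generated label shown to users in their projects
from: spotify_artists        # base view; the first view in the SQL FROM clause
join:
  - view: spotify_genres     # additional view joined to the base view
    constraint: spotify_genres.artist_id = spotify_artists.artist_id
    relationship: one_to_many   # one artist (base view) to many genres (joined view)
    # No type is given here; the join type applied when type is omitted
    # is not stated in this section.
```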

The Schema section above lists the options you can use in the dataset YAML file.
