Skip to main content

File Structure

your-project/
├── config.yml
├── globals/                    # Optional: Global reusable definitions
│   └── semantics.yml
├── semantics.yml              # Optional: Global dimension mappings
└── semantics/
    ├── topics/
    │   └── *.topic.yml
    └── views/
        └── *.view.yml

Global Semantics

Global semantics allow reusable definitions in globals/semantics.yml:
entities:
  customer:
    name: customer
    type: primary
    description: Customer entity
    key: customer_id

dimensions:
  customer_id:
    name: customer_id
    type: number
    description: Unique customer identifier
    expr: customer_id

measures:
  total_orders:
    name: total_orders
    type: count
    description: Total number of orders

descriptions:
  order_status: "Current status of the order"

tables:
  orders: "dim_orders"
  customers: "dim_customers"
Inheritance Syntax:
# In semantic views
dimensions:
  - inherits_from: globals.semantics.dimensions.customer_id
    # Additional properties...

# In semantics.yml
dimensions:
  - name: customer_id
    inherits_from: globals.semantics.dimensions.customer_id
    targets:
      - database.schema.table.column
Template Variables:
table: "{{globals.semantics.tables.orders}}.csv"
description: "{{globals.semantics.descriptions.order_status}}"
See Global Semantics for detailed information.

View Definition

name: string                    # Required: Unique view identifier
description: string             # Required: What this view represents
datasource: string              # Required: Datasource name
table: string                   # Conditional: Table reference OR
sql: |                          # Conditional: Custom SQL query
  SELECT ...

entities: []                    # Required: List of entities
dimensions: []                  # Required: List of dimensions  
measures: []                    # Optional: List of measures

Entity Definition

- name: string                  # Required: Entity identifier
  type: primary|foreign         # Required: Entity type
  description: string           # Required: Entity description
  key: string                   # Required: Dimension to use as key
Types:
  • primary: Main entity in this view (one per view)
  • foreign: Reference to entity in another view

Dimension Definition

- name: string                  # Required: Dimension identifier
  type: string|number|date|     # Required: Data type
        datetime|boolean
  description: string           # Optional: Dimension description
  expr: string                  # Required: SQL expression
  samples: [string]             # Optional: Example values
  synonyms: [string]            # Optional: Alternative names
  primary_key: boolean          # Optional: Is primary key
Types:
  • string: Text and categorical data
  • number: Numeric values
  • date: Date values (YYYY-MM-DD)
  • datetime: Date and time values
  • boolean: True/false values

Measure Definition

- name: string                  # Required: Measure identifier
  type: count|sum|average|      # Required: Aggregation type
        min|max|count_distinct|
        median|custom
  description: string           # Optional: Measure description
  expr: string                  # Conditional: Expression for aggregation
  sql: string                   # Conditional: Custom SQL (for custom type)
  filters: []                   # Optional: Filters to apply
  samples: [number]             # Optional: Example values
  synonyms: [string]            # Optional: Alternative names
Types:
  • count: Count of records
  • sum: Sum of values
  • average: Arithmetic mean
  • min: Minimum value
  • max: Maximum value
  • count_distinct: Count unique values
  • median: Median value
  • custom: Custom SQL expression

Measure Filters

filters:
  - expr: "{{dimension_name}} operator value"
Examples:
filters:
  - expr: "{{total_amount}} >= 1000"
  - expr: "{{status}} = 'completed'"
  - expr: "{{customer.segment}} = 'Enterprise'"

Topic Definition

name: string                    # Required: Topic identifier
description: string             # Required: Business domain description
views: [string]                 # Required: List of view names
base_view: string               # Optional: Base view for query optimization
default_filters: []             # Optional: Default filters for all queries

Default Filters

Apply filters automatically to all queries in a topic:
default_filters:
  - field: string               # Field to filter (e.g., "status", "tenant_id")
    filter_type:                # Filter operator with value
      eq:                       # Operator: eq, neq, gt, gte, lt, lte, in, not_in, in_date_range, not_in_date_range
        value: any              # Value to filter by
Examples:
# Single equality filter
default_filters:
  - field: status
    filter_type:
      eq:
        value: "active"

# Multiple filters (AND logic)
default_filters:
  - field: tenant_id
    filter_type:
      eq:
        value: 1
  - field: deleted_at
    filter_type:
      eq:
        value: null

# Array filter
default_filters:
  - field: status
    filter_type:
      in:
        values: ["pending", "processing", "shipped"]

# Date range filter
default_filters:
  - field: order_date
    filter_type:
      in_date_range:
        from: "2024-01-01"
        to: "2024-12-31"

# Numeric comparison
default_filters:
  - field: total_amount
    filter_type:
      gte:
        value: 1000
Use Cases:
  • Multi-tenancy (filter by tenant/org)
  • Data governance (restrict to specific subsets)
  • Active records only (exclude soft-deleted)
  • Time-based defaults (recent data only)

Cross-View References

Reference dimensions or measures from related views using {{view.field}} syntax:
measures:
  - name: total_items
    type: sum
    expr: "{{order_items.quantity}}"

Common Patterns

Basic Table View

name: customers
description: "Customer master data"
datasource: "main"
table: "public.customers"

entities:
  - name: customer
    type: primary
    description: "Customer entity"
    key: customer_id

dimensions:
  - name: customer_id
    type: number
    expr: id
  - name: customer_name
    type: string
    expr: name

measures:
  - name: total_customers
    type: count

SQL-Based View

name: customer_metrics
description: "Aggregated customer metrics"
datasource: "analytics"
sql: |
  SELECT 
    customer_id,
    COUNT(order_id) as order_count,
    SUM(total_amount) as lifetime_value
  FROM orders
  GROUP BY customer_id

entities:
  - name: customer
    type: primary
    key: customer_id

dimensions:
  - name: customer_id
    type: number
    expr: customer_id
  - name: order_count
    type: number
    expr: order_count

measures:
  - name: avg_lifetime_value
    type: average
    expr: lifetime_value

View with Relationships

name: orders
description: "Order transactions"
datasource: "main"
table: "orders"

entities:
  - name: order
    type: primary
    key: order_id
  - name: customer
    type: foreign
    key: customer_id        # Links to customers.customer_id
  - name: product
    type: foreign
    key: product_id         # Links to products.product_id

# ... dimensions and measures

Filtered Measure

measures:
  - name: high_value_orders
    type: count
    description: "Orders over $1000"
    filters:
      - expr: "{{total_amount}} >= 1000"
      
  - name: enterprise_revenue
    type: sum
    expr: total_amount
    filters:
      - expr: "{{customer.segment}} = 'Enterprise'"

Dimension with Synonyms

dimensions:
  - name: order_status
    type: string
    expr: status
    samples: ["pending", "shipped", "delivered"]
    synonyms: ["status", "state", "order_state"]

Custom Calculation

measures:
  - name: revenue_per_customer
    type: custom
    description: "Average revenue per customer"
    sql: |
      SUM(total_amount) / NULLIF(COUNT(DISTINCT customer_id), 0)

Topic with Default Filters

name: tenant_sales
description: "Sales analytics scoped to specific tenant"
views:
  - orders
  - customers
  - products
default_filters:
  - field: orders.tenant_id
    filter_type:
      eq:
        value: 1
  - field: orders.status
    filter_type:
      in:
        values: ["completed", "shipped"]

Topic with Date Range Filter

name: recent_activity
description: "Activity from last 30 days"
views:
  - events
  - users
default_filters:
  - field: events.created_at
    filter_type:
      in_date_range:
        from: "{{ subtract_days(current_date(), 30) }}"
        to: "{{ current_date() }}"

Validation Checklist

  • Each view has a unique name
  • Each view has exactly one primary entity
  • All foreign entities reference primary entities in other views
  • All dimension expr values are valid SQL
  • Measure types match their expressions
  • Cross-view references use correct syntax: {{view.field}}
  • Topic view references exist
  • Datasource names match configuration
  • Default filter fields reference valid dimensions in topic views
  • Default filter operators match field data types

Common Errors

Missing Entity Key

# ❌ Wrong
entities:
  - name: customer
    type: primary

# ✅ Correct
entities:
  - name: customer
    type: primary
    key: customer_id

Invalid Cross-View Reference

# ❌ Wrong
expr: order_items.quantity

# ✅ Correct
expr: "{{order_items.quantity}}"

Missing Expression for Aggregation

# ❌ Wrong
measures:
  - name: total_revenue
    type: sum

# ✅ Correct
measures:
  - name: total_revenue
    type: sum
    expr: total_amount

Orphaned Foreign Entity

# ❌ Wrong - No corresponding primary entity
# orders.view.yml
entities:
  - name: supplier
    type: foreign
    key: supplier_id

# ✅ Correct - Define primary entity
# suppliers.view.yml
entities:
  - name: supplier
    type: primary
    key: supplier_id

File Naming Conventions

  • Views: <name>.view.yml (e.g., orders.view.yml)
  • Topics: <name>.topic.yml (e.g., sales.topic.yml)
  • Use lowercase with underscores for multi-word names
  • Keep names descriptive and business-friendly

See Also