Building the Semantic Layer

Once you’ve defined your semantic layer with views, entities, dimensions, and measures, you need to build it before use:
oxy build
This command validates and compiles your semantic layer definitions, making them available to agents, workflows, and routing agents. Always run oxy build after creating or modifying semantic layer files to ensure your changes are picked up.

In Agents

Add the semantic_query tool to your agent to enable it to query the semantic layer directly. The agent can then answer business questions using your defined metrics and dimensions.

Basic Setup

agents/analytics.agent.yml
model: "openai-4o-mini"
description: "Data analyst agent that can answer business questions"

system_instructions: |
  You are a data analyst expert. Your task is to help users answer 
  questions based on data using the semantic layer.
  
  Use the semantic_query tool to query data. The tool gives you access
  to pre-defined business metrics and dimensions.

tools:
  - name: semantic_query
    type: semantic_query
    topic: ecommerce_analytics  # The topic to query

Tool Configuration

The semantic_query tool has the following properties:
| Property | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Unique identifier for the tool |
| type | string | Yes | Must be semantic_query |
| topic | string | Yes | Name of the semantic topic to query |

When querying a topic with default_filters, those filters are automatically applied to all queries. User-provided filters are combined with default filters using AND logic. For example, if a topic has a default filter for tenant_id = 'abc123', every query will be scoped to that tenant regardless of additional filters specified by the user.

Example Queries

Once configured, your agent can handle queries like:
  • “What’s the total revenue by customer segment?”
  • “Show me the top 5 products by sales this month”
  • “What’s the average order value for each acquisition channel?”
The agent will use the semantic layer to understand available dimensions and measures, then construct appropriate queries.

Multiple Topics

You can add multiple semantic_query tools for different topics:
agents/multi_domain.agent.yml
model: "openai-4o-mini"
description: "Multi-domain analyst with access to various business areas"

tools:
  - name: sales_query
    type: semantic_query
    topic: sales

  - name: marketing_query
    type: semantic_query
    topic: marketing

  - name: finance_query
    type: semantic_query
    topic: finance

In Workflows

Use the semantic_query task type in workflows to execute structured queries against your semantic layer. This is ideal for automated reporting, data pipelines, and scheduled analytics.

Basic Workflow Task

workflows/sales_report.workflow.yml
name: weekly_sales_report
description: Generate weekly sales performance report

tasks:
  - name: sales_metrics
    type: semantic_query
    topic: ecommerce_analytics
    dimensions:
      - orders.order_status
      - customers.acquisition_channel
    measures:
      - orders.total_revenue
      - orders.total_orders
      - orders.avg_order_value
    orders:
      - field: orders.total_revenue
        direction: desc
    limit: 10

  - name: create_report
    type: agent
    agent_ref: agents/report_generator.agent.yml
    prompt: |
      Create a weekly sales report based on this data:
      {{ sales_metrics }}
      
      Include insights on:
      - Top performing channels
      - Order status breakdown
      - Key trends

Semantic Query Task Properties

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | Yes | Must be semantic_query |
| topic | string | Yes | Name of the semantic topic to query |
| dimensions | array | No | List of dimensions to include (view.field format) |
| measures | array | No | List of measures to calculate (view.field format) |
| filters | array | No | Filters to apply to the query |
| orders | array | No | Sort order for results |
| limit | number | No | Maximum number of rows to return |
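
The property table above can double as a pre-flight check on a task definition before it is committed. The following is a minimal sketch in Python, assuming the task has already been loaded into a dictionary; the function name and error messages are hypothetical and are not part of Oxy, and oxy build remains the authoritative validator.

```python
from typing import Any

# Property names and types mirror the table above.
# The checker itself is a hypothetical helper, not an Oxy API.
REQUIRED = {"type": str, "topic": str}
OPTIONAL = {
    "dimensions": list,
    "measures": list,
    "filters": list,
    "orders": list,
    "limit": (int, float),
}


def validate_semantic_query_task(task: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the task looks well-formed."""
    problems = []
    for key, expected in REQUIRED.items():
        if key not in task:
            problems.append(f"missing required property: {key}")
        elif not isinstance(task[key], expected):
            problems.append(f"{key} has the wrong type")
    for key, expected in OPTIONAL.items():
        if key in task and not isinstance(task[key], expected):
            problems.append(f"{key} has the wrong type")
    # The table states that type must be the literal string semantic_query.
    if task.get("type") not in (None, "semantic_query"):
        problems.append("type must be semantic_query")
    return problems
```

Keys that are not in the table, such as the name used in the workflow examples, are deliberately ignored here rather than rejected.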

Field Referencing

Reference dimensions and measures using the format view_name.field_name:
dimensions:
  - orders.order_date          # From orders view
  - customers.customer_name    # From customers view
  - products.product_category  # From products view

measures:
  - orders.total_revenue       # Sum measure from orders
  - orders.avg_order_value     # Average measure from orders
  - customers.total_customers  # Count measure from customers
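
To make the view_name.field_name convention concrete, here is a small Python sketch that splits a reference into its two parts and lists the views a query touches. FieldRef, parse_field_ref, and views_used are hypothetical names used only for illustration. Rejecting references with more than one dot is an assumption, since this page only shows the two-part form.

```python
from typing import NamedTuple


class FieldRef(NamedTuple):
    view: str
    field: str


def parse_field_ref(ref: str) -> FieldRef:
    """Split 'view_name.field_name' into its two parts."""
    view, sep, field = ref.partition(".")
    # Assumption: exactly one dot, with non-empty text on both sides.
    if not sep or not view or not field or "." in field:
        raise ValueError(f"expected view_name.field_name, got {ref!r}")
    return FieldRef(view, field)


def views_used(refs: list[str]) -> list[str]:
    """List the distinct views referenced, in first-seen order."""
    seen: list[str] = []
    for ref in refs:
        view = parse_field_ref(ref).view
        if view not in seen:
            seen.append(view)
    return seen
```

A check like this can catch a bare field name such as total_revenue, which has no view prefix, before the query is run.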

Filtering Data

Apply filters to narrow down your results. These filters are combined with any default_filters defined in the topic using AND logic:
- name: high_value_orders
  type: semantic_query
  topic: ecommerce_analytics
  dimensions:
    - orders.order_id
    - customers.customer_name
  measures:
    - orders.total_revenue
  filters:
    - field: orders.total_amount
      operator: ">="
      value: 1000
    - field: orders.order_status
      operator: "="
      value: "delivered"
  orders:
    - field: orders.total_revenue
      direction: desc
  limit: 20
If the topic ecommerce_analytics has default filters (e.g., tenant_id = 'xyz'), they are automatically applied along with the filters specified above. All default filters and user filters must be satisfied.
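
The AND combination described above can be illustrated with a minimal in-memory Python sketch. It is not how Oxy executes queries, which presumably happens in the underlying database. It assumes default filters share the field/operator/value shape of task filters, and the operator strings other than ">=" and "=" are assumptions. The helper names and the orders.tenant_id field in the usage note are hypothetical.

```python
import operator
from typing import Any

# ">=" and "=" appear in the example above; the remaining operators are assumed.
OPERATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def row_matches(row: dict[str, Any], flt: dict[str, Any]) -> bool:
    """Evaluate a single field/operator/value filter against one row."""
    compare = OPERATORS[flt["operator"]]
    return compare(row[flt["field"]], flt["value"])


def apply_filters(
    rows: list[dict[str, Any]],
    default_filters: list[dict[str, Any]],
    user_filters: list[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Keep only rows that satisfy every default filter AND every user filter."""
    combined = [*default_filters, *user_filters]
    return [row for row in rows if all(row_matches(row, f) for f in combined)]
```

With a default filter of orders.tenant_id = "xyz" and the two task filters shown above, a delivered order of 2000 that belongs to another tenant is still excluded, because user filters can narrow the default scope but never widen it.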

Ordering Results

Control the sort order of your results:
orders:
  - field: orders.total_revenue
    direction: desc           # Sort by revenue descending
  - field: customers.customer_name
    direction: asc            # Then by customer name ascending
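
The "first by revenue, then by name" behaviour above is an ordinary multi-key sort. As a rough illustration of the intended result, and not of Oxy's internals, the Python sketch below applies a list of field/direction entries to rows held in memory. sort_rows is a hypothetical helper.

```python
from typing import Any


def sort_rows(
    rows: list[dict[str, Any]], orders: list[dict[str, str]]
) -> list[dict[str, Any]]:
    """Sort rows by each entry in orders, with earlier entries taking priority."""
    result = list(rows)
    # Python's sort is stable, so sorting by the lowest-priority key first
    # and the highest-priority key last produces the multi-key ordering.
    for spec in reversed(orders):
        field = spec["field"]
        result.sort(key=lambda row: row[field], reverse=spec["direction"] == "desc")
    return result
```

Rows with equal revenue end up ordered by customer name, which is the role the second entry plays in the example above.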

Advanced Example

workflows/customer_analysis.workflow.yml
name: customer_segmentation_analysis
description: Analyze customer segments and buying patterns

variables:
  min_orders:
    type: number
    description: Minimum number of orders for analysis
    default: 5

tasks:
  - name: customer_segments
    type: semantic_query
    topic: ecommerce_analytics
    dimensions:
      - customers.acquisition_channel
      - customers.customer_name
    measures:
      - orders.total_orders
      - orders.total_revenue
      - orders.avg_order_value
    filters:
      - field: orders.total_orders
        operator: ">="
        value: "{{ min_orders }}"
    orders:
      - field: orders.total_revenue
        direction: desc
    limit: 50

  - name: channel_performance
    type: semantic_query
    topic: ecommerce_analytics
    dimensions:
      - customers.acquisition_channel
      - orders.order_status
    measures:
      - orders.total_revenue
      - orders.total_orders
    orders:
      - field: customers.acquisition_channel
        direction: asc

  - name: generate_insights
    type: agent
    agent_ref: agents/analyst.agent.yml
    prompt: |
      Analyze this customer data and provide strategic insights:
      
      Customer Segments:
      {{ customer_segments }}
      
      Channel Performance:
      {{ channel_performance }}
      
      Focus on:
      1. Which channels drive the most value?
      2. Customer retention patterns
      3. Recommendations for optimization

In Routing Agents

Routing agents can list semantic topics as routes alongside agents and workflows, so a question about a topic's metrics and dimensions can be routed directly to that topic.

Adding Topics to Routes

agents/_routing.agent.yml
model: "openai-4o-mini"
type: routing
description: "Main routing agent for data analysis"

routes:
  # Include specific topics
  - "semantics/topics/ecommerce_analytics.topic.yml"
  - "semantics/topics/sales.topic.yml"
  
  # Or use glob patterns to include all topics
  - "semantics/topics/*.topic.yml"
  
  # Mix with other route types
  - "agents/specialized_analyst.agent.yml"
  - "workflows/*.workflow.yml"

route_fallback: agents/default.agent.yml
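
To see what a route list like the one above might expand to, the Python sketch below matches each route against a list of project files using standard glob semantics, where * does not cross directory boundaries. This is only an illustration: how Oxy actually resolves glob patterns, and whether it deduplicates a topic that is listed both explicitly and through a glob, is not stated on this page. expand_routes and the file list in the usage note are hypothetical.

```python
from pathlib import PurePosixPath


def expand_routes(routes: list[str], project_files: list[str]) -> list[str]:
    """Resolve literal paths and glob patterns to a de-duplicated, ordered file list."""
    resolved: list[str] = []
    for pattern in routes:
        for path in project_files:
            # PurePosixPath.match compares path components from the right,
            # and "*" matches within a single component only.
            if PurePosixPath(path).match(pattern) and path not in resolved:
                resolved.append(path)
    return resolved
```

Under these assumptions, a file such as semantics/topics/archive/old.topic.yml would not be picked up by semantics/topics/*.topic.yml, because it sits one directory deeper than the pattern allows.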

Complete Routing Example

agents/_data_router.agent.yml
model: "openai-4o-mini"
type: routing
description: "Intelligent data analysis router"

system_instructions: |
  You are a routing agent for data analysis tasks.
  
  Route to semantic topics for:
  - Metrics and KPI queries
  - Standard business questions
  - Data exploration requests
  
  Route to specialized agents for:
  - Complex analysis requiring custom logic
  - Tasks needing multiple data sources
  
  Route to workflows for:
  - Automated reporting
  - Multi-step processes

routes:
  # Semantic layer topics
  - "semantics/topics/sales.topic.yml"
  - "semantics/topics/marketing.topic.yml"
  - "semantics/topics/finance.topic.yml"
  - "semantics/topics/operations.topic.yml"
  
  # Specialized agents
  - "agents/data_scientist.agent.yml"
  - "agents/report_generator.agent.yml"
  
  # Automated workflows
  - "workflows/daily_reports.workflow.yml"
  - "workflows/data_quality.workflow.yml"

route_fallback: agents/general_assistant.agent.yml

reasoning:
  effort: low

Best Practices

Agent Usage

  • Provide clear system instructions on when to use semantic queries
  • Include multiple topics for agents that span business domains
  • Let the agent decide which dimensions and measures to use based on the question

Workflow Usage

  • Use semantic queries for repeatable analysis patterns
  • Leverage variables for dynamic filtering
  • Chain semantic queries with agent tasks to turn query results into insights

Routing Agent Usage

  • Include semantic topics alongside specialized agents and workflows
  • Use glob patterns to automatically include all topics
  • Provide clear routing instructions in system_instructions
  • Configure appropriate fallback routes

Performance

  • Use limit to constrain result set size when appropriate
  • Be selective with dimensions: include only the ones you need
  • Consider adding indexes in the underlying database on frequently filtered columns
  • Use default_filters in topics for common business rules

Error Handling

  • Test your semantic queries before deploying to production
  • Handle cases where queries return no results
  • Provide meaningful error messages to users
  • Monitor query performance and optimize as needed