How to define workflows
A workflow is a series of steps (`tasks`) executed sequentially. Each task can be either a deterministic command (such as `execute_sql`, in which a named SQL query is executed) or an agent given a prompt. These tasks are composed by passing results from one task to the input of another: the output of each task can be accessed with Jinja as `{{ name_of_task }}`.
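
A minimal sketch of this composition is given here. The SQL file, database name, and agent name are hypothetical placeholders, and the exact path form for `sql_file` is an assumption:

```yaml
tasks:
  # Deterministic step: run a named SQL query.
  - name: top_customers
    type: execute_sql
    sql_file: top_customers.sql   # hypothetical file in the data directory
    database: local               # hypothetical database name

  # Agent step: the previous task's output is injected with Jinja.
  - name: summary
    type: agent
    agent_ref: analyst            # hypothetical agent name
    prompt: |
      Summarize the following query results:
      {{ top_customers }}
```

The second task refers to the first purely by its `name`, which is why task names need to be unique within a workflow.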
The steps of a workflow are listed under `tasks`. Each task has a few common properties:
Component | Description | Type |
---|---|---|
name | Identifier for the task. Output of the task can be referenced as `{{ name }}`. | required |
type | The tool to use for this task. See the following section for possible types. | required |
type: agent
Component | Description | Type |
---|---|---|
agent_ref | The agent to use within the agents directory, referenced by the agent’s `name`. | required for type: agent |
prompt | The input prompt passed to the agent for this task. | required for type: agent |
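
As an illustration, an agent task might look like the following. The task name, agent name, and prompt text are hypothetical:

```yaml
- name: explain_churn            # hypothetical task name
  type: agent
  agent_ref: analyst             # hypothetical agent defined in the agents directory
  prompt: "Explain in two sentences why customer churn might rise in winter."
```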
type: execute_sql
Component | Description | Type |
---|---|---|
sql_file | The SQL file within the data directory to execute. | required |
database | The name of the database to execute the query against | required |
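
A sketch of an `execute_sql` task follows. The file name and database name are hypothetical, and whether `sql_file` is written relative to the data directory or with a longer path is an assumption to check against your project:

```yaml
- name: monthly_revenue          # hypothetical task name
  type: execute_sql
  sql_file: monthly_revenue.sql  # hypothetical file in the data directory
  database: warehouse            # hypothetical database name
```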
type: formatter
A formatter task renders a `template` using the outputs of other tasks, then passes the rendered template as output.
Component | Description | Type |
---|---|---|
template | The template to be rendered and passed as output. | required |
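
For instance, a formatter task could combine the outputs of two earlier tasks into one report. The task names `monthly_revenue` and `explain_churn` are hypothetical and stand for tasks defined earlier in the same workflow:

```yaml
- name: report                   # hypothetical task name
  type: formatter
  template: |
    Revenue by month:
    {{ monthly_revenue }}

    Analyst commentary:
    {{ explain_churn }}
```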
type: loop_sequential
Component | Description | Type |
---|---|---|
values | Values to iterate over for each task in the current task’s tasks array. | required |
tasks | Defines the tasks to execute for each `value`. | required |
`values` are accessed within the `tasks` of the `loop_sequential` task as `<name>.value`, where `<name>` is the name of the task. A sample partial config is shown below:
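
A partial config along these lines could look as follows. The list of values, the inner task name, and the agent name are hypothetical:

```yaml
- name: loop_through_animals
  type: loop_sequential
  values: ["cat", "dog", "owl"]  # illustrative values
  tasks:
    - name: describe_animal      # hypothetical inner task name
      type: agent
      agent_ref: zoologist       # hypothetical agent name
      # The current value is read as <name>.value, where <name> is the loop task.
      prompt: "Describe this animal in one sentence: {{ loop_through_animals.value }}"
```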
Seeding `values` with query results
`values` can be seeded with the output from a previous `execute_sql` step, as follows:
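
A sketch of this pattern is given here. The SQL file, database, and agent names are hypothetical, and passing the query output as a quoted Jinja expression is an assumption about the exact syntax:

```yaml
tasks:
  - name: list_animals
    type: execute_sql
    sql_file: list_animals.sql   # hypothetical file in the data directory
    database: local              # hypothetical database name

  - name: loop_through_animals
    type: loop_sequential
    # Assumption: the earlier task's output is passed with the usual Jinja reference.
    values: "{{ list_animals }}"
    tasks:
      - name: describe_animal    # hypothetical inner task name
        type: agent
        agent_ref: zoologist     # hypothetical agent name
        prompt: "Describe this animal: {{ loop_through_animals.value }}"
```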
The results of a loop can be combined with a `type: formatter` task, which can loop through the resulting outputs and form them into a single string. The output from a `loop_sequential` task is an array of dictionaries, one for each `value`, where the keys of each dictionary are named according to each inner task’s `name` field. These can be accessed with Jinja by looping through the `{{ <loop_name> }}` variable (`{{ loop_through_animals }}` above).
An example of this behavior is shown below:
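
One way to write such a formatter is sketched here. It assumes a loop named `loop_through_animals` whose inner task is named `describe_animal` (a hypothetical name), so each array element is a dictionary with a `describe_animal` key:

```yaml
- name: animal_summary           # hypothetical task name
  type: formatter
  template: |
    {% for item in loop_through_animals %}
    - {{ item.describe_animal }}
    {% endfor %}
```

Each pass of the Jinja `for` loop reads one dictionary from the loop's output and emits the inner task's result as a bullet.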
Loop iterations can be run concurrently by adding a `concurrency` key, with the value specifying the number of concurrent threads to use.
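
A sketch with four threads is given here. Placing `concurrency` directly on the loop task is an assumption, and the values, task names, and agent name are hypothetical:

```yaml
- name: loop_through_animals
  type: loop_sequential
  concurrency: 4                 # assumption: set on the loop task itself
  values: ["cat", "dog", "owl", "fox"]
  tasks:
    - name: describe_animal      # hypothetical inner task name
      type: agent
      agent_ref: zoologist       # hypothetical agent name
      prompt: "Describe this animal: {{ loop_through_animals.value }}"
```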
type: workflow
Component | Description | Type |
---|---|---|
src | Path to the workflow yml file to execute. Relative to the root of the oxy directory. | required |
variables | Variables that are passed through, overriding the sub-workflow’s variables. | optional |
The `variables` key here allows for parameterization of these workflows by overriding the workflow-level variables. This can be particularly useful when embedding a workflow task into a loop, as follows:
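
A sketch of a sub-workflow run once per loop value follows. The workflow path and the variable name `animal` are hypothetical. The sub-workflow is assumed to declare an `animal` variable of its own, which is overridden here:

```yaml
- name: loop_through_animals
  type: loop_sequential
  values: ["cat", "dog", "owl"]  # illustrative values
  tasks:
    - name: animal_report        # hypothetical task name
      type: workflow
      src: workflows/animal_report.workflow.yml  # hypothetical path, relative to the oxy root
      variables:
        # Overrides the sub-workflow's own `animal` variable for this iteration.
        animal: "{{ loop_through_animals.value }}"
```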
Workflow-level variables are defined with the `variables` key. `variables` should be specified as key-value pairs, and these variables can be referenced within task fields by name using Jinja as follows:
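
A minimal sketch is given here. It assumes a variable is referenced directly by its name inside the Jinja expression, and the variable, task, and agent names are hypothetical:

```yaml
variables:
  animal: cat                    # hypothetical variable with a default value

tasks:
  - name: describe_animal        # hypothetical task name
    type: agent
    agent_ref: zoologist         # hypothetical agent name
    # Assumption: the variable is available under its own name.
    prompt: "Describe a {{ animal }} in one sentence."
```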
Oxy workflows are more constrained than general-purpose frameworks such as `langchain` or `llama_index`, but these constraints also dramatically reduce the complexity of the system. You can think of Oxy’s workflow paradigm as a domain-specific chain-builder for data workflows, where most (if not all) tasks simply pass results around between different agents.