Testing
How to write tests
Tests are a core part of oxy. Tests can be written as part of either agents or workflows.
At present, we support a single type of test, type: consistency, which measures the consistency between two results. Within agents, this can be implemented as follows:
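A minimal sketch of such a test definition, assuming tests are listed under a top-level tests key in the agent's YAML file. The tests key, the sample question, and the value of n are illustrative assumptions; only the field names type, task_description, and n are taken from this page.

```yaml
# Hypothetical excerpt from an agent definition file.
# Assumption: tests are declared as a list under a top-level `tests` key.
tests:
  - type: consistency   # the only test type currently supported
    # Illustrative question; replace with the request you want to evaluate.
    task_description: "How many users signed up last month?"
    n: 5                # number of times the agent is run on the question
```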
The task_description field is the question on which you want to test the LLM’s performance. We don’t call this field prompt because the task_description is nested within a separate prompt that runs the evaluation, so the name prompt would be ambiguous here. The n field indicates the number of times to run the agent to produce a response to the task_description request.
For workflows, task_description is not required, but instead a task_ref value should be provided, as shown below:
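A minimal sketch of a workflow test, again assuming a top-level tests key. The tests key and the task name monthly_report are hypothetical; only type and task_ref are taken from this page.

```yaml
# Hypothetical excerpt from a workflow definition file.
# Assumption: tests are declared as a list under a top-level `tests` key.
tests:
  - type: consistency
    # Hypothetical task name; it should match a task defined in this workflow.
    task_ref: monthly_report
```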
The task_ref field indicates the name of the task to be tested. No task_description is required because the prompt already given for that task will be used for evaluation.
To run these tests for an agent:
Or, for a workflow: