Evaluation

List test sets

get

List all test sets for a context within the user's workspace.

Requires Studio app access.

Authorizations
Authorization · string · Required

JWT access token for authentication. Swagger UI automatically adds 'Bearer ' prefix.

cookie
auth · string · Optional · Default: ""

Query parameters
context_id · string · Required

Context ID to filter by

page · integer · min: 1 · Optional

Page number

Default: 1

page_size · integer · min: 1 · max: 100 · Optional

Items per page

Default: 20

search · string or null · Optional

Search by name

sort_by · string or null · Optional

Sort by field (name, test_case_count, last_run_at, last_run_pass_rate)

sort_order · string or null · Optional

Sort order (asc, desc)

Default: desc

Header parameters
authorization · string or null · Optional · Default: ""
X-Internal-Service · string or null · Optional
Responses
200

Successful Response

application/json
get
/v1/evaluation/test-sets
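
As an illustration of how the query parameters above combine, the following sketch builds the request URL for this endpoint. The host in `BASE_URL` and the helper name `list_test_sets_url` are hypothetical, and the client-side range checks simply mirror the documented limits; the server's own validation may differ.

```python
from urllib.parse import urlencode

# Hypothetical host; substitute the real API base URL.
BASE_URL = "https://api.example.com"

# Sort fields listed in the sort_by description above.
SORT_FIELDS = {"name", "test_case_count", "last_run_at", "last_run_pass_rate"}


def list_test_sets_url(context_id, page=1, page_size=20, search=None,
                       sort_by=None, sort_order="desc"):
    """Build the GET URL for /v1/evaluation/test-sets."""
    if page < 1:
        raise ValueError("page must be >= 1")
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")
    if sort_by is not None and sort_by not in SORT_FIELDS:
        raise ValueError(f"unsupported sort_by: {sort_by}")
    if sort_order not in ("asc", "desc"):
        raise ValueError("sort_order must be 'asc' or 'desc'")
    params = {"context_id": context_id, "page": page,
              "page_size": page_size, "sort_order": sort_order}
    # Optional parameters are omitted rather than sent as empty strings.
    if search is not None:
        params["search"] = search
    if sort_by is not None:
        params["sort_by"] = sort_by
    return f"{BASE_URL}/v1/evaluation/test-sets?{urlencode(params)}"


url = list_test_sets_url("ctx_123", page=2, sort_by="name", sort_order="asc")
print(url)
```

The `Authorization` header still has to be attached when the request is actually sent.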

Create test set

post

Create a new test set.

Validates that context_id exists and belongs to the user's workspace.

Authorizations
Authorization · string · Required

JWT access token for authentication. Swagger UI automatically adds 'Bearer ' prefix.

cookie
auth · string · Optional · Default: ""

Header parameters
authorization · string or null · Optional · Default: ""
X-Internal-Service · string or null · Optional

Body

Request to create a new test set

context_id · string · Required

Context ID (prompt statement)

name · string · min: 1 · max: 255 · Required

Test set name

description · string or null · Optional

Test set description

mode · string · enum · Optional

Execution mode

Default: scenario
Possible values:

language · string · Optional

Language for evaluation prompts

Default: en
Responses
post
/v1/evaluation/test-sets
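
A minimal sketch of assembling this request with the Python standard library is shown below. It only constructs the request object and does not send it. `BASE_URL`, the token value, and the helper name are hypothetical. Because the reference notes that Swagger UI adds the 'Bearer ' prefix, the sketch assumes other clients must add it themselves.

```python
import json
from urllib.request import Request

# Hypothetical host; substitute the real API base URL.
BASE_URL = "https://api.example.com"


def create_test_set_request(token, context_id, name, description=None,
                            mode="scenario", language="en"):
    """Build (but do not send) a POST request that creates a test set."""
    if not 1 <= len(name) <= 255:
        raise ValueError("name must be 1-255 characters")
    body = {
        "context_id": context_id,
        "name": name,
        "description": description,  # serialized as null when omitted
        "mode": mode,                # documented default: scenario
        "language": language,        # documented default: en
    }
    return Request(
        f"{BASE_URL}/v1/evaluation/test-sets",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Assumption: the 'Bearer ' prefix is added by the client.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = create_test_set_request("TOKEN", "ctx_123", "Smoke tests")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it; the response schema is not shown in this reference, so parsing is left out.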

Get test set details

get

Get details of a specific test set.

Returns 404 if the test set is not found or the user doesn't have access.

Authorizations
Authorization · string · Required

JWT access token for authentication. Swagger UI automatically adds 'Bearer ' prefix.

cookie
auth · string · Optional · Default: ""

Path parameters
test_set_id · string · Required

Header parameters
authorization · string or null · Optional · Default: ""
X-Internal-Service · string or null · Optional
Responses
200

Successful Response

application/json
get
/v1/evaluation/test-sets/{test_set_id}

Update test set

put

Update an existing test set.

Returns 404 if the test set is not found or the user doesn't have access.

Authorizations
Authorization · string · Required

JWT access token for authentication. Swagger UI automatically adds 'Bearer ' prefix.

cookie
auth · string · Optional · Default: ""

Path parameters
test_set_id · string · Required

Header parameters
authorization · string or null · Optional · Default: ""
X-Internal-Service · string or null · Optional

Body

Request to update a test set

name · string (min: 1, max: 255) or null · Optional

Test set name

description · string or null · Optional

Test set description

mode · string (enum) or null · Optional

Execution mode

Test set execution modes

Possible values:

language · string or null · Optional

Language for evaluation prompts
Responses
200

Successful Response

application/json
put
/v1/evaluation/test-sets/{test_set_id}

Delete test set

delete

Delete a test set. The deletion cascades to the set's test cases and runs.

Returns 404 if the test set is not found or the user doesn't have access.

Authorizations
Authorization · string · Required

JWT access token for authentication. Swagger UI automatically adds 'Bearer ' prefix.

cookie
auth · string · Optional · Default: ""

Path parameters
test_set_id · string · Required

Header parameters
authorization · string or null · Optional · Default: ""
X-Internal-Service · string or null · Optional
Responses
delete
/v1/evaluation/test-sets/{test_set_id}

No content

List test cases

get

List all test cases for a test set within the user's workspace.

Results are sorted by the order column by default (ascending, so that scenario-mode cases appear in sequence). Returns 404 if the test set is not found or the user doesn't have access.

Filter format: {"test_method": ["exact_match", "similarity"], "question": "search term"}

Authorizations
Authorization · string · Required

JWT access token for authentication. Swagger UI automatically adds 'Bearer ' prefix.

cookie
auth · string · Optional · Default: ""

Path parameters
test_set_id · string · Required

Query parameters
page · integer · min: 1 · Optional

Page number

Default: 1

page_size · integer · min: 1 · max: 100 · Optional

Items per page

Default: 20

filters · string or null · Optional

JSON filters (test_method: array, question: string)

sort_by · string or null · Optional

Sort by field (question, expected_response, test_method, order)

sort_order · string or null · Optional

Sort order (asc, desc)

Default: asc

Header parameters
authorization · string or null · Optional · Default: ""
X-Internal-Service · string or null · Optional
Responses
200

Successful Response

application/json
get
/v1/evaluation/test-sets/{test_set_id}/cases
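
Since `filters` is a JSON document passed as a single query-string value, it has to be serialized and URL-encoded. The sketch below shows one way to do that; `BASE_URL` and the helper name are hypothetical, and the filter keys follow the format given in the description above.

```python
import json
from urllib.parse import parse_qs, quote, urlencode, urlsplit

# Hypothetical host; substitute the real API base URL.
BASE_URL = "https://api.example.com"


def list_test_cases_url(test_set_id, test_methods=None, question=None,
                        page=1, page_size=20, sort_by=None, sort_order="asc"):
    """Build the GET URL for listing the test cases of one test set."""
    filters = {}
    if test_methods:
        filters["test_method"] = list(test_methods)  # array of method names
    if question:
        filters["question"] = question               # search term
    params = {"page": page, "page_size": page_size, "sort_order": sort_order}
    if sort_by is not None:
        params["sort_by"] = sort_by
    if filters:
        # The JSON text becomes one query value; urlencode escapes it.
        params["filters"] = json.dumps(filters, separators=(",", ":"))
    path = f"/v1/evaluation/test-sets/{quote(test_set_id, safe='')}/cases"
    return f"{BASE_URL}{path}?{urlencode(params)}"


url = list_test_cases_url("ts_42",
                          test_methods=["exact_match", "similarity"],
                          question="refund")
# Decode the query string again to confirm the filter survives the round trip.
decoded = json.loads(parse_qs(urlsplit(url).query)["filters"][0])
print(decoded)
```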

Create test case

post

Create a new test case in a test set.

Automatically increments the test set's test_case_count.

Authorizations
Authorization · string · Required

JWT access token for authentication. Swagger UI automatically adds 'Bearer ' prefix.

cookie
auth · string · Optional · Default: ""

Path parameters
test_set_id · string · Required

Header parameters
authorization · string or null · Optional · Default: ""
X-Internal-Service · string or null · Optional

Body

Request to create a test case

question · string · min: 1 · Required

Test question/prompt

expected_response · string · min: 1 · Required

Expected response

test_method · string · enum · Optional

Evaluation method

Default: semantic_quality
Possible values:

passing_score · number · max: 1 · Optional

Passing threshold (0.0-1.0)

Default: 0.7

keywords · string[] or null · Optional

Keywords for keyword_match method

order · integer · Optional

Order in scenario mode

Default: 0

file_url · string or null · Optional

URL to attached file (image, document, etc.)

file_mime_type · string or null · Optional

MIME type of attached file
Responses
post
/v1/evaluation/test-sets/{test_set_id}/cases
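
The body fields above can be assembled as in the following sketch. The helper `build_test_case` is hypothetical, and its client-side checks only mirror the documented constraints (non-empty strings, a threshold between 0.0 and 1.0). The full list of `test_method` values is not reproduced in this reference, so the sketch does not validate it.

```python
import json


def build_test_case(question, expected_response,
                    test_method="semantic_quality", passing_score=0.7,
                    keywords=None, order=0, file_url=None,
                    file_mime_type=None):
    """Return a dict suitable as the JSON body for creating a test case."""
    if not question or not expected_response:
        raise ValueError("question and expected_response must be non-empty")
    if not 0.0 <= passing_score <= 1.0:
        raise ValueError("passing_score must be between 0.0 and 1.0")
    body = {
        "question": question,
        "expected_response": expected_response,
        "test_method": test_method,
        "passing_score": passing_score,
        "order": order,
    }
    # Nullable fields are left out when not provided.
    optional = {"keywords": keywords, "file_url": file_url,
                "file_mime_type": file_mime_type}
    body.update({k: v for k, v in optional.items() if v is not None})
    return body


case = build_test_case(
    "What is the refund window?",
    "30 days",
    test_method="keyword_match",
    keywords=["30", "days"],
    order=1,
)
print(json.dumps(case, indent=2))
```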

Update test case

put

Update an existing test case.

Returns 404 if the test case is not found or the user doesn't have access.

Authorizations
Authorization · string · Required

JWT access token for authentication. Swagger UI automatically adds 'Bearer ' prefix.

cookie
auth · string · Optional · Default: ""

Path parameters
test_set_id · string · Required
case_id · string · Required

Header parameters
authorization · string or null · Optional · Default: ""
X-Internal-Service · string or null · Optional

Body

Request to update a test case

question · string (min: 1) or null · Optional

Test question/prompt

expected_response · string (min: 1) or null · Optional

Expected response

test_method · string (enum) or null · Optional

Evaluation method

Methods for evaluating test case responses

Possible values:

passing_score · number (max: 1) or null · Optional

Passing threshold (0.0-1.0)

keywords · string[] or null · Optional

Keywords for keyword_match method

order · integer or null · Optional

Order in scenario mode

file_url · string or null · Optional

URL to attached file (image, document, etc.)

file_mime_type · string or null · Optional

MIME type of attached file
Responses
200

Successful Response

application/json
put
/v1/evaluation/test-sets/{test_set_id}/cases/{case_id}

Delete test case

delete

Delete a test case.

Automatically decrements the test set's test_case_count.

Authorizations
Authorization · string · Required

JWT access token for authentication. Swagger UI automatically adds 'Bearer ' prefix.

cookie
auth · string · Optional · Default: ""

Path parameters
test_set_id · string · Required
case_id · string · Required

Header parameters
authorization · string or null · Optional · Default: ""
X-Internal-Service · string or null · Optional
Responses
delete
/v1/evaluation/test-sets/{test_set_id}/cases/{case_id}

No content

Upload file for test case

post

Upload a file for evaluation test cases.

Returns the file URL and MIME type to be used when creating/updating test cases. Files are stored in S3 under the evaluation/ prefix.

Authorizations
Authorization · string · Required

JWT access token for authentication. Swagger UI automatically adds 'Bearer ' prefix.

cookie
auth · string · Optional · Default: ""

Header parameters
authorization · string or null · Optional · Default: ""
X-Internal-Service · string or null · Optional

Body
file · string · binary · Required
Responses
200

Successful Response

application/json
Response · any
post
/v1/evaluation/files
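
A binary `file` body field is usually sent as `multipart/form-data`. The sketch below encodes such a body with the standard library only. It assumes the form field is named `file` and that the endpoint accepts multipart uploads; neither is stated explicitly in this reference. The helper name is hypothetical.

```python
import mimetypes
import uuid


def encode_multipart_file(field, filename, content, content_type=None):
    """Encode one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    Note: filename is inserted as-is, so it must not contain double quotes.
    """
    boundary = uuid.uuid4().hex
    ctype = (content_type
             or mimetypes.guess_type(filename)[0]
             or "application/octet-stream")
    body = b"".join([
        f"--{boundary}\r\n".encode("ascii"),
        (f'Content-Disposition: form-data; name="{field}"; '
         f'filename="{filename}"\r\n').encode("utf-8"),
        f"Content-Type: {ctype}\r\n\r\n".encode("ascii"),
        content,
        f"\r\n--{boundary}--\r\n".encode("ascii"),
    ])
    return body, f"multipart/form-data; boundary={boundary}"


body, header = encode_multipart_file(
    "file", "screenshot.png", b"\x89PNG...", "image/png")
print(header)
print(len(body), "bytes")
```

The returned header value goes into the request's `Content-Type` header, next to the `Authorization` header.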

Bulk upload test cases from CSV

post

Bulk upload test cases from a CSV file.

CSV columns: question, expected_response, test_method (optional), passing_score (optional), keywords (optional), order (optional)

Returns the count of created cases and any parsing errors.

Authorizations
Authorization · string · Required

JWT access token for authentication. Swagger UI automatically adds 'Bearer ' prefix.

cookie
auth · string · Optional · Default: ""

Path parameters
test_set_id · string · Required

Header parameters
authorization · string or null · Optional · Default: ""
X-Internal-Service · string or null · Optional

Body
file · string · binary · Required
Responses
200

Successful Response

application/json
post
/v1/evaluation/test-sets/{test_set_id}/upload-csv
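
A CSV file with the columns described above could be generated as in this sketch. It assumes the file starts with a header row using the column names as written, and it leaves the `keywords` cell empty because this reference does not say how multiple keywords are encoded in one cell. The helper name and sample rows are illustrative only.

```python
import csv
import io

# Column names as listed in the CSV format description.
COLUMNS = ["question", "expected_response", "test_method",
           "passing_score", "keywords", "order"]


def build_cases_csv(cases):
    """Serialize a list of test-case dicts to CSV text."""
    buf = io.StringIO(newline="")
    # restval fills optional columns that a row leaves out.
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, restval="",
                            lineterminator="\n")
    writer.writeheader()
    for case in cases:
        if not case.get("question") or not case.get("expected_response"):
            raise ValueError("question and expected_response are required")
        writer.writerow(case)
    return buf.getvalue()


csv_text = build_cases_csv([
    {"question": "What is the refund window?",
     "expected_response": "30 days", "order": 1},
    {"question": "Do you ship abroad?",
     "expected_response": "Yes, to selected countries",
     "test_method": "exact_match", "passing_score": 1.0, "order": 2},
])
print(csv_text)
```

Values containing commas are quoted automatically by the `csv` module, which keeps the column count stable.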

Download CSV template

get

Download a CSV template file for bulk uploading test cases.

The template includes:

  • Required columns: question, expected_response

  • Optional columns: test_method, passing_score, keywords, order

Authorizations
Authorization · string · Required

JWT access token for authentication. Swagger UI automatically adds 'Bearer ' prefix.

cookie
auth · string · Optional · Default: ""

Header parameters
authorization · string or null · Optional · Default: ""
X-Internal-Service · string or null · Optional
Responses
200

Successful Response

application/json
Response · any
get
/v1/evaluation/template-csv

Start evaluation run

post

Start a new evaluation run.

Creates a run record and launches a background task to execute the tests. Returns the run, including an SSE URL for real-time progress updates.

Authorizations
Authorization · string · Required

JWT access token for authentication. Swagger UI automatically adds 'Bearer ' prefix.

cookie
auth · string · Optional · Default: ""

Header parameters
authorization · string or null · Optional · Default: ""
X-Internal-Service · string or null · Optional

Body

Request to start an evaluation run

test_set_id · string · Required

Test set to evaluate

context_id · string · Required

Context to evaluate against

Responses
post
/v1/evaluation/runs
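
The sketch below builds the start request and derives the follow-up URLs for a run (status, results, export) from the paths listed in this reference. `BASE_URL`, the helper names, and the sample identifiers are hypothetical. The name of the field that carries the run ID in the response is not shown here, so the run ID is passed in explicitly.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Hypothetical host; substitute the real API base URL.
BASE_URL = "https://api.example.com"


def start_run_request(token, test_set_id, context_id):
    """Build (but do not send) the POST request that starts a run."""
    body = {"test_set_id": test_set_id, "context_id": context_id}
    return Request(
        f"{BASE_URL}/v1/evaluation/runs",
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )


def run_urls(run_id):
    """Return the documented follow-up URLs for an existing run."""
    base = f"{BASE_URL}/v1/evaluation/runs/{quote(run_id, safe='')}"
    return {
        "status": base,                 # current status and progress
        "results": f"{base}/results",   # paginated test case results
        "export": f"{base}/export",     # CSV export
    }


req = start_run_request("TOKEN", "ts_42", "ctx_123")
print(req.get_method(), req.full_url)
print(run_urls("run_7")["results"])
```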

Get evaluation run status

get

Get current status and progress of an evaluation run.

Returns 404 if the run is not found or the user doesn't have access.

Authorizations
Authorization · string · Required

JWT access token for authentication. Swagger UI automatically adds 'Bearer ' prefix.

cookie
auth · string · Optional · Default: ""

Path parameters
run_id · string · Required

Header parameters
authorization · string or null · Optional · Default: ""
X-Internal-Service · string or null · Optional
Responses
200

Successful Response

application/json
get
/v1/evaluation/runs/{run_id}

Get evaluation results

get

Get all test case results for an evaluation run.

Returns an empty list if the run has no results yet.

Authorizations
Authorization · string · Required

JWT access token for authentication. Swagger UI automatically adds 'Bearer ' prefix.

cookie
auth · string · Optional · Default: ""

Path parameters
run_id · string · Required

Query parameters
page · integer · min: 1 · Optional

Page number

Default: 1

page_size · integer · min: 1 · max: 100 · Optional

Items per page

Default: 20

Header parameters
authorization · string or null · Optional · Default: ""
X-Internal-Service · string or null · Optional
Responses
200

Successful Response

application/json
get
/v1/evaluation/runs/{run_id}/results
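
Because results are paginated with `page` and `page_size`, a client typically loops until a short page comes back. The sketch below shows that loop with the HTTP call injected as a function. It assumes `fetch_page(page, page_size)` returns the list of items for one page; the actual response envelope is not shown in this reference and may need unwrapping first.

```python
def iter_results(fetch_page, page_size=20):
    """Yield every result by requesting pages 1, 2, ... in turn.

    fetch_page is a caller-supplied (hypothetical) function that performs
    the HTTP request and returns the items of one page as a list.
    """
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")
    page = 1
    while True:
        items = fetch_page(page, page_size)
        yield from items
        # A page shorter than page_size means there is nothing left.
        if len(items) < page_size:
            return
        page += 1


# Stand-in for the real HTTP call, serving 45 fake items.
FAKE_ITEMS = [{"index": i} for i in range(45)]


def fake_fetch(page, page_size):
    start = (page - 1) * page_size
    return FAKE_ITEMS[start:start + page_size]


all_items = list(iter_results(fake_fetch, page_size=20))
print(len(all_items))
```

When the total is an exact multiple of `page_size`, this loop makes one extra request that returns an empty page, which matches the documented behaviour of returning an empty list.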

Export evaluation results as CSV

get

Export evaluation results as a downloadable CSV file.

CSV includes all result fields: question, expected_response, actual_response, status, score, reasoning, execution_time_ms, error.

Authorizations
Authorization · string · Required

JWT access token for authentication. Swagger UI automatically adds 'Bearer ' prefix.

cookie
auth · string · Optional · Default: ""

Path parameters
run_id · string · Required

Header parameters
authorization · string or null · Optional · Default: ""
X-Internal-Service · string or null · Optional
Responses
200

Successful Response

application/json
Response · any
get
/v1/evaluation/runs/{run_id}/export

Get run history

get

Get recent evaluation runs for a test set.

Returns runs ordered by started_at in descending order (most recent first).

Authorizations
Authorization · string · Required

JWT access token for authentication. Swagger UI automatically adds 'Bearer ' prefix.

cookie
auth · string · Optional · Default: ""

Path parameters
test_set_id · string · Required

Query parameters
page · integer · min: 1 · Optional

Page number

Default: 1

page_size · integer · min: 1 · max: 100 · Optional

Items per page

Default: 20

Header parameters
authorization · string or null · Optional · Default: ""
X-Internal-Service · string or null · Optional
Responses
200

Successful Response

application/json
get
/v1/evaluation/test-sets/{test_set_id}/history

Compare two evaluation runs

get

Compare two evaluation runs from the same workspace.

Returns both runs with comparison metrics (pass rate diff, improvement/regression).

Authorizations
Authorization · string · Required

JWT access token for authentication. Swagger UI automatically adds 'Bearer ' prefix.

cookie
auth · string · Optional · Default: ""

Query parameters
run_a · string · Required

First run ID

run_b · string · Required

Second run ID

Header parameters
authorization · string or null · Optional · Default: ""
X-Internal-Service · string or null · Optional
Responses
200

Successful Response

application/json
get
/v1/evaluation/compare
