# Evaluations

An **evaluation** in Stratix is a discrete run that produces results. It's the unit of measurement the rest of the platform builds on.

## Anatomy

Every evaluation has (see the sketch after the list):

* **Model(s)** — what's being evaluated
* **Dataset / benchmark** — the inputs (and, for benchmarks, the expected outputs)
* **Scoring config** — the scorers and/or judges
* **Configuration metadata** — a name, description, tags, owner
* **A run history** — every time you (re-)run, a new result is captured
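
As a rough mental model, this anatomy maps onto a structure like the one below. This is an illustrative Python sketch only; the class and field names are assumptions, not the Stratix SDK or schema.

```python
from dataclasses import dataclass, field

# Illustrative sketch of an evaluation's anatomy.
# All names here are hypothetical -- this is not the Stratix SDK.
@dataclass
class Evaluation:
    models: list[str]       # what's being evaluated
    dataset: str            # dataset or benchmark identifier
    scoring_config: dict    # the scorers and/or judges to apply
    name: str = ""          # configuration metadata
    description: str = ""
    tags: list[str] = field(default_factory=list)
    owner: str = ""
    runs: list[dict] = field(default_factory=list)  # run history: one result per (re-)run
```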

## Public vs private

* **Public evaluations** — visible on `stratix.layerlens.ai` and citable; 2,000+ are contributed by LayerLens or partners
* **Private evaluations** — scoped to your org in `stratix.layerlens.ai`

Both flow through the same evaluation engine. The only difference is visibility and tenant scoping.

## What a result looks like

A result has (see the example after the list):

* **Top-line score(s)** — one per scoring dimension
* **Per-row results** — for each input row: the input, the output, the expected output (if any), and per-scorer/judge verdicts
* **Cost and latency** — per row and rolled up
* **Status** — running, completed, failed, partial
* **Run metadata** — model used, scoring config used, timestamp, owner
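
To make that shape concrete, here is what a result might look like as plain data. The field names below are illustrative assumptions, not the exact Stratix schema.

```python
# Hypothetical result shape -- field names are illustrative, not the exact schema.
result = {
    "scores": {"accuracy": 0.91},              # one top-line score per scoring dimension
    "rows": [
        {
            "input": "What is 2 + 2?",
            "output": "4",
            "expected": "4",                   # present when the dataset is a benchmark
            "verdicts": {"exact_match": True}, # per-scorer/judge verdicts
            "cost_usd": 0.0004,
            "latency_ms": 310,
        },
    ],
    "cost_usd": 0.0004,                        # rolled up across rows
    "latency_ms": 310,
    "status": "completed",                     # running | completed | failed | partial
    "metadata": {
        "model": "gpt-4o",
        "scoring_config": "default",
        "timestamp": "2025-01-01T00:00:00Z",
        "owner": "alice@example.com",
    },
}
```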

## Standard, comparison, trace, agentic

Four shapes of evaluation:

### Standard

One model, one dataset, one scoring config. The most common shape.

### Comparison (compare-models)

Two or more models, one dataset, one scoring config. Produces side-by-side scores.

### Trace evaluation

A trace set as the input, with a scoring config applied. See [Traces and spans](/6.-build-wire-your-code/traces-and-spans.md).

### Agentic evaluation

A trace evaluation where the criteria explicitly mix natural-language assertions, deterministic rules, and LLM judges. See [Agentic evaluation](/8.-evaluate-score-the-outputs/agentic-evaluation.md).

All four are stored as evaluations in the same surface and share the same result shape.
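
One way to see the differences is by what varies in the configuration. The sketch below is hypothetical (the keys are invented for illustration); it only shows that comparison adds models, trace swaps the dataset for a trace set, and agentic adds mixed criteria.

```python
# Hypothetical configs illustrating the four shapes (keys are invented for illustration).
standard   = {"models": ["model-a"],            "input": {"dataset": "my-benchmark"}}
comparison = {"models": ["model-a", "model-b"], "input": {"dataset": "my-benchmark"}}
trace      = {"input": {"trace_set": "prod-traces"}}
agentic    = {
    "input": {"trace_set": "prod-traces"},
    "criteria": [
        {"type": "assertion", "text": "The agent asks before taking destructive actions"},
        {"type": "rule", "check": "latency_ms < 2000"},                   # deterministic rule
        {"type": "judge", "prompt": "Rate task completion from 0 to 1"},  # LLM judge
    ],
}
```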

## Re-running

Re-running an evaluation creates a new run record under the same evaluation; score-over-time charts are built from this run history.
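
For instance, a score-over-time series can be read directly off the run history. A minimal sketch, assuming each run record carries a timestamp and its top-line scores (as in the result shape above):

```python
# Derive a score-over-time series from an evaluation's run history (illustrative).
runs = [
    {"timestamp": "2025-01-01T00:00:00Z", "scores": {"accuracy": 0.88}},
    {"timestamp": "2025-02-01T00:00:00Z", "scores": {"accuracy": 0.91}},
]

series = [(run["timestamp"], run["scores"]["accuracy"]) for run in runs]
# [("2025-01-01T00:00:00Z", 0.88), ("2025-02-01T00:00:00Z", 0.91)]
```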

## Where to next

* [Traces and spans](/6.-build-wire-your-code/traces-and-spans.md)
* [Judges](/8.-evaluate-score-the-outputs/judges-1.md)
* [Scorers](/8.-evaluate-score-the-outputs/scorers-1.md)
* [Evaluation spaces](/8.-evaluate-score-the-outputs/evaluation-spaces.md)
* [Stratix Premium — Evaluations](/8.-evaluate-score-the-outputs/evaluations.md)


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available on this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.layerlens.ai/8.-evaluate-score-the-outputs/evaluations-1.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question, along with relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present on the current page, when you need clarification or additional context, or when you want to retrieve related documentation sections.
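
For example, using Python's requests library (the question is a placeholder; `params` handles the URL encoding):

```python
import requests

# Query this page's documentation endpoint with a natural-language question.
url = "https://docs.layerlens.ai/8.-evaluate-score-the-outputs/evaluations-1.md"
question = "How do I add a judge to an existing evaluation?"  # example question

response = requests.get(url, params={"ask": question})  # encodes the query string
print(response.text)  # direct answer plus relevant excerpts and sources
```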
