# Evaluations

{% hint style="info" %}
**Available in Stratix Premium.** This surface is part of the logged-in workspace at [stratix.layerlens.ai](https://stratix.layerlens.ai). Stratix Public users can browse the catalog but cannot use this feature.
{% endhint %}

The Evaluations page is where you run, browse, and compare private evaluation runs. It's the most-used surface in Premium.

URL: [`stratix.layerlens.ai/dashboard/evaluations`](https://stratix.layerlens.ai/dashboard/evaluations)

## What you can do

* **Create a new evaluation** — pick a model, dataset, and scoring config
* **Browse past runs** — filter by model, benchmark, scorer, judge, date
* **Re-run** an evaluation with a modified configuration
* **Compare** two or more runs side-by-side
* **Drill into a single run** — per-row results, score distribution, latency, cost
* **Export** results to CSV/JSON for downstream analysis
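
Exported results are plain CSV/JSON, so downstream analysis needs no Stratix tooling. As a minimal sketch, the field names below (`row_id`, `score`, `latency_ms`) are hypothetical; check a real export for the actual schema:

```python
import json
from statistics import mean

# Hypothetical export shape: a list of per-row result objects.
# The real field names in a Stratix export may differ.
export = json.loads("""
[
  {"row_id": 1, "score": 0.9, "latency_ms": 410},
  {"row_id": 2, "score": 0.4, "latency_ms": 388},
  {"row_id": 3, "score": 1.0, "latency_ms": 502}
]
""")

avg_score = mean(r["score"] for r in export)
worst_rows = [r["row_id"] for r in export if r["score"] < 0.5]

print(f"mean score: {avg_score:.2f}")   # mean score: 0.77
print(f"rows below 0.5: {worst_rows}")  # rows below 0.5: [2]
```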

## Creating an evaluation

The new-evaluation flow has 5 steps:

1. **Pick model(s)** — one or more models, including your BYOK custom models
2. **Pick dataset / benchmark** — upload, select from public benchmarks, or pick your private dataset
3. **Pick scoring** — code graders, judges, or both
4. **Preview cost** — Stratix shows worst-case ECU consumption before you run
5. **Run** — queue and watch results stream in
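
The step-4 preview is a worst case: it assumes every model hits the maximum per-row cost. Stratix's actual ECU accounting is not documented on this page; the function below is only an illustrative upper-bound formula under that assumption:

```python
# Hypothetical back-of-envelope for the step-4 cost preview.
# Assumption: worst case = every model scores every row at the
# per-row ECU ceiling. Not Stratix's documented formula.
def worst_case_ecu(num_models: int, num_rows: int, ecu_per_row_ceiling: float) -> float:
    """Upper bound on ECU consumption for one evaluation run."""
    return num_models * num_rows * ecu_per_row_ceiling

# 2 models over a 500-row dataset at a ceiling of 0.04 ECU per row.
print(worst_case_ecu(2, 500, 0.04))  # 40.0
```

The actual run can only come in at or under this number, which is why the preview is safe to budget against.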

## Browsing past runs

Filters in the sidebar:

* Date range
* Model
* Benchmark / dataset
* Scorer / judge
* Status (queued, running, completed, failed)
* Tags

Each row shows the run name, model, benchmark, top-line score, status, ECU consumed, and a **Compare** button.

## Comparing runs

Select 2+ rows and click **Compare**. The comparison view shows:

* Side-by-side score tables
* Per-row deltas (where the runs used the same dataset)
* Cost and latency comparison
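
A per-row delta is just the score difference on rows both runs share. The sketch below illustrates the idea with hypothetical data; the comparison view computes this for you:

```python
# Illustrative only: join two runs on row id (meaningful when both
# ran the same dataset) and subtract scores. Data is made up.
run_a = {1: 0.9, 2: 0.4, 3: 1.0}   # row_id -> score, run A
run_b = {1: 0.7, 2: 0.6, 3: 1.0}   # row_id -> score, run B

shared = run_a.keys() & run_b.keys()
deltas = {row: round(run_b[row] - run_a[row], 2) for row in sorted(shared)}

print(deltas)  # {1: -0.2, 2: 0.2, 3: 0.0}
```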

## Run a comparison against the public catalog

The compare-models view in Premium is identical in layout to the one in Stratix Public — the difference is that you can compare your BYOK custom model against any public model.

## Where to next

* [First evaluation](/2.-get-started/first-evaluation.md)
* [Models](https://github.com/LayerLens/gitbook-full/blob/main/13-reference/cli/models.md)
* [Benchmarks](https://github.com/LayerLens/gitbook-full/blob/main/13-reference/cli/benchmarks.md)
* [Scorers](/8.-evaluate-score-the-outputs/scorers.md)
* [Judges](/8.-evaluate-score-the-outputs/judges.md)
* [Concept: Evaluations](/8.-evaluate-score-the-outputs/evaluations-1.md)


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.layerlens.ai/8.-evaluate-score-the-outputs/evaluations.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
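
The request above can be built with any HTTP client; the only required step is URL-encoding the question. A minimal sketch using the Python standard library (the base URL is this page's documented endpoint; the question text is an arbitrary example):

```python
from urllib.parse import urlencode

# Base URL is the documented ask endpoint for this page;
# the question is a sample, written as a self-contained query.
base = "https://docs.layerlens.ai/8.-evaluate-score-the-outputs/evaluations.md"
question = "How do I filter past evaluation runs by judge?"
url = f"{base}?{urlencode({'ask': question})}"

print(url)
# https://docs.layerlens.ai/8.-evaluate-score-the-outputs/evaluations.md?ask=How+do+I+filter+past+evaluation+runs+by+judge%3F
```

Issue a GET on the resulting URL (for example with `urllib.request.urlopen(url)`) to receive the answer and supporting excerpts.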
