# Mental model

Stratix has five primitives. Once you understand these, every other feature is a composition of them.

## The five primitives

### 1. Model

A specific LLM (e.g., GPT-5.3, Claude Opus 4.6, Gemini 3.1 Pro). Stratix's catalog has 200+ public models, plus your BYOK (bring-your-own-key) custom models.

### 2. Benchmark / dataset

A standardized input set + expected outputs. Public benchmarks (MMLU, HumanEval, GSM8K) live in the catalog; private datasets you upload live in your org.
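
Concretely, a benchmark is just rows pairing an input with its expected output. The field names below are illustrative, not Stratix's actual schema:

```python
# One benchmark row: an input paired with its expected output.
# Field names are illustrative, not Stratix's schema.
row = {"input": "What is 17 * 24?", "expected": "408"}
```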

### 3. Scorer

A code grader that takes (input, output, expected) and returns a verdict. Fast, cheap, exact. Use for objective dimensions.
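
Conceptually, a scorer is a pure function. A minimal sketch (the function name and verdict shape are hypothetical; Stratix's real scorer interface may differ):

```python
# Hypothetical sketch of a scorer -- deterministic, no LLM involved.
def exact_match_scorer(input: str, output: str, expected: str) -> dict:
    """Compare the model's output to the expected answer; return a verdict."""
    passed = output.strip() == expected.strip()
    return {"verdict": "pass" if passed else "fail", "score": float(passed)}
```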

### 4. Judge

An LLM that takes (input, output) and a rubric and returns a verdict. Slow, expensive, fuzzy. Use for subjective dimensions, and GEPA-optimize the judge against labeled examples.
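
Conceptually, a judge wraps an LLM call around a rubric. A sketch, not Stratix's actual judge API; `call_llm` stands in for any function that takes a prompt and returns text:

```python
# Hypothetical sketch of a judge -- an LLM grades against a rubric.
RUBRIC = """Rate the response for helpfulness on a 1-5 scale.
Reply with a single integer."""

def helpfulness_judge(input: str, output: str, call_llm) -> dict:
    """Send (input, output) plus the rubric to an LLM; parse its verdict."""
    prompt = f"{RUBRIC}\n\nUser input:\n{input}\n\nModel output:\n{output}"
    raw = call_llm(prompt)
    score = int(raw.strip())  # fuzzy: real code must handle non-compliant replies
    return {"verdict": "pass" if score >= 4 else "fail", "score": score / 5.0}
```

Injecting `call_llm` keeps the sketch model-agnostic; in practice the judge's model and rubric are part of its saved configuration.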

### 5. Trace

A record of an AI call (or a chain of calls): inputs, outputs, every span, latencies, costs, errors. The unit of "what did my AI actually do."
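
As a rough picture (field names are illustrative, not Stratix's trace schema), a trace looks like:

```python
# Hypothetical trace record -- one AI call with two spans.
trace = {
    "trace_id": "tr_123",
    "input": "Summarize this ticket...",
    "output": "The customer reports...",
    "spans": [
        {"name": "retrieval", "latency_ms": 120, "cost_usd": 0.0},
        {"name": "llm_call", "latency_ms": 850, "cost_usd": 0.0021},
    ],
    "error": None,
}
```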

## How they compose

A **standard evaluation** is `(model, benchmark, scorers + judges)` → results.

A **trace evaluation** is `(trace_set, scorers + judges)` → results.

An **agentic evaluation** is a trace evaluation where the criteria explicitly mix natural-language assertions, deterministic rules, and judges.

A **comparison** is N evaluations (each on the same benchmark, different models) viewed side-by-side.

A **space** is a saved evaluation configuration you re-run as conditions change.
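
To make the first two compositions concrete, here is a minimal sketch in pseudocode-style Python. None of these names are Stratix's actual API; models are treated as callables, benchmarks and trace sets as lists of dicts, and scorers/judges as the functions sketched above:

```python
# Hypothetical sketches of the two evaluation shapes.

def run_standard_evaluation(model, benchmark, scorers, judges):
    """(model, benchmark, scorers + judges) -> results."""
    results = []
    for row in benchmark:  # each row: {"input": ..., "expected": ...}
        output = model(row["input"])  # the real engine also emits a trace here
        verdicts = [s(row["input"], output, row["expected"]) for s in scorers]
        verdicts += [j(row["input"], output) for j in judges]
        results.append({"input": row["input"], "verdicts": verdicts})
    return results

def run_trace_evaluation(trace_set, scorers, judges):
    """(trace_set, scorers + judges) -> results."""
    results = []
    for t in trace_set:
        verdicts = [j(t["input"], t["output"]) for j in judges]
        if "expected" in t:  # live traces often have no reference answer
            verdicts += [s(t["input"], t["output"], t["expected"]) for s in scorers]
        results.append({"trace_id": t["trace_id"], "verdicts": verdicts})
    return results
```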

## The data plane

```
[Model] + [Benchmark/Dataset] → Evaluation engine → Scorers + Judges → Results
                                        ↓
                              (also outputs traces)

[Trace ingest] → [Trace store] → Trace evaluation engine → Scorers + Judges → Results
```

Both flows produce the same shape of result row: per-input verdicts that aggregate into top-line scores.
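
For example, here is one illustrative way per-input verdicts roll up into a top-line score (the real engine's weighting may differ):

```python
# Illustrative aggregation: fraction of passing verdicts across all inputs.
rows = [
    {"input": "q1", "verdicts": [{"verdict": "pass"}, {"verdict": "pass"}]},
    {"input": "q2", "verdicts": [{"verdict": "pass"}, {"verdict": "fail"}]},
]

def top_line_score(rows):
    verdicts = [v for row in rows for v in row["verdicts"]]
    return sum(v["verdict"] == "pass" for v in verdicts) / len(verdicts)

print(top_line_score(rows))  # 0.75
```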

## Where everything lives

* **Catalog** stores models, benchmarks, public evaluations, and public spaces (global, visible to everyone)
* **Org-scoped storage** stores private models, private benchmarks, private evaluations, judges, scorers, traces, and spaces (one tenant per org)
* **Evaluation engine** is shared infrastructure
* **ECU** bills the compute consumed by the engine

## Where to next

* [Models and benchmarks](/5.-select-pick-the-model/models-and-benchmarks.md)
* [Evaluations](/8.-evaluate-score-the-outputs/evaluations-1.md)
* [Traces and spans](/6.-build-wire-your-code/traces-and-spans.md)
* [Judges](/8.-evaluate-score-the-outputs/judges-1.md)
* [Scorers](/8.-evaluate-score-the-outputs/scorers-1.md)
* [Workflow](/1.-introduction/the-stratix-workflow.md)


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available on this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.layerlens.ai/8.-evaluate-score-the-outputs/mental-model.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
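
For example, with Python's `requests` library (the question itself is illustrative):

```python
# Query the documented `ask` endpoint for this page.
import requests

url = "https://docs.layerlens.ai/8.-evaluate-score-the-outputs/mental-model.md"
resp = requests.get(url, params={"ask": "What counts as one ECU?"})
print(resp.text)  # a direct answer plus relevant excerpts and sources
```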
