# Scorers

A **scorer** in Stratix is an **LLM-backed grader** — a model plus an evaluation prompt — applied to an evaluation row. Scorers and judges share the LLM-evaluation surface; the two are separated by **lifecycle and usage**, not by implementation kind.

|                | Scorer                                                 | Judge                                         |
| -------------- | ------------------------------------------------------ | --------------------------------------------- |
| Implementation | LLM (model + prompt)                                   | LLM (model + rubric)                          |
| Versioning     | Immutable                                              | Versioned with execution history              |
| Where it runs  | Inside an evaluation run (benchmark or custom dataset) | Standalone, against traces or evaluation runs |
| Optimization   | n/a                                                    | GEPA-tunable against labeled examples         |
| Common use     | Reusable rubric you apply across many evaluations      | Subjective dimension you tune over time       |

Both call an LLM with a prompt and return a score; the distinction is where each one lives in the workflow.

## Anatomy of a scorer

A scorer is a record with the following fields (sketched in code after the list):

* **Name** (3–64 chars) and **description** (10–500 chars)
* **Model** — the LLM that runs the prompt
* **Prompt** — the evaluation instructions, including how to interpret inputs and return a score
* Optional **organization\_id** / **project\_id** for tenant-scoped scorers
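
The record maps naturally onto a small data structure. The sketch below is illustrative only: the field names mirror the list above, but the dataclass and the example values are assumptions, not the actual Stratix SDK objects.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative sketch only: field names mirror the record described above,
# but this dataclass is not the actual Stratix SDK object.
@dataclass
class Scorer:
    name: str                               # 3-64 characters
    description: str                        # 10-500 characters
    model: str                              # the LLM that runs the prompt
    prompt: str                             # evaluation instructions with {{placeholders}}
    organization_id: Optional[str] = None   # optional tenant scoping
    project_id: Optional[str] = None

factual_accuracy = Scorer(
    name="factual-accuracy-1-5",
    description="Rates OUTPUT 1-5 for factual accuracy against EXPECTED.",
    model="gpt-4o",
    prompt="Rate the OUTPUT on a 1-5 scale ...\nEXPECTED: {{expected}}\nOUTPUT: {{output}}",
)
```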

## When to author a scorer

* You have a scoring rubric you want to **reuse across many benchmarks** in your org.
* The rubric is stable — you don't expect to iterate on it with labeled examples.
* You want to apply it as part of a benchmark evaluation, not against traces.

When the rubric needs labeled-example tuning, version history, or trace-level evaluation, **use a** [**judge**](/8.-evaluate-score-the-outputs/judges-1.md) **instead.**

## Authoring patterns

**Rubric-style prompt**

```
Rate the OUTPUT on a 1–5 scale for factual accuracy against the EXPECTED answer.
5 = every claim correct.
3 = mostly correct with one minor error.
1 = materially wrong.
Return only the integer.

EXPECTED: {{expected}}
OUTPUT: {{output}}
```
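
Before the scorer's model is called, the `{{expected}}` and `{{output}}` placeholders are filled from the evaluation row. A minimal sketch of that substitution, assuming a plain dictionary row and naive string replacement (the real runtime may use a different templating syntax):

```python
RUBRIC_TEMPLATE = (
    "Rate the OUTPUT on a 1-5 scale for factual accuracy against the EXPECTED answer.\n"
    "Return only the integer.\n\n"
    "EXPECTED: {{expected}}\n"
    "OUTPUT: {{output}}"
)

def render_prompt(template: str, row: dict) -> str:
    """Substitute {{field}} placeholders with values from the evaluation row."""
    for key, value in row.items():
        template = template.replace("{{" + key + "}}", str(value))
    return template

row = {"expected": "Paris", "output": "The capital of France is Paris."}
print(render_prompt(RUBRIC_TEMPLATE, row))
```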

**Pass/fail prompt**

```
Does OUTPUT contain a citation that exists in CONTEXT?
Return "yes" or "no" only.

CONTEXT: {{context}}
OUTPUT: {{output}}
```
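
The yes/no reply usually needs to be normalized before it can be combined with other scorers. A small sketch, assuming the reply maps to 1.0/0.0 and that anything unparseable counts as a failure (both are assumptions, not documented behavior):

```python
def parse_pass_fail(reply: str) -> float:
    """Map a "yes"/"no" model reply to 1.0/0.0."""
    answer = reply.strip().lower().rstrip(".")
    if answer == "yes":
        return 1.0
    # Assumption: "no" and any unparseable reply both count as a fail.
    return 0.0
```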

**Structured-judgment prompt**

```
Evaluate OUTPUT and return a JSON object:
{
  "score": <0–1>,
  "violations": [...],
  "rationale": "..."
}

INPUT: {{input}}
OUTPUT: {{output}}
```
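
The structured reply has to be parsed and validated before its `score` can be used. A hedged sketch of one way to do that, tolerating extra prose around the JSON and clamping the score to [0, 1] (the actual runtime's parsing rules may differ):

```python
import json

def parse_structured_judgment(reply: str) -> dict:
    """Extract the JSON object from the reply and normalize its fields."""
    start, end = reply.find("{"), reply.rfind("}") + 1  # tolerate surrounding prose
    judgment = json.loads(reply[start:end])
    judgment["score"] = min(1.0, max(0.0, float(judgment["score"])))
    judgment.setdefault("violations", [])
    judgment.setdefault("rationale", "")
    return judgment
```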

## Scorer composition in evaluations

An evaluation can stack multiple scorers. Each produces a per-row verdict; the evaluation's overall score is then configurable (each mode is sketched after the list):

* **All-pass** — every scorer must pass
* **Any-pass** — at least one scorer must pass
* **Mean** — average score across scorers
* **Custom weighting** — weighted aggregation
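
A sketch of the four modes over one row's per-scorer scores, assuming scores are normalized to [0, 1] and that "pass" means clearing a threshold (the threshold and the normalization are assumptions, not documented behavior):

```python
def aggregate(scores: list[float], mode: str = "mean",
              weights: list[float] | None = None,
              pass_threshold: float = 0.5) -> float:
    """Combine one row's per-scorer scores into an overall score."""
    if mode == "all_pass":
        return 1.0 if all(s >= pass_threshold for s in scores) else 0.0
    if mode == "any_pass":
        return 1.0 if any(s >= pass_threshold for s in scores) else 0.0
    if mode == "mean":
        return sum(scores) / len(scores)
    if mode == "weighted":
        if weights is None or len(weights) != len(scores):
            raise ValueError("weighted aggregation needs one weight per scorer")
        return sum(w * s for w, s in zip(weights, scores)) / sum(weights)
    raise ValueError(f"unknown aggregation mode: {mode}")
```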

## Org-scoped library

Org-scoped scorers are reusable across every benchmark in your workspace. System scorers shipped by LayerLens are available across all orgs.

## Deterministic / code-based graders

Some evaluation needs are inherently deterministic — Flesch-Kincaid grade level, JSON-schema validity, regex match, statistical fairness ratios, citation-existence database lookups. These do **not** fit the Scorer model (which is LLM-prompt-driven). Treat them as separate **code graders**, computed by the evaluation runtime independently from scorers. See [Custom code grader](https://github.com/LayerLens/gitbook-full/blob/main/08-evaluate/cookbook/custom-code-scorer.md) for the pattern.
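
A hedged sketch of what such code graders could look like, using JSON validity and a regex match as the deterministic checks (the `row -> float` signature and the citation pattern are assumptions, not the documented runtime contract):

```python
import json
import re

def json_validity_grader(row: dict) -> float:
    """Deterministic grader: 1.0 if the row's output parses as JSON, else 0.0."""
    try:
        json.loads(row["output"])
        return 1.0
    except (json.JSONDecodeError, KeyError, TypeError):
        return 0.0

def regex_match_grader(row: dict, pattern: str = r"\[\d+\]") -> float:
    """Deterministic grader: 1.0 if the output contains a bracketed citation marker."""
    return 1.0 if re.search(pattern, row.get("output", "")) else 0.0
```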

## Where to next

* [Judges](/8.-evaluate-score-the-outputs/judges-1.md) — for tunable, versioned, trace-level evaluation
* [Stratix Premium — Scorers](/8.-evaluate-score-the-outputs/scorers.md)
* [Evaluations](/8.-evaluate-score-the-outputs/evaluations-1.md)
* [SDK reference — Scorers](/8.-evaluate-score-the-outputs/scorers-1.md)


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available on this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.layerlens.ai/8.-evaluate-score-the-outputs/scorers-1.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
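
For example, from Python the question only needs to be URL-encoded into the `ask` parameter. A minimal sketch using the `requests` library (the question text is just an illustration):

```python
import requests

url = "https://docs.layerlens.ai/8.-evaluate-score-the-outputs/scorers-1.md"
question = "Can a scorer be scoped to a single project?"
response = requests.get(url, params={"ask": question})  # requests URL-encodes the value
print(response.text)  # direct answer plus relevant excerpts and sources
```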
