
# Your first judge

A judge is an LLM that scores subjective dimensions (helpfulness, faithfulness, tone, safety) that a code-based grader can't measure. Building one takes about 15 minutes.

## Prerequisites

* [ ] A Stratix Premium account ([sign up](/2.-get-started/sign-up.md))
* [ ] A clear definition of what you want to grade (write it in plain English first)
* [ ] At least 10 example outputs labeled "good" or "bad" — more is better

## Steps

### 1. Open Judges

In the left rail, open **Agent Evaluation → Judges**, then click **New judge**.

### 2. Name and describe

* **Name:** e.g., "helpfulness-customer-support"
* **Description:** what this judge grades, in one sentence
* **Output type:** binary (pass/fail), score (e.g., 1-5), or labeled (e.g., "helpful" / "neutral" / "unhelpful")
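
To make the three output types concrete, here is a minimal Python sketch of checking that a raw verdict string fits the type you picked. The class names (`Binary`, `Score`, `Labeled`) and the helper `is_valid_verdict` are hypothetical and only for illustration; they are not part of any Stratix API or schema.

```python
from dataclasses import dataclass
from typing import Tuple, Union


# Hypothetical local stand-ins for the three output types (not the Stratix schema).
@dataclass(frozen=True)
class Binary:
    pass


@dataclass(frozen=True)
class Score:
    low: int = 1
    high: int = 5


@dataclass(frozen=True)
class Labeled:
    labels: Tuple[str, ...]


OutputType = Union[Binary, Score, Labeled]


def is_valid_verdict(output_type: OutputType, verdict: str) -> bool:
    """Return True if the raw verdict string fits the chosen output type."""
    text = verdict.strip().lower()
    if isinstance(output_type, Binary):
        return text in {"pass", "fail"}
    if isinstance(output_type, Score):
        # Accept only plain ASCII integers inside the configured range.
        is_int = text.isascii() and text.isdigit()
        return is_int and output_type.low <= int(text) <= output_type.high
    return text in {label.lower() for label in output_type.labels}


print(is_valid_verdict(Score(), "4"))
print(is_valid_verdict(Labeled(("helpful", "neutral", "unhelpful")), "great"))
```

Picking the narrowest type that still expresses your grading intent tends to make verdicts easier to compare against human labels later.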

### 3. Pick a judging model

Choose the model that runs the rubric. The default is a balanced choice; pick a stronger model for harder rubrics.

### 4. Write the rubric

The rubric is the prompt the judging model uses. Best practices:

* State the dimension you're grading
* Describe what "good" looks like with examples
* Describe what "bad" looks like with examples
* Show the output format you want back

The Premium UI provides a starter template you can adapt.
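
If you prefer to draft the rubric outside the UI first, the four best practices above can be sketched as a small prompt builder. This is an illustrative assumption about how a rubric might be laid out, not the starter template Stratix provides; `build_rubric` and the example wording are hypothetical.

```python
def build_rubric(dimension, good_examples, bad_examples, output_format):
    """Assemble a rubric prompt from the four parts: dimension, good, bad, format."""

    def bullets(items):
        return "\n".join(f"- {item}" for item in items)

    sections = [
        f"You are grading: {dimension}",
        "A good response looks like:\n" + bullets(good_examples),
        "A bad response looks like:\n" + bullets(bad_examples),
        "Reply in this format:\n" + output_format,
    ]
    # Blank lines between sections keep the parts visually distinct for the model.
    return "\n\n".join(sections)


# Hypothetical rubric content for a customer-support helpfulness judge.
rubric = build_rubric(
    dimension="helpfulness of a customer-support reply",
    good_examples=["Answers the question asked and gives a concrete next step."],
    bad_examples=["Restates the question or deflects without resolving it."],
    output_format="pass or fail, followed by a one-sentence reason",
)
print(rubric)
```

Keeping the four parts as separate arguments makes it easy to change one part at a time when you iterate.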

### 5. Test on a few examples

Paste 3-5 sample outputs and run the judge. Read the verdicts. If they don't match your intuition, revise the rubric and run the judge again.
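
One way to make "read the verdicts" systematic is to write down the verdict you expect for each sample before running the judge, then list the disagreements. The sketch below assumes you have copied the judge's verdicts out of the UI by hand; the sample data and the `find_disagreements` helper are made up for illustration.

```python
def find_disagreements(samples):
    """Return the samples where the judge's verdict differs from the expected one."""
    return [s for s in samples if s["judge"] != s["expected"]]


# Made-up samples: "expected" is your own call, "judge" is what the judge returned.
samples = [
    {"output": "Your refund was issued today; expect it in 3-5 days.",
     "expected": "pass", "judge": "pass"},
    {"output": "Please contact support.",
     "expected": "fail", "judge": "pass"},
    {"output": "I can't help with that.",
     "expected": "fail", "judge": "fail"},
]

for s in find_disagreements(samples):
    print(f"expected {s['expected']}, judge said {s['judge']}: {s['output']}")
```

Each disagreement usually points at a gap in the rubric, for example a missing "bad" example that resembles the misjudged output.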

### 6. Optional: GEPA-optimize

If you have ≥30 labeled examples, run **GEPA optimization**. Stratix tunes the rubric to better match your labels. Agreement with human labels typically rises by 10-20 percentage points.
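
To see what a change in agreement means in percentage points, the following sketch computes the agreement rate between judge verdicts and human labels before and after a rubric change. It does not call GEPA or any Stratix API; the function names and the verdict lists are hypothetical, and the numbers are invented purely to show the arithmetic.

```python
MIN_LABELED_EXAMPLES = 30  # threshold stated in this step


def agreement(judge_verdicts, human_labels):
    """Fraction of examples where the judge matches the human label."""
    if len(judge_verdicts) != len(human_labels):
        raise ValueError("verdicts and labels must have the same length")
    if not human_labels:
        raise ValueError("need at least one labeled example")
    matches = sum(j == h for j, h in zip(judge_verdicts, human_labels))
    return matches / len(human_labels)


def percentage_point_change(before_rate, after_rate):
    """Difference between two agreement rates, expressed in percentage points."""
    return round((after_rate - before_rate) * 100, 1)


# Made-up verdicts for illustration only; not measured results.
human = ["pass"] * 20 + ["fail"] * 20
before = ["pass"] * 20 + ["fail"] * 8 + ["pass"] * 12  # 28 of 40 match
after = ["pass"] * 20 + ["fail"] * 14 + ["pass"] * 6   # 34 of 40 match

ready = len(human) >= MIN_LABELED_EXAMPLES
change = percentage_point_change(agreement(before, human), agreement(after, human))
print(f"enough labels: {ready}, change: {change} percentage points")
```

Note that a rise from 70% to 85% agreement is 15 percentage points, not 15 percent.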

[More about GEPA optimization](/8.-evaluate-score-the-outputs/judges-1.md#judge-optimization-gepa)

### 7. Save

The judge is now reusable in any evaluation, trace evaluation, or agentic evaluation in your org.

## Verify

You should be able to apply your judge to a single sample input/output and get back a verdict.

## What to try next

* **Apply the judge to your first evaluation.** Open the eval, add the judge, rerun.
* **Apply the judge to a trace evaluation.** Real production data, judged.
* **Build a second judge** for a different dimension. Stratix evaluations let you stack many judges.

## Where to next

* [Tutorial: Build your first judge](/8.-evaluate-score-the-outputs/02-first-judge.md)
* [Tutorial: Optimize a judge with GEPA](/9.-improve-tune-the-system/05-gepa-optimize.md)
* [Concept: Judges](/8.-evaluate-score-the-outputs/judges-1.md)
* [Stratix Premium — Judges](/8.-evaluate-score-the-outputs/judges.md)

