# Judge Optimization (GEPA)

{% hint style="info" %}
**Available in Stratix Premium.** This surface is part of the logged-in workspace at [stratix.layerlens.ai](https://stratix.layerlens.ai). Stratix Public users can browse the catalog but cannot use this feature.
{% endhint %}

GEPA optimization tunes a judge's rubric prompt against a labeled ground-truth set. The result is a rubric that better matches your team's actual quality bar — typically a 10-20 percentage-point lift in human-agreement rate.

## What GEPA does

GEPA explores prompt variations, runs each against your labeled examples, and picks the variation with the highest agreement rate.
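The search loop above can be sketched in a few lines. This is an illustrative greedy sketch, not the actual GEPA implementation; `mutate` and `run_judge` are hypothetical callables you would supply (a prompt-variation generator and a judge invocation, respectively):

```python
def optimize_rubric(base_rubric, mutate, run_judge, examples, iterations=20):
    """Greedy sketch of a prompt-optimization loop: propose a rubric
    variation, score its agreement against the human labels, keep the
    best candidate seen so far."""

    def agreement(rubric):
        # Fraction of labeled examples where the judge's verdict
        # matches the human label.
        hits = sum(run_judge(rubric, ex) == ex["label"] for ex in examples)
        return hits / len(examples)

    best, best_score = base_rubric, agreement(base_rubric)
    for _ in range(iterations):
        candidate = mutate(best)
        score = agreement(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score
```

The real optimizer is more sophisticated about how it proposes variations, but the scoring target is the same: agreement with your labels.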

## Prerequisites

* A built judge (see [Judges](/8.-evaluate-score-the-outputs/judges.md))
* ≥30 labeled examples — input/output pairs with the human verdict you'd want the judge to produce
* ECU credits to run the optimization
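A labeled example pairs a judged input/output with the verdict a human would give. The field names below are illustrative only; check the dataset documentation for the exact schema your upload expects:

```python
# Illustrative shape of one labeled example (field names assumed,
# not the confirmed upload schema).
labeled_example = {
    "input": "Summarize the refund policy for a frustrated customer.",
    "output": "Refunds are processed within 5 business days of approval.",
    "label": "pass",  # the human verdict the judge should reproduce
}
```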

## Running GEPA

### From the UI

1. Open the judge in **Premium → Agent Evaluation → Judges**
2. Click **Optimize**
3. Upload or select your labeled examples
4. Configure: number of iterations, target metric (agreement, F1, accuracy)
5. Run

### From the SDK

```python
from layerlens import Stratix

client = Stratix()

# Start an optimization run against a labeled ground-truth dataset
opt = client.judge_optimizations.create(
    judge_id="judge_abc",
    labeled_examples_dataset_id="dataset_xyz",
    iterations=20,
)

# Block until the run finishes, then compare agreement rates
result = client.judge_optimizations.wait_for_completion(opt.id)
print(f"Agreement before: {result.before}, after: {result.after}")
```

## Reading the result

The optimization result shows:

* **Agreement rate before** — how often the original rubric matched your labels
* **Agreement rate after** — how often the optimized rubric matches
* **Diff** — what changed in the rubric prompt
* **Per-iteration history** — the trajectory of the optimization
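The headline number is the gap between the before and after rates. A minimal helper for reporting that gap, assuming the rates are fractions as printed in the SDK example above:

```python
def lift_pct_points(before: float, after: float) -> float:
    """Percentage-point improvement in agreement rate,
    e.g. 0.62 -> 0.79 is a 17.0-point lift."""
    return round((after - before) * 100, 1)
```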

## When to re-optimize

* When you add more labeled examples
* When your team's quality bar shifts (you label things differently than before)
* When the judging model is upgraded

## How GEPA fits in agentic evaluation

For agentic evaluations, GEPA-optimized judges keep your subjective bar honest. An out-of-the-box judge encodes a generic quality bar that drifts from yours; a tuned judge stays anchored to the verdicts your team actually labeled.

## Where to next

* [Concept: Judges](/8.-evaluate-score-the-outputs/judges-1.md)
* [Tutorial: Optimize a judge with GEPA](/9.-improve-tune-the-system/05-gepa-optimize.md)
* [Judges](/8.-evaluate-score-the-outputs/judges.md)


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.layerlens.ai/9.-improve-tune-the-system/judge-optimization.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
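Questions must be URL-encoded before being passed as the `ask` parameter. A minimal sketch of building the request URL with the Python standard library (the page URL is taken from this document; the helper name is illustrative):

```python
from urllib.parse import quote

PAGE_URL = "https://docs.layerlens.ai/9.-improve-tune-the-system/judge-optimization.md"

def ask_url(question: str) -> str:
    """Build the documentation-query URL with a URL-encoded question."""
    return f"{PAGE_URL}?ask={quote(question)}"

print(ask_url("How many labeled examples does GEPA need?"))
```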
