# Browse benchmarks

The Stratix Public **Benchmarks** catalog is the broadest openly browsable collection of LLM benchmarks available today. Each benchmark entry includes metadata, methodology notes, and per-model scores.

## Steps

### 1. Open the catalog

Go to [`stratix.layerlens.ai/benchmarks`](https://stratix.layerlens.ai/benchmarks).

### 2. Filter

* **Capability** — reasoning, code, math, multilingual, vision, multi-turn
* **Difficulty** — easy / medium / hard
* **Sample size** — small / medium / large
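If you are scripting against exported catalog data rather than the web UI, the three facets above amount to a simple conjunctive filter over benchmark records. A minimal sketch in Python; the records and field names are illustrative assumptions, not the actual Stratix schema:

```python
# Illustrative benchmark records; field names are assumptions, not the real schema.
benchmarks = [
    {"name": "GSM8K", "capability": "math", "difficulty": "medium", "sample_size": "large"},
    {"name": "HumanEval", "capability": "code", "difficulty": "medium", "sample_size": "small"},
    {"name": "MMLU", "capability": "reasoning", "difficulty": "hard", "sample_size": "large"},
]

def filter_benchmarks(records, **facets):
    """Keep only records that match every facet given (capability, difficulty, sample_size)."""
    return [r for r in records if all(r.get(key) == value for key, value in facets.items())]

print([b["name"] for b in filter_benchmarks(benchmarks, capability="code")])
```

Each keyword argument narrows the result, mirroring how the UI filters combine: a record must match all selected facets, not just one.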

### 3. Open a benchmark

Click any benchmark card. The page shows:

* Description and methodology
* Sample tasks
* Per-model scores
* Top performers
* A chart of score history over time
* Public evaluations that ran this benchmark

### 4. Pick a model and see its score

From the benchmark page, click any model in the score table to open that model's results on this benchmark.

## Verify

You should be able to find MMLU, HumanEval, and GSM8K on the first page.

## How to pick a benchmark for your use case

Anti-pattern: "everyone uses MMLU, so I'll use MMLU."

Instead:

* **Reasoning-heavy task?** Look at GSM8K, MATH, ARC.
* **Code?** HumanEval, MBPP, SWE-Bench.
* **Multi-turn dialog?** MT-Bench.
* **Multilingual?** MMLU translated, FLORES.
* **Tool use / agents?** ToolBench, AgentBench.
* **Reading comprehension?** SQuAD, DROP.

## Where to next

* [Browse models](/2.-get-started/browse-models.md)
* [Compare two models](/2.-get-started/compare-two-models.md)
* [Stratix Public — Benchmarks catalog reference](/5.-select-pick-the-model/benchmarks-catalog.md)
* [Concept: Models and benchmarks](/5.-select-pick-the-model/models-and-benchmarks.md)


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available on this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.layerlens.ai/2.-get-started/browse-benchmarks.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present on the current page, when you need clarification or additional context, or when you want to retrieve related documentation sections.
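For example, a client could construct the request URL as follows. This is a minimal sketch using only the Python standard library; the URL pattern is the one shown above, and the key detail is percent-encoding the natural-language question:

```python
from urllib.parse import urlencode

# Page URL from the pattern shown above.
BASE = "https://docs.layerlens.ai/2.-get-started/browse-benchmarks.md"

def ask_url(question: str) -> str:
    """Build the ?ask= query URL, percent-encoding the natural-language question."""
    return f"{BASE}?{urlencode({'ask': question})}"

print(ask_url("Which benchmarks cover tool use?"))
```

Fetching the resulting URL (for example with `urllib.request.urlopen`) returns the answer with excerpts and sources; the exact response format is not specified on this page.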
