# From Arize

Arize and Stratix overlap on **trace observability + LLM evaluation**, but Arize is broader (it also covers classical-ML observability) while Stratix is LLM-evaluation-first. Migration usually means **keeping Arize for classical-ML observability if you have it** and moving LLM-specific evaluation workflows to Stratix. Most teams complete the LLM cutover in 2–3 weeks, running both systems in parallel.

## Concept mapping

| Arize                             | Stratix                                                                                                                                                                                   | Notes                                                                                                      |
| --------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------- |
| **Space**                         | Project (within Organization)                                                                                                                                                             | Both scope evaluation work                                                                                 |
| **Inference**                     | Trace row                                                                                                                                                                                 | Arize stores one record per prediction; a Stratix trace can hold many predictions plus tool calls and retrieval steps |
| **Span (Phoenix trace)**          | Span (Stratix trace)                                                                                                                                                                      | Both follow OpenInference / OpenTelemetry conventions; field names differ slightly                         |
| **Hierarchical trace tree**       | Stratix trace tree                                                                                                                                                                        | One-to-one mapping; Stratix natively models multi-agent handoffs as spans                                  |
| **Dataset**                       | Custom benchmark                                                                                                                                                                          | Stratix benchmarks are versioned and rerunnable                                                            |
| **Evaluator**                     | Choose: **Scorer** (LLM-prompt, reusable across benchmarks) **or** **Judge** (LLM-rubric, versioned, GEPA-tunable) **or** **Code grader** (deterministic check in the evaluation runtime) |                                                                                                            |
| **Phoenix LLM eval**              | Trace evaluation with a [Judge](/8.-evaluate-score-the-outputs/judges-1.md)                                                                                                               |                                                                                                            |
| **Drift monitor**                 | Continuous trace evaluation + threshold alert                                                                                                                                             |                                                                                                            |
| **Embeddings monitor**            | Stratix continuous evaluation pulls trace samples and runs scorers/judges; embedding-drift specifically is a custom code grader pattern today                                             |                                                                                                            |
| **Performance metrics dashboard** | Stratix Home dashboard + per-evaluation metrics                                                                                                                                           |                                                                                                            |
| **Annotation queue**              | Labeling workflow on trace evaluation → feeds GEPA                                                                                                                                        |                                                                                                            |
| **Trace search**                  | Stratix trace search with structured filters                                                                                                                                              |                                                                                                            |

## What does NOT map cleanly

* **Classical-ML observability** (regression, tabular classification, recommender ranking metrics). Stratix is not a classical-ML observability platform. If you use Arize for both ML and LLM monitoring, keep Arize for ML and move LLM-specific workflows to Stratix.
* **Embedding-drift monitors** require a custom code-grader implementation in Stratix today; the canonical Arize embedding-drift dashboards don't have a direct equivalent.
* **Arize Copilot** doesn't have a direct equivalent; the closest is the Stratix [In-app Assistant](/11.-admin/in-app-assistant.md).

## Migration steps (phased cutover)

### Phase 1 — Inventory (Day 1)

1. List every Arize **space**, **dataset**, **evaluator**, and **drift monitor** that touches LLM workloads.
2. For each evaluator, classify: **LLM-judge**, **LLM-prompt scorer**, **code grader**, **embedding-drift** (special-case).
3. Identify drift monitors with active paging — these need the cleanest cutover.

### Phase 2 — Port traces (Days 2–4)

**Trace export shape.** Arize trace exports follow OpenInference conventions (spans with `attributes.llm.input_messages`, `attributes.llm.output_messages`, `attributes.tool.name`, etc.). Stratix's trace schema is also OpenInference-compatible — the mapping is mostly 1:1 with a few field-name normalizations.
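
For orientation, here is a hedged example of the flattened attribute shape on one exported LLM span. The keys follow OpenInference conventions, but which keys appear varies by instrumentation version, so treat this as illustrative:

```python
# Illustrative OpenInference-style attributes on one exported LLM span.
# The flattened "llm.input_messages.<i>.message.*" keys are the convention;
# exact coverage depends on the instrumentation that produced the span.
span_attributes = {
    "llm.input_messages.0.message.role": "user",
    "llm.input_messages.0.message.content": "What is our refund policy?",
    "llm.output_messages.0.message.role": "assistant",
    "llm.output_messages.0.message.content": "Refunds are available within 30 days.",
    "llm.model_name": "gpt-4o",
    "tool.name": "search_kb",
}
```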

1. **Bulk export.** Pull traces from Arize using Phoenix's `client.get_evaluations(...)` and `client.get_spans_dataframe(...)`. Or, if you're using the Phoenix OTLP endpoint, point your OTLP exporter at Stratix directly going forward.
2. **Normalize field names.** A small Python script reshapes OpenInference attribute names to Stratix's expected shape — see the [Stratix trace schema](/13.1-sdk-and-apis/trace-schema.md). The recipe at [Cookbook: backfill traces from logs](/7.-observe-see-whats-happening/backfill-from-logs.md) covers the normalization pattern; a minimal sketch also follows this list.
3. **Upload via SDK** in batches of ≤ 50 MB JSONL:

```python
from layerlens import Stratix

client = Stratix()

# Upload one JSONL batch file; repeat for each ≤ 50 MB chunk.
client.traces.upload_batch("traces-batch-001.jsonl")
```
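
A minimal export-and-normalize sketch, assuming the export lands in a Phoenix spans dataframe. The Stratix-side field names (`input`, `output`) are illustrative stand-ins; take the real ones from the [Stratix trace schema](/13.1-sdk-and-apis/trace-schema.md):

```python
import json

import phoenix as px

# Pull spans from Phoenix/Arize as a flat dataframe (one row per span).
# Column names below follow the flattened export; verify them against
# spans.columns in your own environment.
spans = px.Client().get_spans_dataframe()

with open("traces-batch-001.jsonl", "w") as f:
    for _, row in spans.iterrows():
        record = {
            # Prefix imported trace IDs to avoid collisions with newly
            # streamed traces (see "Common cutover gotchas" below).
            "trace_id": f"arize-{row['context.trace_id']}",
            "span_id": row["context.span_id"],
            "name": row["name"],
            # Hypothetical Stratix field names; map to the real schema.
            "input": row.get("attributes.llm.input_messages"),
            "output": row.get("attributes.llm.output_messages"),
        }
        f.write(json.dumps(record, default=str) + "\n")
```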

### Phase 3 — Re-author evaluators (Days 4–10)

**For Phoenix LLM evaluators:**

* Identify the rubric prompt
* Decide where it runs: per row inside an evaluation → **Scorer**; directly on traces → **Judge**
* Author via SDK (`client.scorers.create(...)` or `client.judges.create(...)`) or the dashboard; a sketch follows this list
* If you have ≥ 30 labeled examples, run [GEPA optimization](/9.-improve-tune-the-system/judge-optimization.md) on the judge to improve agreement with human labels
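
As a sketch, porting one Phoenix hallucination evaluator might look like the following. `client.judges.create` is the documented entry point, but the parameter names shown (`name`, `rubric`, `model`) are assumptions about its signature; check the [Judges](/8.-evaluate-score-the-outputs/judges-1.md) concept page for the real fields:

```python
from layerlens import Stratix

client = Stratix()

# Port the Phoenix evaluator's rubric prompt into a versioned Stratix judge.
# Parameter names here are assumptions, not confirmed signature.
judge = client.judges.create(
    name="hallucination-judge",
    rubric=(
        "Given the retrieved context and the model's answer, score 1 if every "
        "factual claim in the answer is supported by the context, else 0."
    ),
    model="gpt-4o",
)
```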

**For deterministic Arize evaluators (regex match, numeric thresholds, JSON schema validators):**

* Author as **code graders** in the evaluation runtime
* See [Custom code grader recipe](https://github.com/LayerLens/gitbook-full/blob/main/08-evaluate/cookbook/custom-code-scorer.md)
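
For example, a deterministic JSON-validity evaluator becomes a plain function. How you register it with the evaluation runtime is covered in the recipe above, so this shows only the grader logic itself:

```python
import json

def valid_json_grader(output: str) -> float:
    """Deterministic grader: 1.0 if the output parses as a JSON object, else 0.0."""
    try:
        return 1.0 if isinstance(json.loads(output), dict) else 0.0
    except json.JSONDecodeError:
        return 0.0
```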

**For drift monitors:**

* Replace with **continuous trace evaluation** — schedule the corresponding judge/scorer to run on production trace samples at a cadence (hourly, daily) with threshold alerts on the resulting score distribution
* See [Cookbook: continuous evaluation](/7.-observe-see-whats-happening/postrelease-continuous.md)
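
The replacement pattern, sketched with an assumed scheduling call. `client.evaluations.schedule(...)` and its parameters are illustrative guesses at the SDK shape, not documented API; take the real calls from the cookbook above:

```python
from layerlens import Stratix

client = Stratix()

# Drift-monitor replacement: run the ported judge on an hourly sample of
# production traces and alert on the score distribution. The method and
# every parameter name here are illustrative guesses, not documented API.
client.evaluations.schedule(
    judge="hallucination-judge",
    trace_filter="service = 'support-bot'",   # hypothetical filter syntax
    sample_rate=0.1,                          # score 10% of matching traces
    cadence="hourly",
    alert={"metric": "mean_score", "below": 0.85},
)
```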

### Phase 4 — Dual-run (Days 11–18)

1. Send the same traces to both Arize and Stratix for one full cycle.
2. Compare per-trace scores; investigate divergence > 5% (a comparison sketch follows this list).
3. Adjust Stratix judges / scorers where divergence is rubric-quality, not data-quality.
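
A minimal divergence check, assuming per-trace scores from both systems have been exported as CSVs keyed by trace ID (file and column names are illustrative):

```python
import pandas as pd

# Per-trace scores exported from each system; file and column names
# are illustrative.
arize = pd.read_csv("arize_scores.csv")      # columns: trace_id, score
stratix = pd.read_csv("stratix_scores.csv")  # columns: trace_id, score

merged = arize.merge(stratix, on="trace_id", suffixes=("_arize", "_stratix"))
merged["divergence"] = (merged["score_arize"] - merged["score_stratix"]).abs()

# Flag traces where the systems disagree by more than 5 percentage points.
flagged = merged[merged["divergence"] > 0.05]
print(f"{len(flagged)}/{len(merged)} traces diverge > 5%")
```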

### Phase 5 — Cut over (Day 19+)

1. Switch your OTLP / trace-ingestion sink from Arize to Stratix (or run both if you want classical-ML coverage in Arize alongside LLM evaluation in Stratix); an exporter sketch follows this list.
2. Move alert/paging configurations.
3. Archive the LLM-specific portions of your Arize space; document the cut date for audit.
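
If you ingest via OTLP, the cutover is an exporter endpoint change. A minimal OpenTelemetry Python sketch, assuming Stratix exposes an OTLP/HTTP traces endpoint (the URL and auth header below are hypothetical; use the values from your Stratix project settings):

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

# Point the exporter at Stratix instead of Arize; endpoint and header
# are hypothetical placeholders.
exporter = OTLPSpanExporter(
    endpoint="https://ingest.stratix.example/v1/traces",
    headers={"authorization": "Bearer STRATIX_API_KEY"},
)

provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)
```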

## Common cutover gotchas

* **Trace-id collisions.** If you re-import historical Arize traces and also stream new traces, conflicts can occur. Use a namespace prefix (`arize-`) on imported trace IDs to avoid clashes.
* **Field-name normalization.** Arize sometimes uses Phoenix-specific attribute names; Stratix uses the OpenInference baseline. A small mapping table handles the few divergent fields (see the sketch after this list).
* **Embedding-drift dashboards** don't auto-port. Plan to author this as a custom code grader if you depend on it.
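
The mapping table from the field-name gotcha can literally be a dict. The entries are yours to fill from a diff of your own export, since the divergent names are few and version-dependent:

```python
# Populate from a diff of your own export; the commented entry is a placeholder.
FIELD_MAP: dict[str, str] = {
    # "phoenix.specific.attribute": "openinference.baseline.attribute",
}

def normalize_attrs(attrs: dict) -> dict:
    """Rename divergent attribute keys; pass everything else through unchanged."""
    return {FIELD_MAP.get(key, key): value for key, value in attrs.items()}
```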

## See also

* [Cookbook: backfill traces from logs](/7.-observe-see-whats-happening/backfill-from-logs.md)
* [Cookbook: continuous trace evaluation](/7.-observe-see-whats-happening/postrelease-continuous.md)
* [Custom code grader recipe](https://github.com/LayerLens/gitbook-full/blob/main/08-evaluate/cookbook/custom-code-scorer.md)
* [Stratix trace schema](/13.1-sdk-and-apis/trace-schema.md)
* [Concept: Judges](/8.-evaluate-score-the-outputs/judges-1.md)
* [Concept: Scorers](/8.-evaluate-score-the-outputs/scorers-1.md)


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.layerlens.ai/6.-build-wire-your-code/from-arize.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
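
For example, from Python (the question text is illustrative):

```python
import requests

# Ask the documentation a specific, self-contained question.
resp = requests.get(
    "https://docs.layerlens.ai/6.-build-wire-your-code/from-arize.md",
    params={"ask": "How do I version a custom benchmark in Stratix?"},
)
print(resp.text)
```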
