# Pattern: clinical decision support

A clinical-AI assistant answers physician questions: differential diagnosis, treatment options, and drug-interaction checks grounded in patient context. Used at the point of care, the system reads guidelines, returns a recommendation, and cites the evidence.

This pattern shows how to evaluate that system end-to-end on Stratix.

## What's at stake

| Risk dimension                                 | Magnitude                                          | Framework                        |
| ---------------------------------------------- | -------------------------------------------------- | -------------------------------- |
| Patient harm from contraindicated combinations | Adverse drug events at scale of patient encounters | Clinical incident reporting      |
| Per-incident malpractice exposure              | $250K–$5M                                          | Industry settlement bands        |
| FDA Software as a Medical Device action        | Loss of clearance, market withdrawal               | FDA SaMD guidance                |
| Hospital accreditation impact                  | Joint Commission inquiry, conditional status       | TJC sentinel-event protocol      |
| HIPAA exposure on PHI in trace bodies          | Per-violation civil penalties + breach reporting   | HIPAA Privacy and Security Rules |

## The evaluation pattern

A pre- and post-deployment **agentic evaluation** runs over a curated set of 50–100 traces drawn from representative encounter shapes (rare-disease cases, multi-morbidity, pediatric vs. geriatric weight-based dosing, contraindicated drug combinations, common edge cases).

**Criteria mix (the 70/20/10 healthcare ratio):**

1. **Deterministic rules (\~70%)**

* Every drug pair the AI mentions is checked against an authoritative interaction database. Missing checks = CRITICAL.
* Every dose mention extracts cleanly and matches the prescribing reference exactly.
* PHI-redaction regex confirms no patient identifiers appear in any logged span.

2. **Natural-language assertions (\~20%)** — the AI's recommendation matches the ground-truth label from the case file.
3. **LLM judges (\~10%)**

* **Faithfulness judge** (GEPA-tuned against ≥50 clinician-labeled examples — multi-class severity output) — every clinical claim is grounded in retrieved guideline content.
* **Reasoning soundness judge** (GEPA-tuned against ≥50 examples) — penalize correct answers via flawed reasoning.

> Don't have labels yet? See [Bootstrap a judge before GEPA](https://github.com/LayerLens/gitbook-full/blob/main/08-evaluate/guides/bootstrap-judges.md) for the week-1 setup.
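
The deterministic drug-interaction coverage rule above can be sketched in plain Python. Everything here is illustrative, not the Stratix scorer runtime: `KNOWN_INTERACTIONS` stands in for an authoritative database such as RxNorm, and the lexicon-based `extract_drugs` stands in for clinical NER.

```python
import itertools
import re

# Illustrative stand-in for an authoritative interaction database (e.g. RxNorm-backed).
KNOWN_INTERACTIONS = {
    frozenset({"warfarin", "aspirin"}),
    frozenset({"simvastatin", "clarithromycin"}),
}

# Toy drug lexicon; a real system would use a clinical NER model.
DRUG_LEXICON = {"warfarin", "aspirin", "simvastatin", "clarithromycin", "metformin"}

def extract_drugs(text: str) -> set[str]:
    """Naive lexicon match over lowercase tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {t for t in tokens if t in DRUG_LEXICON}

def uncovered_pairs(output: str, checked_pairs: set[frozenset]) -> list[frozenset]:
    """Return every mentioned drug pair the system never checked.

    Any non-empty result means a missing interaction check: score CRITICAL.
    """
    drugs = extract_drugs(output)
    return [
        frozenset(pair)
        for pair in itertools.combinations(sorted(drugs), 2)
        if frozenset(pair) not in checked_pairs
    ]

answer = "Consider warfarin; avoid combining with aspirin. Metformin is unaffected."
missing = uncovered_pairs(answer, checked_pairs={frozenset({"warfarin", "aspirin"})})
# Pairs involving metformin were mentioned but never checked, so the rule fails the trace.
```

The key property is exhaustiveness: the rule enumerates every pairwise combination of mentioned drugs, so an interaction the AI silently ignored cannot pass.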

**Trace shape captured:**

```
trace
├── span: retrieval (guideline_corpus)
│   └── outputs: chunks[]
├── span: llm (answer-synthesis)
│   ├── inputs: question, chunks
│   └── outputs: answer, citations
└── span: post-process (citation-validation)
```

Score retrieval (precision\@k against ground-truth chunks) and synthesis (faithfulness against retrieved chunks) separately.
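
Retrieval-side scoring can be as simple as precision@k over chunk IDs. A minimal sketch, assuming each chunk carries a stable ID (the IDs below are made up):

```python
def precision_at_k(retrieved_ids: list[str], relevant_ids: set[str], k: int) -> float:
    """Fraction of the top-k retrieved chunks that appear in the ground-truth set."""
    top_k = retrieved_ids[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for chunk_id in top_k if chunk_id in relevant_ids)
    return hits / len(top_k)

retrieved = ["g-12", "g-07", "g-44", "g-03"]   # outputs of the retrieval span
ground_truth = {"g-12", "g-03", "g-90"}        # labeled relevant chunks for this case
score = precision_at_k(retrieved, ground_truth, k=4)  # 2 of 4 hits -> 0.5
```

Scoring the spans separately tells you whether a bad answer came from retrieval (low precision@k) or from synthesis (low faithfulness despite good chunks).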

**Continuous trace evaluation:** the same configuration runs daily on a 5% sample of production traffic, with thresholds wired to in-app and Slack notifications. The `dose-mention` and `drug-interaction-coverage` rules fire a per-trace alert instead of rolling into the daily summary.
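
A 5% sample is typically drawn deterministically from the trace ID, so the in-or-out decision is reproducible across reruns. This hashing scheme is an assumption for illustration, not how Stratix samples internally:

```python
import hashlib

def in_sample(trace_id: str, sample_rate: float) -> bool:
    """Map the trace ID to [0, 1) via a stable hash and compare against the rate."""
    digest = hashlib.sha256(trace_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < sample_rate

# Roughly 5% of traces are selected, and the same trace always gets the same decision.
sampled = [t for t in (f"trace-{i}" for i in range(10_000)) if in_sample(t, 0.05)]
```

Deterministic sampling matters for audit: when an alert fires on a trace, you can show why that trace was in the evaluated set.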

## Configuration in code

```python
# Python (SDK)
from layerlens import Stratix

client = Stratix()

faithfulness = client.judges.create(
    name="clinical-faithfulness",
    evaluation_goal="Every clinical claim in OUTPUT must be supported by chunks in the retrieved guideline content.",
)

drug_interaction_rule = client.scorers.create_code(
    name="drug-interaction-coverage",
    code="result = check_all_drug_pairs(output, db='rxnorm')",
)

trace_eval = client.trace_evaluations.create(
    trace_set={"tags": {"env": "production", "feature": "clinical-qa"}, "sample_rate": 0.05},
    judges=[faithfulness.id],
    scorers=[drug_interaction_rule.id],
    schedule="daily",
)
result = client.trace_evaluations.wait_for_completion(trace_eval.id)
```

```typescript
// TypeScript (REST — TS SDK on roadmap)
const r = await fetch("https://stratix.layerlens.ai/api/v1/trace-evaluations", {
  method: "POST",
  headers: {
    "X-API-Key": process.env.LAYERLENS_STRATIX_API_KEY!,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    trace_set: { tags: { env: "production", feature: "clinical-qa" }, sample_rate: 0.05 },
    judges: [faithfulnessJudgeId],
    scorers: [drugInteractionScorerId],
    schedule: "daily",
  }),
});
const traceEval = await r.json();
```

## What you get

* Pre- and post-deployment evaluations surface contraindicated-combination patterns before patient impact, not 6 weeks into production.
* Each release pin cites the evaluation IDs that gated it — a self-contained audit packet for FDA SaMD and Joint Commission inquiries.
* Continuous trace evaluation catches drift when formulary or guideline updates shift effective knowledge.
* Auditor-ready evaluation evidence is a byproduct, not a separate engineering project.

## Stratix capabilities used

* [Agentic evaluation](/8.-evaluate-score-the-outputs/agentic-evaluation.md) — three criteria types in one configuration
* [Judges with GEPA optimization](/8.-evaluate-score-the-outputs/judges-1.md) — faithfulness and reasoning soundness
* [Custom code graders](https://github.com/LayerLens/gitbook-full/blob/main/08-evaluate/cookbook/custom-code-scorer.md) — drug-interaction database lookup, dose-mention extraction
* [Span-level scoring](https://github.com/LayerLens/gitbook-full/blob/main/08-evaluate/cookbook/span-level-scoring.md) — retrieval and synthesis scored separately
* [Trace evaluations](/8.-evaluate-score-the-outputs/trace-evaluations.md) — continuous on production
* [Notifications](https://github.com/LayerLens/gitbook-full/blob/main/13-reference/sdk-python/notifications.md) — Slack and in-app routing

## Replicate this

* [Industry → Healthcare](/4.2-industry-use-cases/healthcare.md)
* [Healthcare scenarios](https://github.com/LayerLens/gitbook-full/blob/main/04-use-cases/industry/healthcare/scenarios.md) — six clinical-AI scenarios with full eval criteria
* [Healthcare evaluation patterns](https://github.com/LayerLens/gitbook-full/blob/main/04-use-cases/industry/healthcare/eval-patterns.md) — the 70/20/10 rule + GEPA labeling guide
* [Healthcare compliance](https://github.com/LayerLens/gitbook-full/blob/main/04-use-cases/industry/healthcare/compliance.md) — HIPAA / BAA, FDA SaMD, state regs
* [Cookbook: clinical Q\&A safety judge](https://github.com/LayerLens/gitbook-full/blob/main/04-use-cases/industry/healthcare/cookbook/industry-healthcare-clinical-qa.md)

