# Pattern: automated property valuation

A proptech company or lender uses AI to generate property valuations from comparable sales, property features, listing photos, and market data. Banking partners require fair-housing-compliant valuation accuracy; replicating historical appraisal bias is a non-starter.

This pattern shows how to evaluate AVM accuracy and fair-housing parity together.

## What's at stake

| Risk dimension                                                  | Magnitude                                          | Framework                              |
| --------------------------------------------------------------- | -------------------------------------------------- | -------------------------------------- |
| Fair-housing exposure on neighborhood-level valuation disparity | Per-violation civil penalties; consent-decree cost | Fair Housing Act / HUD enforcement     |
| Banking-partner rejection of an AVM with documented bias        | Loss of integration agreements, revenue impact     | OCC / FRB safety-and-soundness reviews |
| Bad-loan losses from inaccurate valuations                      | $50K–$500K per loan in worst-case tail outcomes    | Industry mortgage-loss benchmarks      |
| Brookings-documented appraisal-undervaluation pattern           | $48K average undervaluation in Black neighborhoods | Brookings 2018 / HUD studies           |

## The evaluation pattern

A **fairness-aware evaluation** runs the AVM against a labeled dataset spanning diverse neighborhoods, price ranges, and property types.

1. **Numeric scorer (valuation accuracy)** — AVM estimate within ±5% of appraiser ground truth.
2. **Custom code grader (geographic disparity)** — per-Census-tract or per-MSA disparity in median accuracy. Disparity above the configured tolerance (commonly 1.25× ratio) flags as a regression.
3. **Demographic-correlation scorer** — accuracy disparity must not correlate with protected-class proxies (race demographics, income tier).
4. **Image-condition assessment scorer** — for AVMs that incorporate listing photos, condition-extraction accuracy from images is scored separately.
5. **Comparable-selection rationale judge** (GEPA-tuned against ≥50 appraiser-labeled examples — scored output) — the AI's choice of comparables is defensible against an appraiser's choice.

> Don't have labels yet? See [Bootstrap a judge before GEPA](https://github.com/LayerLens/gitbook-full/blob/main/08-evaluate/guides/bootstrap-judges.md) for the week-1 setup.

**Continuous trace evaluation:** sampled at 1% of production valuations, daily. Disparity trends are surfaced to compliance and risk committees.

## Configuration in code

```python
# Python (SDK)
from layerlens import Stratix

client = Stratix()

# Per-tract disparity in median accuracy; ratios above 1.25x fail.
geo_disparity = client.scorers.create_code(
    name="geographic-disparity",
    code="""
by_tract = group_by_census_tract(traces, scores)
ratio = max_median(by_tract) / min_median(by_tract)
result = {'passed': ratio <= 1.25, 'disparity_ratio': ratio}
""",
)

# Pass when the AVM estimate is within ±5% of the appraiser's value.
valuation_accuracy = client.scorers.create_code(
    name="valuation-accuracy",
    code="result = {'passed': abs(output['estimate'] - expected['appraiser_value']) / expected['appraiser_value'] <= 0.05}",
)

comparable_judge = client.judges.create(
    name="comparable-selection",
    evaluation_goal="Score 1-5: is the AI's choice of comparable sales defensible against an appraiser's selection?",
)

# Daily evaluation over a 1% sample of production AVM traces.
trace_eval = client.trace_evaluations.create(
    trace_set={"tags": {"feature": "avm"}, "sample_rate": 0.01},
    scorers=[geo_disparity.id, valuation_accuracy.id],
    judges=[comparable_judge.id],
    schedule="daily",
)
```
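The demographic-correlation scorer from the pattern list can be registered the same way. This is a sketch only: `pearson`, `median_accuracy`, and `minority_share_by_tract` are assumed helpers analogous to `group_by_census_tract` above, not documented Stratix functions, and the 0.3 tolerance is illustrative. Add the resulting scorer ID to the `scorers` list of the trace evaluation above.

```python
demographic_correlation = client.scorers.create_code(
    name="demographic-correlation",
    code="""
# Assumed helpers, analogous to group_by_census_tract above.
by_tract = group_by_census_tract(traces, scores)
r = pearson(median_accuracy(by_tract), minority_share_by_tract(by_tract))
result = {'passed': abs(r) <= 0.3, 'correlation': r}
""",
)
```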

```typescript
// TypeScript (REST)
const r = await fetch("https://stratix.layerlens.ai/api/v1/trace-evaluations", {
  method: "POST",
  headers: {
    "X-API-Key": process.env.LAYERLENS_STRATIX_API_KEY!,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    trace_set: { tags: { feature: "avm" }, sample_rate: 0.01 },
    scorers: [geoDisparityId, valuationAccuracyId],
    judges: [comparableJudgeId],
    schedule: "daily",
  }),
});
```

## What you get

* Fair-housing disparity becomes a measurement on the dashboard, not a discovery during a regulatory review.
* Banking-partner integration approvals come faster — the evaluation evidence packet shrinks the diligence timeline.
* Per-comp rationale auditability for borrower-facing valuation explanations.
* Pre- and post-deployment block on any model variant that worsens disparity.
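The disparity block in the last bullet can be sketched as a plain comparison of candidate vs. baseline disparity ratios. The function name and the `slack` parameter are illustrative assumptions; the 1.25× cap mirrors the geographic-disparity scorer's tolerance:

```python
def disparity_gate(baseline_ratio, candidate_ratio, hard_cap=1.25, slack=0.01):
    """Block promotion when a candidate model worsens geographic disparity.

    hard_cap mirrors the 1.25x tolerance used by the geographic-disparity
    scorer; slack absorbs sampling noise on the daily 1% trace sample.
    """
    regressed = candidate_ratio > baseline_ratio + slack
    over_cap = candidate_ratio > hard_cap
    return {
        "deploy": not (regressed or over_cap),
        "regressed": regressed,
        "over_cap": over_cap,
    }


# A candidate at 1.31x disparity vs. a 1.18x baseline is blocked:
decision = disparity_gate(baseline_ratio=1.18, candidate_ratio=1.31)
```

In CI, a gate like this would read both ratios from the latest evaluation runs and fail the pipeline on `deploy == False`.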

## Stratix capabilities used

* [Custom code graders](https://github.com/LayerLens/gitbook-full/blob/main/08-evaluate/cookbook/custom-code-scorer.md) — geographic-disparity and demographic-correlation
* [Judges with GEPA optimization](/8.-evaluate-score-the-outputs/judges-1.md) — comparable-selection rationale
* [Compare models](/5.-select-pick-the-model/compare-models.md) — variant selection with fair-housing criteria
* [Trace evaluations](/8.-evaluate-score-the-outputs/trace-evaluations.md) — continuous sampled evaluation of production traces

## Replicate this

**Get started:** [Workflow: Govern](/9.-improve-tune-the-system/workflow.md) describes the cross-team gate this pattern sits under.

* [Industry → Real estate](/4.2-industry-use-cases/real-estate.md)
* [Use case: AI quality gates in CI/CD](/4.1-general-use-cases/ai-quality-gates-cicd.md)
* [Concept: Continuous evaluation](/7.-observe-see-whats-happening/continuous-evaluation.md)

