

ZARA Fixtures Dashboard

This document explains how the ZARA Fixtures system works end‑to‑end, how the ZARA Fixtures Dashboard is built, and how to add or modify fixtures safely. It is intended for the systems‑info library and assumes basic familiarity with ZARA, Hecate, and Docusaurus.

1. What the ZARA Fixtures system is

ZARA Fixtures are deterministic, repeatable test cases used to validate that the ZARA LLM:

  • Produces schema‑valid JSON output
  • Does not hallucinate beyond provided context
  • Correctly resolves tables, bundles, and columns
  • Produces actionable, high‑quality engineering instructions

Each fixture simulates a real documentation slice and runs ZARA against it under controlled conditions.

The results are:

  • Persisted as NDJSON history
  • Aggregated into JSON snapshots
  • Visualized in the ZARA Fixtures Dashboard
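The NDJSON-to-JSON step can be sketched as follows. This is a minimal illustration, not the actual aggregation code: the `HistoryRecord` shape, the `timestamp` field name, and both function names are assumptions made for the example.

```typescript
// Hypothetical record shape; the real history records likely carry more fields.
interface HistoryRecord {
  fixtureId: string;
  timestamp: string; // assumed ISO-8601 UTC string; the real field name may differ
  status: "PASS" | "WARN" | "FAIL";
}

// Parse NDJSON text: one JSON object per non-empty line.
function parseNdjson(text: string): HistoryRecord[] {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as HistoryRecord);
}

// Group runs by fixture, oldest first, as one possible aggregated snapshot.
function aggregateByFixture(
  records: HistoryRecord[]
): Record<string, HistoryRecord[]> {
  // Same-format ISO-8601 UTC strings sort chronologically as plain strings.
  const sorted = [...records].sort((a, b) =>
    a.timestamp < b.timestamp ? -1 : a.timestamp > b.timestamp ? 1 : 0
  );
  const byFixture: Record<string, HistoryRecord[]> = {};
  for (const rec of sorted) {
    const runs = byFixture[rec.fixtureId] ?? [];
    runs.push(rec);
    byFixture[rec.fixtureId] = runs;
  }
  return byFixture;
}
```

Because the raw history is append-only, an aggregation like this can be rerun from scratch at any time without touching the source file.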

2. High‑level architecture

MDX slice + hints
        ↓
run-fixtures.ts
        ↓
ZARA LLM (strict JSON schema)
        ↓
Metrics + flags + derived scores
        ↓
history.ndjson → history.json
        ↓
ZARA Fixtures Dashboard (Docusaurus)

Key properties:

  • No golden outputs (no expected.json required)
  • Failures are signal, not regressions
  • All metrics are computed automatically

3. Folder structure

All ZARA fixture logic lives under:

scripts/hecate/zara/

Important subfolders:

fixtures/
  01-table-contract/
    input.json
    excerpt.mdx
  02-ambiguity-handling/
  03-code-and-admonitions/

fixtures/results/
  history.ndjson         # append-only raw history
  history.json           # aggregated for dashboard

run-fixtures.ts          # main runner
llm-generate.ts          # ZARA LLM call
zaraOutputSchema.json    # strict output schema

Unused or future fixtures should live in:

scripts/hecate/zara/unused-fixtures/

They are intentionally not executed.

4. What a fixture is

A fixture represents one test scenario and consists of:

Required files

input.json

Defines metadata and hints used by ZARA:

{
  "fixtureId": "01-table-contract",
  "name": "Table contract resolution",
  "specId": "input-hub-general",
  "needType": "Task",
  "headingText": "Signal Registry",
  "tableHints": ["Signals table"],
  "tableBundles": [ ... ]
}

This file is machine‑only and must always be valid JSON.
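A pre-flight check for input.json could look roughly like the sketch below. The field names come from the example above; treating every field as required, the `validateFixtureInput` name, and the error wording are assumptions, and the element shape of `tableBundles` is deliberately left open because this document does not specify it.

```typescript
// Field names follow the example above; which of them are truly required is an assumption.
interface FixtureInput {
  fixtureId: string;
  name: string;
  specId: string;
  needType: string;
  headingText: string;
  tableHints: string[];
  tableBundles: unknown[]; // element shape is not specified in this document
}

// Returns a list of problems; an empty list means the file passed this basic check.
function validateFixtureInput(raw: string): string[] {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch (err) {
    return [`invalid JSON: ${(err as Error).message}`];
  }
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    return ["top-level value must be an object"];
  }
  const obj = data as Record<string, unknown>;
  const errors: string[] = [];
  for (const key of ["fixtureId", "name", "specId", "needType", "headingText"]) {
    if (typeof obj[key] !== "string") errors.push(`${key} must be a string`);
  }
  for (const key of ["tableHints", "tableBundles"]) {
    if (!Array.isArray(obj[key])) errors.push(`${key} must be an array`);
  }
  return errors;
}
```

Returning a list instead of throwing on the first problem lets a runner report every issue in a fixture at once.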


excerpt.mdx

A verbatim slice of documentation that ZARA will see.

Rules:

  • Must be valid MDX
  • Should match real docs structure
  • May include tables, code blocks, admonitions
  • No frontmatter (this is not a real page)
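The "no frontmatter" rule is easy to check mechanically. The sketch below is one possible check, with `hasFrontmatter` as a hypothetical helper name; it only detects a leading `---` block and does not validate MDX syntax.

```typescript
// Detects a leading frontmatter block: a first line of `---` followed later
// by a closing `---` line. A `---` thematic break further down is not flagged.
function hasFrontmatter(mdx: string): boolean {
  const lines = mdx.replace(/^\uFEFF/, "").split(/\r?\n/);
  if (lines[0]?.trim() !== "---") return false;
  return lines.slice(1).some((line) => line.trim() === "---");
}
```

A runner could call this on each excerpt.mdx and skip or flag fixtures where it returns true.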

Example:

### Object

| Field | Type | Required |
|-------|------|----------|
| id | UUID | yes |

> ⚠️ The `id` must be stable across updates.

This is the single most important input to ZARA.

5. How excerpts should be written

Golden rules

  • Copy from real documentation whenever possible
  • Keep it small but complete
  • Include ambiguity when testing ambiguity handling
  • Include real‑world messiness (notes, warnings, partial tables)

What to test with excerpts

| Goal | Include |
|------|---------|
| Table resolution | Multiple tables + references |
| Ambiguity | Vague wording, missing constraints |
| Hallucination | Things ZARA must not invent |
| Code handling | Code blocks + prose around them |

6. Running fixtures

Local run

export OPENAI_API_KEY=sk-...
npm run zara:fixtures

Outputs:

  • fixtures/results/history.ndjson
  • Per‑fixture debug artifacts (last-result.json, last-output.json)

Nightly run (CI)

  • Runs via GitHub Actions
  • Appends to history
  • Publishes aggregated JSON
  • Fails CI depending on ZARA_FIXTURES_FAIL_ON

Environment controls:

| ZARA_FIXTURES_FAIL_ON | Effect |
|-----------------------|--------|
| none | Never fail |
| warn | Fail on WARN or FAIL |
| fail (default) | Fail only on FAIL |
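The policy in the table above can be expressed as a small pure function. This is a sketch, not the runner's actual code: `parseFailOn` and `shouldFailCi` are hypothetical names, and falling back to the default for unrecognized values is an assumption.

```typescript
type RunStatus = "PASS" | "WARN" | "FAIL";
type FailOn = "none" | "warn" | "fail";

// Normalize the raw variable value. "fail" is the documented default;
// treating unknown values as the default is an assumption.
function parseFailOn(value: string | undefined): FailOn {
  if (value === "none" || value === "warn" || value === "fail") return value;
  return "fail";
}

// Decide whether CI should fail, given the statuses of all fixtures in a run.
function shouldFailCi(statuses: RunStatus[], failOn: FailOn): boolean {
  if (failOn === "none") return false;
  if (failOn === "warn") {
    return statuses.some((s) => s === "WARN" || s === "FAIL");
  }
  return statuses.some((s) => s === "FAIL");
}
```

In a Node-based runner the raw value would presumably come from `process.env.ZARA_FIXTURES_FAIL_ON`, with the boolean result mapped to a non-zero exit code.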

7. Metrics explained

Each fixture run produces:

Metrics

  • schemaValid: JSON schema compliance
  • hallucinations: conservative count
  • clarificationsNeeded: number of clarifying questions
  • actionableBullets: usable instructions
  • tablesDetected / resolved
  • columnsDetected / resolved

Derived

  • bundleResolutionPct
  • columnResolutionPct
  • qualityScore (0‑100)
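The two resolution percentages are presumably resolved-over-detected ratios. The sketch below shows one way to compute such a value; the `resolutionPct` name, the one-decimal rounding, and the choice to report 100 when nothing was detected are all assumptions. The qualityScore formula is not described in this document, so it is not sketched here.

```typescript
// Percentage of detected items that were resolved, rounded to one decimal.
// Assumption: with nothing detected there is nothing to resolve, so report 100.
function resolutionPct(resolved: number, detected: number): number {
  if (detected <= 0) return 100;
  const pct = (resolved / detected) * 100;
  // Clamp to 0..100 in case counts are inconsistent, then round to one decimal.
  return Math.round(Math.min(100, Math.max(0, pct)) * 10) / 10;
}
```

For example, 3 of 4 detected columns resolved gives 75, and 1 of 3 gives 33.3.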

Flags

  • hasHallucination
  • schemaViolation

Status

| Status | Meaning |
|--------|---------|
| PASS | Clean run |
| WARN | Acceptable but degraded |
| FAIL | Schema or hallucination failure |
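Read together with the flags, the status table suggests a mapping like the sketch below. The FAIL rule follows the table; what counts as "degraded" is not defined in this document, so the hypothetical `deriveStatus` function takes it as a caller-supplied boolean instead of guessing a threshold.

```typescript
type FixtureStatus = "PASS" | "WARN" | "FAIL";

interface RunFlags {
  hasHallucination: boolean;
  schemaViolation: boolean;
}

// FAIL on a schema or hallucination failure, WARN when the caller judges the
// run degraded, PASS otherwise. The `degraded` criterion is an assumption
// left to the caller because the document does not define it.
function deriveStatus(flags: RunFlags, degraded: boolean): FixtureStatus {
  if (flags.schemaViolation || flags.hasHallucination) return "FAIL";
  return degraded ? "WARN" : "PASS";
}
```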

8. The ZARA Fixtures Dashboard

The dashboard lives in:

docusaurus/src/components/ZaraFixturesDashboard.tsx

It visualizes:

Left column (66.66%)

  • Quality & Resolution line chart
  • Clarifications bar chart
  • Shared time‑range focus (Brush + slider)

Right columns (16.67% each)

  • Schema Validity heatmap (year view)
  • Hallucinations heatmap (year view)

Bottom

  • Latest snapshot (monospace, system‑style)

All charts are derived directly from history.json.
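As an illustration of how a year-view heatmap might be fed from history data, the sketch below buckets runs by UTC day and computes the share of schema-valid runs per day. The `DashboardRun` shape, the `timestamp` field, and the function name are assumptions; the real component may shape its data differently.

```typescript
// Hypothetical minimal run shape for the heatmap; real records carry more fields.
interface DashboardRun {
  timestamp: string; // assumed ISO-8601 string
  schemaValid: boolean;
}

// Maps each UTC day (YYYY-MM-DD) to the fraction of runs that were schema-valid.
// Note: an unparseable timestamp makes toISOString() throw a RangeError.
function dailySchemaValidity(runs: DashboardRun[]): Record<string, number> {
  const totals: Record<string, { valid: number; total: number }> = {};
  for (const run of runs) {
    const day = new Date(run.timestamp).toISOString().slice(0, 10);
    const bucket = totals[day] ?? { valid: 0, total: 0 };
    bucket.total += 1;
    if (run.schemaValid) bucket.valid += 1;
    totals[day] = bucket;
  }
  const fractions: Record<string, number> = {};
  for (const day of Object.keys(totals)) {
    const bucket = totals[day];
    if (bucket) fractions[day] = bucket.valid / bucket.total;
  }
  return fractions;
}
```

The same bucketing would work for the hallucination heatmap by counting runs with the hasHallucination flag instead.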

9. Adding a new fixture (checklist)

  1. Create a new folder fixtures/NN-name/ (two-digit prefix, as in 01-table-contract)
  2. Add input.json
  3. Add excerpt.mdx
  4. Run npm run zara:fixtures
  5. Verify dashboard renders
  6. Commit fixture folder
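A naming check for new fixture folders might look like the sketch below. The pattern is inferred from the existing folder names (two digits, then lowercase hyphenated words), and the rule that the folder name equals the `fixtureId` in input.json is an assumption based on the single example in this document; `checkFixtureFolder` is a hypothetical helper.

```typescript
// Inferred from names like 01-table-contract; the real convention may be looser.
const FIXTURE_DIR_PATTERN = /^\d{2}-[a-z0-9]+(-[a-z0-9]+)*$/;

// Returns a list of naming problems; an empty list means the names look consistent.
function checkFixtureFolder(folderName: string, fixtureId: string): string[] {
  const problems: string[] = [];
  if (!FIXTURE_DIR_PATTERN.test(folderName)) {
    problems.push(`folder "${folderName}" does not match NN-name`);
  }
  // Assumption: fixtureId mirrors the folder name, as in the 01-table-contract example.
  if (folderName !== fixtureId) {
    problems.push(`fixtureId "${fixtureId}" differs from folder "${folderName}"`);
  }
  return problems;
}
```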

⚠️ Never modify history.ndjson by hand.

10. Common failure modes

| Problem | Cause |
|---------|-------|
| Unexpected token # | Markdown leaked into JSON output |
| Schema violation | Output too long / missing field |
| Hallucination | ZARA referenced unseen concept |
| Dashboard empty | history.json missing or invalid |

All failures are signals, not bugs.
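The "Unexpected token #" case is what `JSON.parse` reports when the model output starts with Markdown. A runner could classify that case explicitly before parsing, as in this sketch; `parseModelOutput`, the result shape, and the set of leading characters treated as Markdown are assumptions for illustration.

```typescript
type ParseResult =
  | { ok: true; value: unknown }
  | { ok: false; reason: "markdown-leak" | "invalid-json"; message: string };

// Classifies raw model output before handing it to schema validation.
function parseModelOutput(raw: string): ParseResult {
  const text = raw.trim();
  // A leading heading marker, code fence, bullet star, or blockquote suggests
  // Markdown. "-" is not checked because it can start a negative JSON number.
  if (/^(#|`{3}|\*|>)/.test(text)) {
    return {
      ok: false,
      reason: "markdown-leak",
      message: `output starts with ${JSON.stringify(text.slice(0, 20))}`,
    };
  }
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    return { ok: false, reason: "invalid-json", message: (err as Error).message };
  }
}
```

Recording the reason alongside the run keeps the failure usable as a signal in the history instead of surfacing only as a parser exception.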

11. Design philosophy (important)

  • Fixtures are observational, not assertive
  • We measure behavior, not exact text
  • ZARA is allowed to evolve
  • Dashboards show trends, not pass/fail gates

This makes the system robust, future‑proof, and model‑agnostic.

12. When to add expected.json (rare)

Only add expected.json if:

  • You need regression locking for a critical behavior
  • You accept higher maintenance cost

By default: do not use expected.json.



