
Transferring Text to MDX

0. IMPORTANT: Update the frontmatter of all engines!

Add the following to the frontmatter (example data):

---
id: AAE
title: AAE Engine
description: >-
Autonomous validation, consistency testing, and issuance of digital assurance
proofs.
slug: /micro-engines/aae-engine
hub: computation-hub-calcs
omr_module_ids:
- computation-hub
mice:
meid: MEID_ASRE_AAE
category: VALI
domain: assurance
input_types:
- json
- registry
- api_ref
supported_modes:
- validation
- signal
- scoring
api:
route: /api/mice/MEID_CALC02
method: POST
metric_types_supported:
- assurance.proof.hash_parity
- assurance.governance.approval_validity
- assurance.temporal.continuity
- assurance.trust.anomaly
- assurance.statement
source_map_ref: EngineAliasMap
source_file: /computation-hub-calcs/micro-engines/aae-engine.mdx
sidebar_label: Autonomous Assurance Engine
sidebar_position: 28
doc_type: spec
status: review
legacy_manual_ref: 107.27.
version: 0.1.0
owners:
- cto@viroway.com
last_updated: 2025-12-06T00:00:00.000Z
audience:
- internal-cto
- internal-engineering-core
- internal-management
- external-dev-core
allowed_users:
- pedersen@viroway.com
tags:
- mice
- assurance
- validation
- trust
- governance
- meta-signal
- tier-0
security_level: critical
classification:
- Assurance-Engine
jira:
epic: ZYZ-474
---

Empty (for easier copying):

---
id:
title:
description: >-
...
slug:
hub: ...
omr_module_ids:
- ...
mice: # MUST BE UPDATED FOR EACH MICE!!!!!!!!!!!!
meid: ...
category: CALC
# CALC | VALI | TAGG | AGGR | TRANS | SCEN | SCORE | LINK | ALERT | META

# active | experimental | deprecated | retired

domain: emissions
# emissions | energy | climate_risk | biodiversity | water | pollution
# materials | supply_chain | products | finance | taxonomy
# social | governance | compliance | assurance | meta

input_types:
- json
# json | csv | xlsx | xml | api_ref | registry

supported_modes:
- calculation
# calculation | validation | classification | aggregation
# projection | scoring | signal

api:
route: /api/mice/MEID_CALC02
# E.g. "/api/mice/MEID_CALC02_v1"
method: POST
# POST | GET | PUT | PATCH

metric_types_supported:
- ghg.scope3.cat6.business_travel
# E.g. "ghg.scope3.cat6.business_travel"
source_map_ref: # optional: ties to the “Engine Alias Map”
source_file: /.../....mdx
sidebar_label: ...
sidebar_position:
doc_type:
status:
legacy_manual_ref: N/A
version: 0.1.0
owners:
- cto@viroway.com
last_updated: 2026-xx-xxT00:00:00.000Z
audience:
- internal-cto
-
-
-
allowed_users:
- pedersen@viroway.com
tags:
-
-
-
-
-
-
-
security_level:
classification:
-
jira:
epic:
---
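
A filled-in frontmatter can be sanity-checked before committing. The sketch below is not part of the existing tooling: `frontmatter` and `missing_keys` are hypothetical helper names, and which keys count as required is left to the caller. It only understands top-level scalar keys (nested blocks such as `mice:` or list keys such as `owners:` would need a real YAML parser) and treats an empty value or the `...` placeholder from the template as missing.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical, not part of the docs tooling.

# Print the frontmatter, i.e. the lines between the first two '---' lines.
frontmatter() {
  awk '/^---[[:space:]]*$/ { n++; next } n == 1 { print } n >= 2 { exit }' "$1"
}

# Usage: missing_keys FILE KEY...
# Prints every KEY that is absent, empty, or still set to the "..." placeholder.
# Returns 1 if at least one key is missing, 0 otherwise.
missing_keys() {
  local file="$1" key value status=0
  shift
  for key in "$@"; do
    # Take the inline value of the first top-level "key:" line, if any
    value="$(frontmatter "$file" | sed -n "s/^${key}:[[:space:]]*//p" | head -n 1)"
    value="${value%%#*}"                                  # drop a trailing comment
    value="$(printf '%s' "$value" | tr -d '[:space:]')"   # drop whitespace
    if [ -z "$value" ] || [ "$value" = "..." ]; then
      echo "$key"
      status=1
    fi
  done
  return "$status"
}
```

For example, `missing_keys content/computation-hub-calcs/micro-engines/aae-engine.mdx id title slug hub status version` would print nothing and return 0 once all six keys are filled in.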

Examples of usage:

Engine Metadata (status: review)

  • ID: AGGR
  • Code: MICE_AGGR
  • Type: micro_engine
  • Name: Aggregation Engines
  • Version: 0.1.0
Operational Metadata
{
  "id": "AGGR",
  "engine_kind": "micro_engine",
  "title": "Aggregation Engines",
  "name": "Aggregation Engines",
  "code": "MICE_AGGR",
  "hub": "computation-hub-calcs",
  "owner": "computation-hub-calcs",
  "omr_module_ids": [
    "TBD"
  ],
  "status": "review",
  "version": "0.1.0",
  "tags": [
    "mice",
    "aggr"
  ],
  "audience": [
    "internal-cto",
    "internal-engineering-core",
    "internal-management",
    "external-dev-core"
  ],
  "slug": "/mice-types/aggregation-engines",
  "source_file": "/computation-hub-calcs/micro-engines/mice-types/aggregation-engines.mdx",
  "lifecycle": {
    "status": "review",
    "semver": "0.1.0"
  }
}
Graph View

AGGR depends on:

No related engines declared.


See also the Engines Registry.

1. Do hybrid batch work

👉 Migrate content in batches of ~5–8 chapters, and immediately do a “linking & grounding pass” for that batch before moving on.

That gives you:

  • Speed (batching)
  • Accuracy (context still fresh)
  • Structural integrity (foreign keys, CMIDs, signals don’t drift)

Avoid both extremes:

  • ❌ Linking everything immediately (too slow)
  • ❌ Migrating everything first (you’ll forget intent)

Why This Matters in the System (Not Generic Docs)

The manual is not prose-heavy documentation. It’s a knowledge graph with:

  • CMIDs
  • SSSR signals
  • Tables
  • Engines
  • Policies
  • Evidence chains
  • Search + Ask ZARA as consumers

That means linking is not cosmetic—it defines semantics.

If linking is deferred too long:

  • You’ll lose why a table existed
  • CMIDs will drift from intent
  • You’ll re-read chapters multiple times
  • You’ll “paper over” missing structure instead of fixing it cleanly

2. The Optimal Workflow (Proven on Complex Manuals)

Phase 1 — Content Migration (Fast, Clean)

For each chapter:

  • Move text over
  • Light cleanup only (headings, lists, clarity)
  • DO NOT over-link
  • Add placeholders, not links

Example placeholders:

<!-- ZAYAZ-TODO: STRUCTURE -->

<!-- ZAYAZ-TODO: LINK-SSSR -->

<!-- ZAYAZ-TODO: POLICY -->

This keeps momentum high.

Phase 2 — Immediate Grounding Pass (Critical)

After 5–8 chapters, stop and do a grounding pass. For each chapter:

  • Identify:
      • CMIDs referenced
      • Tables implied
      • Signals used
      • Engines involved
  • Add:
      • <TableSignals />
      • <AuditEvidenceLink />
      • Cross-links
      • Foreign-key references

Do this right after the batch, because:

  • You still remember why things were written
  • You can normalize patterns across chapters
  • You catch inconsistencies early

Phase 3 — Structural Normalization (Light, Fast)

After grounding the batch:

  • Align naming
  • Ensure consistent CMID usage
  • Check signal/table reuse
  • Update registries if needed

This is where the system starts to self-reinforce.

3. A Concrete Rule of Thumb

Use this heuristic:

If you can still explain the chapter to someone else without re-reading it, you can still link it correctly.

Once you can’t, linking quality drops sharply.

That usually happens after ~8 chapters.

4. Practical Guardrails (Very Important)

4.1. Never leave a chapter “structurally ambiguous”

Even if links aren’t added yet, leave markers:

<!-- ZAYAZ-TODO: STRUCTURE -->

You can grep for these later. (See section 5.)

4.2. Treat tables as first-class citizens

If a chapter uses data:

  • Either link the table now
  • Or explicitly mark:

<!-- ZAYAZ-TODO: DATA-SOURCE -->

Silence is dangerous.

4.3. CMIDs: batch-resolve, don’t guess

When migrating:

  • Write the concept in text
  • Resolve to CMID during grounding pass

This avoids premature CMID fragmentation.

4.4. Why This Is the Right Pace

We’ve already built:

  • Registries
  • Search
  • Evidence links
  • Live graphs
  • Asset linking
  • Audience gating

That means structure work compounds.

Doing it in disciplined batches lets:

  • Search improve incrementally
  • Ask ZARA get smarter as we go
  • Errors surface early, not at the end

5. Standardized Markers

To manage deferred structure work, we use specific markers in the text. Treat these comments as first-class TODO markers so that they can be reliably surfaced later.

HTML comments like:

| Marker | Meaning |
| --- | --- |
| `<!-- ZAYAZ-TODO: STRUCTURE -->` | Needs structural review |
| `<!-- ZAYAZ-TODO: ADD-CMID -->` | Metric not yet canonical |
| `<!-- ZAYAZ-TODO: ADD-FK -->` | Foreign key / table relation missing |
| `<!-- ZAYAZ-TODO: LINK-SSSR -->` | Signal registry link missing |
| `<!-- ZAYAZ-TODO: DATA-SOURCE -->` | Underlying data not identified |
| `<!-- ZAYAZ-TODO: EVIDENCE -->` | Audit evidence path missing |
| `<!-- ZAYAZ-TODO: POLICY -->` | Connect/link to Policy |

are:

  • ✅ Ignored by Docusaurus rendering
  • ✅ Preserved in MDX
  • ✅ Searchable via scripts / grep
  • ✅ Non-destructive (won’t break builds)
  • ✅ Version-controllable

Optionally add a message:

`<!-- ZAYAZ-TODO: ADD-FK | Link dim_countries.iso_cc_id -> fact_emissions.country_id -->`
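
To show how the tag and the optional message can be pulled apart again, here is a minimal sketch. `list_todos` is a hypothetical helper, not an existing script; it assumes one marker per line, a ` | ` separator between tag and message as in the example above, and file paths without a colon.

```shell
#!/bin/bash
# Sketch only: list_todos is a hypothetical helper name.

# Usage: list_todos DIR
# Prints one "FILE<TAB>TAG<TAB>MESSAGE" line per marker; MESSAGE may be empty.
list_todos() {
  grep -rH '<!-- ZAYAZ-TODO: ' "$1" | sort | awk '
    {
      prefix = "<!-- ZAYAZ-TODO: "
      file = substr($0, 1, index($0, ":") - 1)          # grep prints FILE:LINE
      body = substr($0, index($0, prefix) + length(prefix))
      stop = index(body, " -->")
      if (stop == 0) next                               # unterminated marker: skip
      body = substr(body, 1, stop - 1)
      bar = index(body, " | ")
      if (bar > 0) { tag = substr(body, 1, bar - 1); msg = substr(body, bar + 3) }
      else         { tag = body; msg = "" }
      printf "%s\t%s\t%s\n", file, tag, msg
    }'
}
```

Running `list_todos content/` produces tab-separated rows that can be pasted into a spreadsheet or fed to an issue-creation script.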

5.1. How to Get a List Later (Multiple Options)

Option 1 — Simple grep (fastest)

From repo root:

grep -R "<!-- ZAYAZ-TODO: STRUCTURE" content/

Or all structural markers at once:

grep -R "<!-- ZAYAZ-TODO: " content/ | grep -E "STRUCTURE|ADD-CMID|ADD-FK|LINK-SSSR|DATA-SOURCE|EVIDENCE"

Example output:

content/computation-hub-calcs/rif/risk-calibration.mdx:<!-- ZAYAZ-TODO: ADD-CMID -->
content/policies/env/climate-change.mdx:<!-- ZAYAZ-TODO: ADD-FK -->

Clean extract:

grep -R "<!-- ZAYAZ-TODO:" content/

This is future-proof for:

  • Building a report
  • Turning them into GitHub issues
  • Showing a “documentation completeness dashboard”
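
As a starting point for such a report, the sketch below counts markers per tag. `todo_report` is a hypothetical name, and the tag pattern (uppercase letters and hyphens) is an assumption based on the markers listed in section 5.

```shell
#!/bin/bash
# Sketch only: todo_report is a hypothetical helper name.

# Usage: todo_report DIR
# Prints "COUNT TAG" lines, most frequent tag first.
todo_report() {
  grep -rhoE '<!-- ZAYAZ-TODO: [A-Z-]+' "$1" \
    | sed 's/^<!-- ZAYAZ-TODO: //' \
    | sort \
    | uniq -c \
    | sort -k1,1nr -k2,2 \
    | awk '{ print $1, $2 }'
}
```

Calling `todo_report content/` from the repo root gives a quick completeness overview without any additional tooling.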

6. Search, Ask ZARA and ToDo — How to Use

6.1. Search

Use the search field to find relevant ZAYAZ documentation using semantic search.

You can:

  • Search by concepts, not just exact words (e.g. carbon passport lifecycle)
  • Filter by section context automatically (engine, module, appendix)
  • Search for work-in-progress items using TODO markers:
      • todo → all documents with open TODOs
      • todo:STRUCTURE → only structure-related TODOs
      • todo:STRUCTURE,ADD_FK → documents matching all listed tags

Search results return the most relevant sections, not entire documents.

6.2. Ask ZARA

Ask ZARA is an AI assistant grounded only in the indexed ZAYAZ documentation.

You can:

  • Ask technical or conceptual questions
    “How does FIRM model transition risk?”
  • Ask about incomplete areas
    “What is missing in the ESG signal registry?”
  • Ask about TODOs and gaps
    “What STRUCTURE TODOs exist in the AI Intelligence Layer?”

ZARA will:

  • Cite relevant document sections when possible
  • Explicitly say when information is missing or incomplete
  • Never invent undocumented behavior

| User query | Result |
| --- | --- |
| todo | All docs with any TODO |
| todo:any | All docs with any TODO |
| todo:STRUCTURE | Only docs with `<!-- ZAYAZ-TODO: STRUCTURE -->` |
| todo:STRUCTURE,ADD_FK | Docs that contain both tags |
| todo + normal words | (Intentionally not allowed; keeps syntax strict) |
| Normal question | Falls back to semantic search |

6.3. Best Practice

  • Use Search to explore and scan
  • Use Ask ZARA to synthesize, explain, or identify gaps
  • Use <!-- ZAYAZ-TODO: TAG --> markers in docs to surface future work automatically

6.4. Search Syntax (Operators)

| Operator | Example | Meaning |
| --- | --- | --- |
| todo | todo | Return any doc section that contains at least one ZAYAZ-TODO marker |
| todo:TAG | todo:STRUCTURE | Only docs that have the TODO tag STRUCTURE |
| todo:TAG1,TAG2 | todo:STRUCTURE,ADD_FK | Docs that contain all listed TODO tags |
| CMID lookup | CMID-ZARA-00001 | Finds pages/sections where the CMID was indexed |
| Slug scope | filterSlugPrefix="/ai-intelligence-layer" | API filter to restrict search to a subtree (UI can expose this later) |
| Type scope | filterSourceType="mdx" | API filter to limit to MDX vs JSON vs Excel docs |
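
When the search index is not available, the "all listed tags" behaviour of `todo:TAG1,TAG2` can be approximated locally. This is only a sketch of the matching rule described above, not the search implementation; `files_with_all_tags` is a hypothetical name, and tags must be spelled exactly as they appear in the file markers (for example `ADD-FK`).

```shell
#!/bin/bash
# Sketch only: local approximation, not the actual search backend.

# Usage: files_with_all_tags DIR TAG...
# Prints the files under DIR that contain a ZAYAZ-TODO marker for every TAG.
files_with_all_tags() {
  local dir="$1" tag file ok
  shift
  grep -rl '<!-- ZAYAZ-TODO: ' "$dir" | sort | while IFS= read -r file; do
    ok=1
    for tag in "$@"; do
      # The tag must end here, so that ADD does not match ADD-FK
      grep -qE "<!-- ZAYAZ-TODO: ${tag}([^A-Z_-]|\$)" "$file" || { ok=0; break; }
    done
    [ "$ok" -eq 1 ] && printf '%s\n' "$file"
  done
  return 0
}
```

For example, `files_with_all_tags content/ STRUCTURE ADD-FK` lists only the files that carry both markers.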

Ensure there is a corresponding associated-files directory for all MDX files

For every MDX file under:

/workspaces/zayaz-docs/content/

we want to ensure there is a corresponding directory under:

/workspaces/zayaz-docs/code/associated-files/

with:

  • the same relative path
  • same filename without .mdx
  • containing a .gitkeep

Path mapping rule (canonical)

content/computation-hub-calcs/micro-engines/ewc-hazard-classification.mdx

maps to:

code/associated-files/computation-hub-calcs/micro-engines/ewc-hazard-classification/
└── .gitkeep

Note: If you forget to add the folder, use the following script to create the missing folder(s) automatically.

Run this from:

/workspaces/zayaz-docs

Script

#!/bin/bash
set -euo pipefail

CONTENT_ROOT="content"
ASSOCIATED_ROOT="code/associated-files"

# Top-level content directories that are not mirrored
EXCLUDE_DIRS=(
  "system"
  "system-info"
  "templates"
  "under-development"
)

# Build find exclusion arguments as an array (no eval, safe with spaces)
EXCLUDES=()
for d in "${EXCLUDE_DIRS[@]}"; do
  EXCLUDES+=( -path "./$CONTENT_ROOT/$d" -prune -o )
done

# Find all .mdx files, respecting exclusions
find "./$CONTENT_ROOT" "${EXCLUDES[@]}" -type f -name "*.mdx" -print |
while IFS= read -r mdx; do
  # Remove leading ./content/
  rel_path="${mdx#./$CONTENT_ROOT/}"

  # Remove .mdx extension
  base_path="${rel_path%.mdx}"

  # Target directory
  target_dir="./$ASSOCIATED_ROOT/$base_path"

  # Create directory (and its .gitkeep) only if it does not exist
  if [ ! -d "$target_dir" ]; then
    echo "Creating: $target_dir"
    mkdir -p "$target_dir"
    touch "$target_dir/.gitkeep"
  fi
done

What this script guarantees

  • Existing directories are untouched
  • Existing .gitkeep files are untouched
  • Only missing directories are created
  • Mirrors the content structure exactly
  • Safe for Git history
  • Safe for CI
  • Safe for repeated runs
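
For CI, a read-only variant can flag missing directories instead of creating them. The following is a sketch under the same path mapping rule; `missing_associated_dirs` is a hypothetical name, and the excluded top-level directories are copied from the script above.

```shell
#!/bin/bash
# Sketch only: a check-only counterpart to the creation script.

# Usage: missing_associated_dirs CONTENT_ROOT ASSOCIATED_ROOT
# Prints each expected directory that does not exist.
# Returns 1 if any directory is missing, 0 otherwise.
missing_associated_dirs() {
  local content_root="${1%/}" associated_root="${2%/}" missing
  missing="$(find "$content_root" -type f -name '*.mdx' | sort | while IFS= read -r mdx; do
    rel_path="${mdx#"$content_root"/}"
    # Same exclusions as the creation script
    case "$rel_path" in
      system/*|system-info/*|templates/*|under-development/*) continue ;;
    esac
    target_dir="$associated_root/${rel_path%.mdx}"
    [ -d "$target_dir" ] || printf '%s\n' "$target_dir"
  done)"
  [ -z "$missing" ] && return 0
  printf '%s\n' "$missing"
  return 1
}
```

A CI step could then run `missing_associated_dirs content code/associated-files` from /workspaces/zayaz-docs and fail the build on a non-zero exit code.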

Insert the Tagging Info in Frontmatters of Engines:

description: "..." # REQUIRED: one sentence, decision-grade, what it does

engine: # REQUIRED for all engines (non-MICE)
eid: ... # OPTIONAL (only if the engine is executable/callable)
meid: ... # REQUIRED: ENGINE category (see list below)
category: ... # REQUIRED: ENGINE category (see list below)

domain: ... # REQUIRED (same domain vocabulary as MICE)

input_types: # OPTIONAL but recommended
- json # json | csv | xlsx | xml | api_ref | registry

supported_modes: # REQUIRED: what the engine does
- routing # examples below

api: # OPTIONAL but recommended if callable
route: /api/engine/...
method: POST # POST | GET | PUT | PATCH

capabilities: # OPTIONAL: helps architecture + diagrams
provides:
- ...
consumes:
- ...

emits_signals: # OPTIONAL: which USO/CSI families it emits
uso:
- ...
csi:
- ...

source_map_ref: EngineAliasMap # OPTIONAL (same as MICE)

Engine category (fixed enum)

Use one of these so we don’t invent new categories per page:

  • REGISTRY (registries, canonical stores, lookups)
  • ROUTING (dispatch, orchestration, routing, resolution)
  • VALIDATION (validation, integrity, QA, consistency checks)
  • GOVERNANCE (RBAC, policy enforcement, approvals)
  • AUDIT (ledger, evidence, lineage, replay, tamper detection)
  • TRANSLATION (localization, translation, mapping)
  • EXPORT (XBRL/iXBRL packaging, converters)
  • AGENT_RUNTIME (agent execution/supervision runtime)
  • SIMULATION (Monte Carlo, Bayesian scenario engines)
  • COMPUTE (general compute engines that aren’t MICE)
  • INTEGRATION (connectors, third-party services)

Engine supported_modes (examples)

Pick what fits; multiple options can be used:

  • routing
  • orchestration
  • registry
  • validation
  • audit
  • authorization
  • identity
  • translation
  • export
  • simulation
  • scoring
  • integration

EID:

  • EID_SYS_* for SIS/platform engines (SSSR, RBAC, ALTD)
  • EID_MOD_<MOD>_* for module engines (e.g., EID_MOD_ZARA_TRPG)
  • EID_EXT_* for third-party/connector engines


