Transferring Text to MDX

0. IMPORTANT: Update the frontmatter of all engines!

Add the following to the frontmatter (example data):

---
id: AAE
title: AAE Engine
description: >-
  Autonomous validation, consistency testing, and issuance of digital assurance
  proofs.
slug: /micro-engines/aae-engine
hub: computation-hub-calcs
omr_module_ids:
  - computation-hub
mice:
  meid: MEID_ASRE_AAE
  category: VALI
  domain: assurance
  input_types:
    - json
    - registry
    - api_ref
  supported_modes:
    - validation
    - signal
    - scoring
  api:
    route: /api/mice/MEID_CALC02
    method: POST
  metric_types_supported:
    - assurance.proof.hash_parity
    - assurance.governance.approval_validity
    - assurance.temporal.continuity
    - assurance.trust.anomaly
    - assurance.statement
  source_map_ref: EngineAliasMap
source_file: /computation-hub-calcs/micro-engines/aae-engine.mdx
sidebar_label: Autonomous Assurance Engine
sidebar_position: 28
doc_type: spec
status: review
legacy_manual_ref: 107.27.
version: 0.1.0
owners:
  - cto@viroway.com
last_updated: 2025-12-06T00:00:00.000Z
audience:
  - internal-cto
  - internal-engineering-core
  - internal-management
  - external-dev-core
allowed_users:
  - pedersen@viroway.com
tags:
  - mice
  - assurance
  - validation
  - trust
  - governance
  - meta-signal
  - tier-0
security_level: critical
classification:
  - Assurance-Engine
jira:
  epic: ZYZ-474
---

Empty (for easier copying):

---
id: 
title: 
description: >-
  ...
slug: 
hub: ...
omr_module_ids:
  - ...
mice: # MUST BE UPDATED FOR EACH MICE!!!!!!!!!!!!
  meid: ...
  category: CALC
  # CALC | VALI | TAGG | AGGR | TRANS | SCEN | SCORE | LINK | ALERT | META

  # active | experimental | deprecated | retired

  domain: emissions
  # emissions | energy | climate_risk | biodiversity | water | pollution
  # materials | supply_chain | products | finance | taxonomy
  # social | governance | compliance | assurance | meta

  input_types:
    - json
  # json | csv | xlsx | xml | api_ref | registry

  supported_modes:
    - calculation
  # calculation | validation | classification | aggregation
  # projection | scoring | signal

  api:
    route: /api/mice/MEID_CALC02
    # E.g. "/api/mice/MEID_CALC02_v1"
    method: POST
    # POST | GET | PUT | PATCH

  metric_types_supported:
    - ghg.scope3.cat6.business_travel
    # E.g. "ghg.scope3.cat6.business_travel"
  source_map_ref:    # optional: ties to the “Engine Alias Map”
source_file: /.../....mdx
sidebar_label: ...
sidebar_position: 
doc_type: 
status: 
legacy_manual_ref: N/A
version: 0.1.0
owners:
  - cto@viroway.com
last_updated: 2026-xx-xxT00:00:00.000Z
audience:
  - internal-cto
  - 
  - 
  - 
allowed_users:
  - pedersen@viroway.com
tags:
  - 
  - 
  - 
  - 
  - 
  - 
  - 
security_level: 
classification:
  - 
jira:
  epic:
---
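
To verify that a batch of micro-engine pages actually carries the `mice` block, a check along the following lines may help. It is a minimal sketch, assuming the frontmatter is the first `---`-delimited block of the file and that `meid:` sits two spaces under `mice:`. The function names `has_mice_meid` and `report_missing_meid` are hypothetical, not existing tooling.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the file's frontmatter has a `meid:` under
# `mice:` whose value is neither empty nor the "..." placeholder.
has_mice_meid() {
  awk '
    NR == 1 && $0 != "---" { exit }          # no frontmatter at all
    NR > 1 && $0 == "---"  { exit }          # end of frontmatter
    /^mice:/               { in_mice = 1; next }
    /^[^ #]/               { in_mice = 0 }   # any other top-level key closes the block
    in_mice && /^  meid:[ ]*[^ .]/ { found = 1; exit }
    END { exit (found ? 0 : 1) }
  ' "$1"
}

# Hypothetical helper: report every .mdx file under a directory that fails the check.
report_missing_meid() {
  local root="${1:-content}" f
  find "$root" -type f -name '*.mdx' -print0 |
  while IFS= read -r -d '' f; do
    has_mice_meid "$f" || echo "missing mice.meid: $f"
  done
}
```

Point `report_missing_meid` at the directory that holds the micro-engine pages; pages that are not micro-engines would also be reported, since the sketch cannot tell them apart.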

Examples of usage:

Engine Metadata

Status: review
ID: AGGR
Code: MICE_AGGR
Type: micro_engine
Name: Aggregation Engines
Version: 0.1.0

Operational Metadata

{
  "id": "AGGR",
  "engine_kind": "micro_engine",
  "title": "Aggregation Engines",
  "name": "Aggregation Engines",
  "code": "MICE_AGGR",
  "hub": "computation-hub-calcs",
  "owner": "computation-hub-calcs",
  "omr_module_ids": [
    "TBD"
  ],
  "status": "review",
  "version": "0.1.0",
  "tags": [
    "mice",
    "aggr"
  ],
  "audience": [
    "internal-cto",
    "internal-engineering-core",
    "internal-management",
    "external-dev-core"
  ],
  "slug": "/mice-types/aggregation-engines",
  "source_file": "/computation-hub-calcs/micro-engines/mice-types/aggregation-engines.mdx",
  "lifecycle": {
    "status": "review",
    "semver": "0.1.0"
  }
}

Graph View

AGGR depends on:

No related engines declared.

See also the Engines Registry.

1. Do hybrid batch work

👉 Migrate content in batches of ~5–8 chapters, and immediately do a “linking & grounding pass” for that batch before moving on.

That gives you:

Speed (batching)
Accuracy (context still fresh)
Structural integrity (foreign keys, CMIDs, signals don’t drift)

Avoid both extremes:

❌ Linking everything immediately (too slow)
❌ Migrating everything first (you’ll forget intent)

Why This Matters in the System (not generic docs)

The manual is not prose-heavy documentation. It’s a knowledge graph with:

CMIDs
SSSR signals
Tables
Engines
Policies
Evidence chains
Search + Ask ZARA as consumers

That means linking is not cosmetic—it defines semantics.

If linking is deferred too long:

You’ll lose why a table existed
CMIDs will drift from intent
You’ll re-read chapters multiple times
You’ll “paper over” missing structure instead of fixing it cleanly

2. The Optimal Workflow (Proven on Complex Manuals)

Phase 1 — Content Migration (Fast, Clean)

For each chapter:

Move text over
Light cleanup only (headings, lists, clarity)
DO NOT over-link
Add placeholders, not links

Example placeholders:
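
A sketch of what such placeholders might look like, assuming the marker vocabulary standardized in section 5. The sentence is illustrative and not taken from a real chapter:

```mdx
Emissions are aggregated per reporting entity and country.
<!-- ZAYAZ-TODO: ADD-CMID -->
<!-- ZAYAZ-TODO: ADD-FK | Link dim_countries.iso_cc_id -> fact_emissions.country_id -->
```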

This keeps momentum high.

Phase 2 — Immediate Grounding Pass (Critical)

After 5–8 chapters, stop and do a grounding pass. For each chapter:

- Identify:
  - CMIDs referenced
  - Tables implied
  - Signals used
  - Engines involved
- Add:
  - `<TableSignals />`
  - `<AuditEvidenceLink />`
  - Cross-links
  - Foreign-key references

Because:

You still remember why things were written
You can normalize patterns across chapters
You catch inconsistencies early

Phase 3 — Structural Normalization (Light, Fast)

After grounding the batch:

Align naming
Ensure consistent CMID usage
Check signal/table reuse
Update registries if needed

This is where the system starts to self-reinforce.

3. A Concrete Rule of Thumb

Use this heuristic:

If you can still explain the chapter to someone else without re-reading it, you can still link it correctly.

Once you can’t, linking quality drops sharply.

That usually happens after ~8 chapters.

4. Practical Guardrails (Very Important)

4.1. Never leave a chapter “structurally ambiguous”

Even if links aren’t added yet, leave markers:
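
For instance, a chapter whose structure is still undecided could carry the section 5 marker below. The message after `|` is an illustrative assumption:

```mdx
<!-- ZAYAZ-TODO: STRUCTURE | Decide whether this chapter needs its own table -->
```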

You can later grep for these. (See section 5.)

4.2. Treat tables as first-class citizens

If a chapter uses data:

Either link the table now
Or explicitly mark:
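
Assuming the markers from section 5, an explicit mark for a table that is used but not yet linked might be:

```mdx
<!-- ZAYAZ-TODO: DATA-SOURCE -->
<!-- ZAYAZ-TODO: ADD-FK -->
```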

Silence is dangerous.

4.3. CMIDs: batch-resolve, don’t guess

When migrating:

Write the concept in text
Resolve to CMID during grounding pass

This avoids premature CMID fragmentation.

4.4. Why This Is the Right Pace

We’ve already built:

Registries
Search
Evidence links
Live graphs
Asset linking
Audience gating

That means structure work compounds.

Doing it in disciplined batches lets:

Search improve incrementally
Ask ZARA get smarter as we go
Errors surface early, not at the end

5. Standardized Markers

To manage deferred structure work we use specific markers in the text. Treat these comments as first-class TODO markers that can be reliably surfaced later.

The markers are HTML comments:

| Marker | Meaning |
| --- | --- |
| `<!-- ZAYAZ-TODO: STRUCTURE -->` | Needs structural review |
| `<!-- ZAYAZ-TODO: ADD-CMID -->` | Metric not yet canonical |
| `<!-- ZAYAZ-TODO: ADD-FK -->` | Foreign key / table relation missing |
| `<!-- ZAYAZ-TODO: LINK-SSSR -->` | Signal registry link missing |
| `<!-- ZAYAZ-TODO: DATA-SOURCE -->` | Underlying data not identified |
| `<!-- ZAYAZ-TODO: EVIDENCE -->` | Audit evidence path missing |
| `<!-- ZAYAZ-TODO: POLICY -->` | Policy link missing |

These comments are:

✅ Ignored by Docusaurus rendering
✅ Preserved in MDX
✅ Searchable via scripts / grep
✅ Non-destructive (won’t break builds)
✅ Version-controllable

Optionally add a message:

`<!-- ZAYAZ-TODO: ADD-FK | Link dim_countries.iso_cc_id -> fact_emissions.country_id -->`

5.1. How to Get a List Later (Multiple Options)

Option 1: Simple grep (fastest)

From repo root:

grep -R "<!-- ZAYAZ-TODO: STRUCTURE" content/

Or all structural markers at once:

grep -RE "<!-- ZAYAZ-TODO: (STRUCTURE|ADD-CMID|ADD-FK|LINK-SSSR|DATA-SOURCE|EVIDENCE|POLICY)" content/

Example output:

content/computation-hub-calcs/rif/risk-calibration.mdx:<!-- ZAYAZ-TODO: ADD-CMID -->
content/policies/env/climate-change.mdx:<!-- ZAYAZ-TODO: ADD-FK -->

Clean extract:

grep -R "<!-- ZAYAZ-TODO:" content/

This is future-proof for:

Building a report
Turning them into GitHub issues
Showing a “documentation completeness dashboard”
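
As a starting point for such a report, the markers can be counted per tag. The following is a minimal sketch, assuming every marker follows the `<!-- ZAYAZ-TODO: TAG` form from section 5 and lives in `.mdx` files. The function name `todo_summary` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: count ZAYAZ-TODO markers per tag under a directory,
# most frequent tag first.
todo_summary() {
  local root="${1:-content}"
  # -h drops file names, -o keeps only the matched marker prefix
  grep -Rho --include='*.mdx' '<!-- ZAYAZ-TODO: [A-Z-]*' "$root" |
    sed 's/^<!-- ZAYAZ-TODO: //' |
    sort | uniq -c | sort -rn
}
```

Running `todo_summary content/` from the repo root prints one line per tag with its count, which could feed a dashboard or an issue-creation script.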

6. Search, Ask ZARA and ToDo — How to Use

6.1. Search

Use the search field to find relevant ZAYAZ documentation using semantic search.

You can:

- Search by concepts, not just exact words (e.g. `carbon passport lifecycle`)
- Filter by section context automatically (engine, module, appendix)
- Search for work-in-progress items using TODO markers:
  - `todo` → all documents with open TODOs
  - `todo:STRUCTURE` → only structure-related TODOs
  - `todo:STRUCTURE,ADD_FK` → documents matching all listed tags

Search results return the most relevant sections, not entire documents.

6.2. Ask ZARA

Ask ZARA is an AI assistant grounded only in the indexed ZAYAZ documentation.

You can:

- Ask technical or conceptual questions

  > “How does FIRM model transition risk?”

- Ask about incomplete areas

  > “What is missing in the ESG signal registry?”

- Ask about TODOs and gaps

  > “What STRUCTURE TODOs exist in the AI Intelligence Layer?”

ZARA will:

Cite relevant document sections when possible
Explicitly say when information is missing or incomplete
Never invent undocumented behavior

| User query | Result |
| --- | --- |
| `todo` | All docs with any TODO |
| `todo:any` | All docs with any TODO |
| `todo:STRUCTURE` | Only docs with `<!-- ZAYAZ-TODO: STRUCTURE -->` |
| `todo:STRUCTURE,ADD_FK` | Docs that contain both tags |
| `todo` + normal words | Intentionally not allowed (keeps syntax strict) |
| Normal question | Falls back to semantic search |

6.3. Best Practice

Use Search to explore and scan
Use Ask ZARA to synthesize, explain, or identify gaps
Use ZAYAZ-TODO markers in docs to surface future work automatically

6.4. Search Syntax (Operators)

| Operator | Example | Meaning |
| --- | --- | --- |
| todo | `todo` | Return any doc section that contains at least one marker |
| todo:TAG | `todo:STRUCTURE` | Only docs that have the TODO tag STRUCTURE |
| todo:TAG1,TAG2 | `todo:STRUCTURE,ADD_FK` | Docs that contain all listed TODO tags |
| CMID lookup | `CMID-ZARA-00001` | Finds pages/sections where the CMID was indexed |
| Slug scope | `filterSlugPrefix="/ai-intelligence-layer"` | API filter to restrict search to a subtree (UI can expose this later) |
| Type scope | `filterSourceType="mdx"` | API filter to limit to MDX vs JSON vs Excel docs |
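
The "all listed tags" semantics of `todo:TAG1,TAG2` can be approximated locally with grep. This is a sketch of the matching rule only, not of how the search backend is implemented. It assumes tags are passed as they are spelled in the markers (for example `ADD-FK`) and that each tag is followed by a space inside the comment. The function name `docs_with_all_tags` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: list .mdx files that contain ALL of the given marker tags.
# With no tags, it lists every file that has at least one marker.
docs_with_all_tags() {
  local root="$1"; shift
  local f tag ok
  grep -Rl --include='*.mdx' '<!-- ZAYAZ-TODO: ' "$root" | sort |
  while IFS= read -r f; do
    ok=1
    for tag in "$@"; do
      # A file is dropped as soon as one requested tag is missing
      grep -q -- "<!-- ZAYAZ-TODO: $tag " "$f" || { ok=0; break; }
    done
    [ "$ok" -eq 1 ] && printf '%s\n' "$f"
  done
  return 0
}
```

For example, `docs_with_all_tags content STRUCTURE ADD-FK` lists the files that carry both markers.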

Ensure there is a corresponding associated-files directory for all MDX files

For every MDX file under:

/workspaces/zayaz-docs/content/

we want to ensure there is a corresponding directory under:

/workspaces/zayaz-docs/code/associated-files/

with:

the same relative path
same filename without .mdx
containing a .gitkeep

Path mapping rule (canonical)

content/computation-hub-calcs/micro-engines/ewc-hazard-classification.mdx
↓
code/associated-files/computation-hub-calcs/micro-engines/ewc-hazard-classification/
  └── .gitkeep
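
The mapping rule itself reduces to two string operations, sketched below. The function name `associated_dir_for` is hypothetical; the `content/` and `code/associated-files/` roots are taken from the paths above.

```shell
#!/bin/bash
# Hypothetical helper: map an MDX path under content/ to its associated-files directory.
associated_dir_for() {
  local mdx="${1#./}"          # tolerate a leading ./
  local rel="${mdx#content/}"  # drop the content/ prefix
  printf 'code/associated-files/%s\n' "${rel%.mdx}"  # drop the .mdx extension
}

associated_dir_for content/computation-hub-calcs/micro-engines/ewc-hazard-classification.mdx
# prints code/associated-files/computation-hub-calcs/micro-engines/ewc-hazard-classification
```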

Note: If you forget to add the folder, use the following script to automate the creation of the folder(s).

Run this from:

/workspaces/zayaz-docs

Script

#!/bin/bash

CONTENT_ROOT="content"
ASSOCIATED_ROOT="code/associated-files"

# Top-level content directories that are not mirrored
EXCLUDE_DIRS=(
  "system"
  "system-info"
  "templates"
  "under-development"
)

# Build find exclusion arguments as an array (avoids eval and re-globbing of "*.mdx")
EXCLUDES=()
for d in "${EXCLUDE_DIRS[@]}"; do
  EXCLUDES+=(-path "./$CONTENT_ROOT/$d" -prune -o)
done

# Find all .mdx files respecting exclusions; NUL-delimited so unusual filenames survive
find "./$CONTENT_ROOT" "${EXCLUDES[@]}" -type f -name "*.mdx" -print0 |
while IFS= read -r -d '' mdx; do
  # Remove leading ./content/
  rel_path="${mdx#./$CONTENT_ROOT/}"

  # Remove .mdx extension
  base_path="${rel_path%.mdx}"

  # Target directory
  target_dir="./$ASSOCIATED_ROOT/$base_path"

  # Create directory only if it does not exist
  if [ ! -d "$target_dir" ]; then
    echo "Creating: $target_dir"
    mkdir -p "$target_dir"
    touch "$target_dir/.gitkeep"
  fi
done

What this script guarantees

Existing directories are untouched
Existing .gitkeep files are untouched
Only missing directories are created
Mirrors the content structure exactly
Safe for Git history
Safe for CI
Safe for repeated runs

Insert the Tagging Info in Frontmatters of Engines:

description: "..." # REQUIRED: one sentence, decision-grade, what it does

engine: # REQUIRED for all engines (non-MICE)
  eid: ...           # OPTIONAL (only if the engine is executable/callable)
  meid: ...          # REQUIRED: ENGINE category (see list below)
  category: ...      # REQUIRED: ENGINE category (see list below)

  domain: ...        # REQUIRED (same domain vocabulary as MICE)

  input_types:       # OPTIONAL but recommended
    - json           # json | csv | xlsx | xml | api_ref | registry

  supported_modes:   # REQUIRED: what the engine does
    - routing        # examples below

  api:               # OPTIONAL but recommended if callable
    route: /api/engine/...
    method: POST     # POST | GET | PUT | PATCH

  capabilities:      # OPTIONAL: helps architecture + diagrams
    provides:
      - ...
    consumes:
      - ...

  emits_signals:     # OPTIONAL: which USO/CSI families it emits
    uso:
      - ...
    csi:
      - ...

  source_map_ref: EngineAliasMap # OPTIONAL (same as MICE)

Engine category (fixed enum)

Use one of these so we don’t invent new categories per page:

REGISTRY (registries, canonical stores, lookups)
ROUTING (dispatch, orchestration, routing, resolution)
VALIDATION (validation, integrity, QA, consistency checks)
GOVERNANCE (RBAC, policy enforcement, approvals)
AUDIT (ledger, evidence, lineage, replay, tamper detection)
TRANSLATION (localization, translation, mapping)
EXPORT (XBRL/iXBRL packaging, converters)
AGENT_RUNTIME (agent execution/supervision runtime)
SIMULATION (Monte Carlo, Bayesian scenario engines)
COMPUTE (general compute engines that aren’t MICE)
INTEGRATION (connectors, third-party services)

Engine supported_modes (examples)

Pick what fits; multiple options can be used:

routing
orchestration
registry
validation
audit
authorization
identity
translation
export
simulation
scoring
integration

EID:

EID_SYS_* for SIS/platform engines (SSSR, RBAC, ALTD)
EID_MOD_<MOD>_* for module engines (e.g., EID_MOD_ZARA_TRPG)
EID_EXT_* for third-party/connector engines
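
A naming check for these prefixes could look like the sketch below. It assumes the part after the prefix consists of upper-case letters, digits and underscores, which this guide does not state. The function name `is_valid_eid` is hypothetical, and `EID_SYS_SSSR` in the usage note is an illustrative ID.

```shell
#!/bin/bash
# Hypothetical check: does an EID follow one of the three prefixes listed above?
# Assumption: suffixes use only A-Z, 0-9 and underscores.
is_valid_eid() {
  [[ "$1" =~ ^EID_(SYS|EXT)_[A-Z0-9_]+$ || "$1" =~ ^EID_MOD_[A-Z0-9]+_[A-Z0-9_]+$ ]]
}
```

With this rule, `is_valid_eid EID_MOD_ZARA_TRPG` and `is_valid_eid EID_SYS_SSSR` succeed, while `is_valid_eid EID_MOD_ZARA` fails because the module prefix is not followed by an engine code.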

