Transferring Text to MDX
0. IMPORTANT: Update the frontmatter of all engines!
Add the following to the frontmatter (example data):
---
id: AAE
title: AAE Engine
description: >-
Autonomous validation, consistency testing, and issuance of digital assurance
proofs.
slug: /micro-engines/aae-engine
hub: computation-hub-calcs
omr_module_ids:
- computation-hub
mice:
meid: MEID_ASRE_AAE
category: VALI
domain: assurance
input_types:
- json
- registry
- api_ref
supported_modes:
- validation
- signal
- scoring
api:
route: /api/mice/MEID_CALC02
method: POST
metric_types_supported:
- assurance.proof.hash_parity
- assurance.governance.approval_validity
- assurance.temporal.continuity
- assurance.trust.anomaly
- assurance.statement
source_map_ref: EngineAliasMap
source_file: /computation-hub-calcs/micro-engines/aae-engine.mdx
sidebar_label: Autonomous Assurance Engine
sidebar_position: 28
doc_type: spec
status: review
legacy_manual_ref: 107.27.
version: 0.1.0
owners:
- cto@viroway.com
last_updated: 2025-12-06T00:00:00.000Z
audience:
- internal-cto
- internal-engineering-core
- internal-management
- external-dev-core
allowed_users:
- pedersen@viroway.com
tags:
- mice
- assurance
- validation
- trust
- governance
- meta-signal
- tier-0
security_level: critical
classification:
- Assurance-Engine
jira:
epic: ZYZ-474
---
Empty (for easier copying):
---
id:
title:
description: >-
...
slug:
hub: ...
omr_module_ids:
- ...
mice: # MUST BE UPDATED FOR EACH MICE!!!!!!!!!!!!
meid: ...
category: CALC
# CALC | VALI | TAGG | AGGR | TRANS | SCEN | SCORE | LINK | ALERT | META
# active | experimental | deprecated | retired
domain: emissions
# emissions | energy | climate_risk | biodiversity | water | pollution
# materials | supply_chain | products | finance | taxonomy
# social | governance | compliance | assurance | meta
input_types:
- json
# json | csv | xlsx | xml | api_ref | registry
supported_modes:
- calculation
# calculation | validation | classification | aggregation
# projection | scoring | signal
api:
route: /api/mice/MEID_CALC02
# E.g. "/api/mice/MEID_CALC02_v1"
method: POST
# POST | GET | PUT | PATCH
metric_types_supported:
- ghg.scope3.cat6.business_travel
# E.g. "ghg.scope3.cat6.business_travel"
source_map_ref: # optional: ties to the “Engine Alias Map”
source_file: /.../....mdx
sidebar_label: ...
sidebar_position:
doc_type:
status:
legacy_manual_ref: N/A
version: 0.1.0
owners:
- cto@viroway.com
last_updated: 2026-xx-xxT00:00:00.000Z
audience:
- internal-cto
-
-
-
allowed_users:
- pedersen@viroway.com
tags:
-
-
-
-
-
-
-
security_level:
classification:
-
jira:
epic:
---
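Before committing, it can help to confirm that a file's frontmatter actually contains the top-level keys from the template. The following is a minimal sketch, assuming the frontmatter is the block between the first two `---` lines. The function name `missing_frontmatter_keys` and the choice of keys to check are illustrative assumptions, not an existing ZAYAZ tool.

```shell
# Hypothetical helper: print every expected top-level key that is missing
# from the frontmatter of an MDX file.
# Usage: missing_frontmatter_keys FILE KEY...
missing_frontmatter_keys() {
  local file="$1" frontmatter key
  shift
  # Keep only the lines between the first and second '---' delimiters.
  frontmatter="$(awk '/^---[[:space:]]*$/ { n++; next } n == 1 { print } n >= 2 { exit }' "$file")"
  for key in "$@"; do
    # A top-level key starts at column 0 and is followed by a colon.
    printf '%s\n' "$frontmatter" | grep -q "^${key}:" || printf '%s\n' "$key"
  done
}
```

For example, `missing_frontmatter_keys path/to/engine.mdx id title slug hub mice` prints nothing when all five keys are present. It only checks that keys exist, not that their values are valid.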
Examples of usage:
Engine Metadata (review)
- ID: AGGR
- Code: MICE_AGGR
- Type: micro_engine
- Name: Aggregation Engines
- Version: 0.1.0
Operational Metadata
{
"id": "AGGR",
"engine_kind": "micro_engine",
"title": "Aggregation Engines",
"name": "Aggregation Engines",
"code": "MICE_AGGR",
"hub": "computation-hub-calcs",
"owner": "computation-hub-calcs",
"omr_module_ids": [
"TBD"
],
"status": "review",
"version": "0.1.0",
"tags": [
"mice",
"aggr"
],
"audience": [
"internal-cto",
"internal-engineering-core",
"internal-management",
"external-dev-core"
],
"slug": "/mice-types/aggregation-engines",
"source_file": "/computation-hub-calcs/micro-engines/mice-types/aggregation-engines.mdx",
"lifecycle": {
"status": "review",
"semver": "0.1.0"
}
}
Graph View
AGGR depends on:
No related engines declared.
See also the Engines Registry.
1. Do hybrid batch work
👉 Migrate content in batches of ~5–8 chapters, and immediately do a “linking & grounding pass” for that batch before moving on.
That gives you:
- Speed (batching)
- Accuracy (context still fresh)
- Structural integrity (foreign keys, CMIDs, signals don’t drift)
Avoid both extremes:
- ❌ Linking everything immediately (too slow)
- ❌ Migrating everything first (you’ll forget intent)
Why This Matters in The System (not generic docs)
The manual is not prose-heavy documentation. It’s a knowledge graph with:
- CMIDs
- SSSR signals
- Tables
- Engines
- Policies
- Evidence chains
- Search + Ask ZARA as consumers
That means linking is not cosmetic—it defines semantics.
If linking is deferred too long:
- You’ll lose why a table existed
- CMIDs will drift from intent
- You’ll re-read chapters multiple times
- You’ll “paper over” missing structure instead of fixing it cleanly
2. The Optimal Workflow (Proven on Complex Manuals)
Phase 1 — Content Migration (Fast, Clean)
For each chapter:
- Move text over
- Light cleanup only (headings, lists, clarity)
- DO NOT over-link
- Add placeholders, not links
Example placeholders:
<!-- ZAYAZ-TODO: STRUCTURE -->
<!-- ZAYAZ-TODO: LINK-SSSR -->
<!-- ZAYAZ-TODO: POLICY -->
This keeps momentum high.
Phase 2 — Immediate Grounding Pass (Critical)
After 5–8 chapters, stop and do a grounding pass.
For each chapter:
- Identify:
- CMIDs referenced
- Tables implied
- Signals used
- Engines involved
- Add:
- `<TableSignals />`
- `<AuditEvidenceLink />`
- Cross-links
- Foreign-key references
Because:
- You still remember why things were written
- You can normalize patterns across chapters
- You catch inconsistencies early
Phase 3 — Structural Normalization (Light, Fast)
After grounding the batch:
- Align naming
- Ensure consistent CMID usage
- Check signal/table reuse
- Update registries if needed
This is where the system starts to self-reinforce.
3. A Concrete Rule of Thumb
Use this heuristic:
If you can still explain the chapter to someone else without re-reading it, you can still link it correctly.
Once you can’t, linking quality drops sharply.
That usually happens after ~8 chapters.
4. Practical Guardrails (Very Important)
4.1. Never leave a chapter “structurally ambiguous”
Even if links aren’t added yet, leave markers:
<!-- ZAYAZ-TODO: STRUCTURE -->
You can grep for these later. (See section 5.)
4.2. Treat tables as first-class citizens
If a chapter uses data:
- Either link the table now
- Or explicitly mark:
<!-- ZAYAZ-TODO: DATA-SOURCE -->
Silence is dangerous.
4.3. CMIDs: batch-resolve, don’t guess
When migrating:
- Write the concept in text
- Resolve to CMID during grounding pass
This avoids premature CMID fragmentation.
4.4. Why This Is the Right Pace
We’ve already built:
- Registries
- Search
- Evidence links
- Live graphs
- Asset linking
- Audience gating
That means structure work compounds.
Doing it in disciplined batches lets:
- Search improve incrementally
- Ask ZARA get smarter as we go
- Errors surface early, not at the end
5. Standardized Markers
To manage deferred structure work, we use specific markers in the text. Treat these comments as first-class TODO markers so they can be reliably surfaced later.
HTML comments like:
| Marker | Meaning |
|---|---|
| `<!-- ZAYAZ-TODO: STRUCTURE -->` | Needs structural review |
| `<!-- ZAYAZ-TODO: ADD-CMID -->` | Metric not yet canonical |
| `<!-- ZAYAZ-TODO: ADD-FK -->` | Foreign key / table relation missing |
| `<!-- ZAYAZ-TODO: LINK-SSSR -->` | Signal registry link missing |
| `<!-- ZAYAZ-TODO: DATA-SOURCE -->` | Underlying data not identified |
| `<!-- ZAYAZ-TODO: EVIDENCE -->` | Audit evidence path missing |
| `<!-- ZAYAZ-TODO: POLICY -->` | Policy link missing |
are:
- ✅ Ignored by Docusaurus rendering
- ✅ Preserved in MDX
- ✅ Searchable via scripts / grep
- ✅ Non-destructive (won’t break builds)
- ✅ Version-controllable
Optionally add a message:
`<!-- ZAYAZ-TODO: ADD-FK | Link dim_countries.iso_cc_id -> fact_emissions.country_id -->`
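If the messages are going to feed a report or issue tracker, the tag and the message need to be separated. The sketch below assumes the `TAG | message` layout shown above and uses only shell parameter expansion. `parse_todo_marker` is a hypothetical name.

```shell
# Hypothetical helper: split one marker line into "TAG<TAB>message".
# Returns 1 when the line holds no ZAYAZ-TODO marker.
parse_todo_marker() {
  local open='<!-- ZAYAZ-TODO: ' sep=' | ' body
  case "$1" in
    *"$open"*) ;;
    *) return 1 ;;
  esac
  body="${1#*"$open"}"     # drop everything up to and including the opener
  body="${body%% -->*}"    # drop the closing " -->" and anything after it
  case "$body" in
    *"$sep"*) printf '%s\t%s\n' "${body%%"$sep"*}" "${body#*"$sep"}" ;;
    *)        printf '%s\t\n' "$body" ;;
  esac
}
```

A marker without a message yields the tag followed by an empty second field, so downstream tools can always expect two tab-separated columns.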
5.1. How to Get a List Later (Multiple Options)
Option 1: Simple grep (fastest)
From repo root:
grep -R "<!-- ZAYAZ-TODO: STRUCTURE" content/
Or all structural markers at once:
grep -R "<!-- ZAYAZ-TODO: " content/ | grep -E "STRUCTURE|ADD-CMID|ADD-FK|LINK-SSSR|DATA-SOURCE|EVIDENCE"
Example output:
content/computation-hub-calcs/rif/risk-calibration.mdx:<!-- ZAYAZ-TODO: ADD-CMID -->
content/policies/env/climate-change.mdx:<!-- ZAYAZ-TODO: ADD-FK -->
Clean extract:
grep -R "<!-- ZAYAZ-TODO:" content/
This is future-proof for:
- Building a report
- Turning them into GitHub issues
- Showing a “documentation completeness dashboard”
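As a starting point for such a report, the grep output can be reduced to a count per tag. This is a minimal sketch, assuming markers follow the `<!-- ZAYAZ-TODO: TAG` form listed in section 5. `todo_report` is a hypothetical name, and the default `content` directory is taken from the commands above.

```shell
# Hypothetical helper: count open markers per tag under a directory.
# Prints lines of the form "COUNT TAG", most frequent first.
todo_report() {
  local root="${1:-content}"
  grep -rhoE '<!-- ZAYAZ-TODO: [A-Z-]+' "$root" |
    sed 's/^<!-- ZAYAZ-TODO: //' |
    sort | uniq -c | sort -rn |
    awk '{ print $1, $2 }'
}
```

Run `todo_report` from the repo root, or pass a subdirectory to scope the count to one hub.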
6. Search, Ask ZARA and ToDo — How to Use
6.1. Search
Use the search field to find relevant ZAYAZ documentation using semantic search.
You can:
- Search by concepts, not just exact words
e.g. carbon passport lifecycle
- Filter by section context automatically (engine, module, appendix)
- Search for work-in-progress items using TODO markers:
- todo → all documents with open TODOs
- todo:STRUCTURE → only structure-related TODOs
- todo:STRUCTURE,ADD_FK → documents matching all listed tags
Search results return the most relevant sections, not entire documents.
6.2. Ask ZARA
Ask ZARA is an AI assistant grounded only in the indexed ZAYAZ documentation.
You can:
- Ask technical or conceptual questions
“How does FIRM model transition risk?”
- Ask about incomplete areas
“What is missing in the ESG signal registry?”
- Ask about TODOs and gaps
“What STRUCTURE TODOs exist in the AI Intelligence Layer?”
ZARA will:
- Cite relevant document sections when possible
- Explicitly say when information is missing or incomplete
- Never invent undocumented behavior
| User query | Result |
|---|---|
| todo | All docs with any TODO |
| todo:any | All docs with any TODO |
| todo:STRUCTURE | Only docs with `<!-- ZAYAZ-TODO: STRUCTURE -->` |
| todo:STRUCTURE,ADD_FK | Docs that contain both tags |
| todo + normal words | (Intentionally not allowed — keeps syntax strict) |
| Normal question | Falls back to semantic search |
6.3. Best Practice
- Use Search to explore and scan
- Use Ask ZARA to synthesize, explain, or identify gaps
- Use `<!-- ZAYAZ-TODO: TAG -->` markers in docs to surface future work automatically
6.4. Search Syntax (Operators)
| Operator | Example | Meaning |
|---|---|---|
| todo | todo | Return any doc section that contains at least one marker |
| todo:TAG | todo:STRUCTURE | Only docs that have the TODO tag STRUCTURE |
| todo:TAG1,TAG2 | todo:STRUCTURE,ADD_FK | Docs that contain all listed TODO tags |
| CMID lookup | CMID-ZARA-00001 | Finds pages/sections where the CMID was indexed |
| Slug scope | filterSlugPrefix="/ai-intelligence-layer" | API filter to restrict search to a subtree (UI can expose this later) |
| Type scope | filterSourceType="mdx" | API filter to limit to MDX vs JSON vs Excel docs |
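When the search index is not available (for example in a local checkout), the "all listed tags" behaviour of `todo:TAG1,TAG2` can be approximated with grep. This is only a rough local stand-in, not how the search service works. `files_with_all_todos` is a hypothetical name, and tags are assumed to be passed exactly as they are written in the markers.

```shell
# Hypothetical local stand-in for "todo:TAG1,TAG2": list files under a
# directory that contain a marker for every tag in a comma-separated list.
files_with_all_todos() {
  local root="$1" tags="$2"
  grep -rl -- '<!-- ZAYAZ-TODO: ' "$root" | sort | while IFS= read -r file; do
    ok=1
    for tag in $(printf '%s' "$tags" | tr ',' ' '); do
      # The tag must be followed by a space or a pipe, so STRUCTURE
      # does not match a longer tag that merely starts with STRUCTURE.
      grep -q -- "<!-- ZAYAZ-TODO: $tag[ |]" "$file" || ok=0
    done
    if [ "$ok" -eq 1 ]; then
      printf '%s\n' "$file"
    fi
  done
}
```

For example, `files_with_all_todos content STRUCTURE,EVIDENCE` lists only files that carry both markers.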
Ensure there is a corresponding associated-files directory for all MDX files
For every MDX file under:
/workspaces/zayaz-docs/content/
we want to ensure there is a corresponding directory under:
/workspaces/zayaz-docs/code/associated-files/
with:
- the same relative path
- the same filename without .mdx
- containing a .gitkeep
Path mapping rule (canonical)
content/computation-hub-calcs/micro-engines/ewc-hazard-classification.mdx
↓
code/associated-files/computation-hub-calcs/micro-engines/ewc-hazard-classification/
└── .gitkeep
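The mapping rule amounts to two string operations: strip the `content/` root and strip the `.mdx` extension. A minimal sketch follows. `associated_dir_for` is a hypothetical name, and the function assumes it is given a path relative to the repo root.

```shell
# Hypothetical helper: map a content MDX path to its associated-files directory.
associated_dir_for() {
  local rel="${1#content/}"                            # strip the leading content/ root
  printf 'code/associated-files/%s/\n' "${rel%.mdx}"   # strip the .mdx extension
}

associated_dir_for content/computation-hub-calcs/micro-engines/ewc-hazard-classification.mdx
# prints code/associated-files/computation-hub-calcs/micro-engines/ewc-hazard-classification/
```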
If you forget to add the folder, use the following script to automate the creation of the folder(s).
Run this from:
/workspaces/zayaz-docs
Script
#!/bin/bash
CONTENT_ROOT="content"
ASSOCIATED_ROOT="code/associated-files"
EXCLUDE_DIRS=(
  "system"
  "system-info"
  "templates"
  "under-development"
)
# Build find exclusion arguments as an array, so no eval is needed
EXCLUDES=()
for d in "${EXCLUDE_DIRS[@]}"; do
  EXCLUDES+=( -path "./$CONTENT_ROOT/$d" -prune -o )
done
# Find all .mdx files, respecting exclusions
find "./$CONTENT_ROOT" "${EXCLUDES[@]}" -type f -name "*.mdx" -print | while IFS= read -r mdx; do
  # Remove leading ./content/
  rel_path="${mdx#./$CONTENT_ROOT/}"
  # Remove .mdx extension
  base_path="${rel_path%.mdx}"
  # Target directory
  target_dir="./$ASSOCIATED_ROOT/$base_path"
  # Create directory only if it does not exist
  if [ ! -d "$target_dir" ]; then
    echo "Creating: $target_dir"
    mkdir -p "$target_dir"
    touch "$target_dir/.gitkeep"
  fi
done
What this script guarantees
- Existing directories are untouched
- Existing .gitkeep files are untouched
- Only missing directories are created
- Mirrors the content structure exactly
- Safe for Git history
- Safe for CI
- Safe for repeated runs
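For CI, a read-only variant that reports missing directories without creating them may be more useful than the script above. The following is a sketch under that assumption. `check_associated_dirs` is a hypothetical name, and it hard-codes the same four excluded directories as the script.

```shell
# Hypothetical CI check: list MDX files whose associated-files directory is
# missing and return 1 if there are any. Nothing is created or modified.
check_associated_dirs() {
  local content_root="${1:-content}" assoc_root="${2:-code/associated-files}"
  local missing
  missing="$(
    find "$content_root" \
      \( -path "$content_root/system" -o -path "$content_root/system-info" \
         -o -path "$content_root/templates" -o -path "$content_root/under-development" \) -prune \
      -o -type f -name '*.mdx' -print |
    while IFS= read -r mdx; do
      rel="${mdx#"$content_root"/}"
      dir="$assoc_root/${rel%.mdx}"
      if [ ! -d "$dir" ]; then
        printf 'Missing: %s\n' "$dir"
      fi
    done
  )"
  if [ -n "$missing" ]; then
    printf '%s\n' "$missing"
    return 1
  fi
}
```

Calling `check_associated_dirs` from `/workspaces/zayaz-docs` fails the step when a folder was forgotten, and the creation script can then be run to fix it.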
Insert the Tagging Info in the Frontmatter of Engines:
description: "..." # REQUIRED: one sentence, decision-grade, what it does
engine: # REQUIRED for all engines (non-MICE)
eid: ... # OPTIONAL (only if the engine is executable/callable)
meid: ... # REQUIRED: ENGINE category (see list below)
category: ... # REQUIRED: ENGINE category (see list below)
domain: ... # REQUIRED (same domain vocabulary as MICE)
input_types: # OPTIONAL but recommended
- json # json | csv | xlsx | xml | api_ref | registry
supported_modes: # REQUIRED: what the engine does
- routing # examples below
api: # OPTIONAL but recommended if callable
route: /api/engine/...
method: POST # POST | GET | PUT | PATCH
capabilities: # OPTIONAL: helps architecture + diagrams
provides:
- ...
consumes:
- ...
emits_signals: # OPTIONAL: which USO/CSI families it emits
uso:
- ...
csi:
- ...
source_map_ref: EngineAliasMap # OPTIONAL (same as MICE)
Engine category (fixed enum)
Use one of these so we don’t invent new categories per page:
- `REGISTRY` (registries, canonical stores, lookups)
- `ROUTING` (dispatch, orchestration, routing, resolution)
- `VALIDATION` (validation, integrity, QA, consistency checks)
- `GOVERNANCE` (RBAC, policy enforcement, approvals)
- `AUDIT` (ledger, evidence, lineage, replay, tamper detection)
- `TRANSLATION` (localization, translation, mapping)
- `EXPORT` (XBRL/iXBRL packaging, converters)
- `AGENT_RUNTIME` (agent execution/supervision runtime)
- `SIMULATION` (Monte Carlo, Bayesian scenario engines)
- `COMPUTE` (general compute engines that aren’t MICE)
- `INTEGRATION` (connectors, third-party services)
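Because the list is a fixed enum, a pre-commit or CI step could reject any other value. A minimal sketch follows. `is_engine_category` is a hypothetical name, and the accepted values are copied from the list above.

```shell
# Hypothetical check: succeed only for a category from the fixed enum.
is_engine_category() {
  case "$1" in
    REGISTRY|ROUTING|VALIDATION|GOVERNANCE|AUDIT|TRANSLATION|EXPORT|AGENT_RUNTIME|SIMULATION|COMPUTE|INTEGRATION)
      return 0 ;;
    *)
      return 1 ;;
  esac
}
```

Usage: `is_engine_category "$category" || echo "Unknown engine category: $category"`. The match is case-sensitive, so `routing` (a supported mode) is rejected as a category.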
Engine supported_modes (examples)
Pick what fits; multiple options can be used:
- routing
- orchestration
- registry
- validation
- audit
- authorization
- identity
- translation
- export
- simulation
- scoring
- integration
EID:
- `EID_SYS_*` for SIS/platform engines (SSSR, RBAC, ALTD)
- `EID_MOD_<MOD>_*` for module engines (e.g., `EID_MOD_ZARA_TRPG`)
- `EID_EXT_*` for third-party/connector engines
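The three prefixes can be checked mechanically with a shell `case` statement. This is a sketch only. `eid_family` and the family labels it prints are hypothetical, and the only concrete EID taken from this guide is `EID_MOD_ZARA_TRPG`.

```shell
# Hypothetical helper: name the engine family an EID belongs to.
# Returns 1 for anything that does not follow one of the three prefixes.
eid_family() {
  case "$1" in
    EID_SYS_?*)    echo "system" ;;
    EID_MOD_?*_?*) echo "module" ;;    # needs both a module part and a name part
    EID_EXT_?*)    echo "external" ;;
    *)             echo "unknown"; return 1 ;;
  esac
}

eid_family EID_MOD_ZARA_TRPG
# prints module
```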