ZAR-CORE

ZAR Ruleset Factory

1. Introduction

The ZAR Ruleset Factory is the deterministic production line for creating and publishing rulesets and ruleset bundles as versioned YAML artifacts.

It exists to:

Generate ruleset / bundle YAML artifacts (reviewable, diffable)
Validate each artifact against the ZAR schemas (ruleset.v1, ruleset_bundle.v1, etc.)
Canonicalize + hash artifacts to produce immutable ZAR refs
Produce a reviewable ChangeSet (PLAN) before anything is applied
Apply changes only after explicit approval (APPLY), updating registries and pointers safely

This same “production line” pattern generalizes to other ZAR registries later (signals, tables, FOGE datasets, requirements mappings), but rulesets/bundles are the first-class product.

See the ZAR Rules Registry for the artifact and registry model…

2. YAML is the executable truth — and must be named deterministically

Reality: the YAML file is the ruleset (executable artifact source), and ZAR turns it into an immutable artifact via canonicalization + hash.

We enforce strict naming conventions on YAML, because:

it becomes the “review surface”
it is the stable “source representation”
it can be auto-linked in docs
it can be auto-indexed into registries

Deterministic filename pattern (simple, human-friendly)

For rulesets:

acct-crawler-tag-detection-1_0_0.yaml
acct-crawler-classification-1_0_0.yaml
acct-crawler-reconciliation-policy-1_0_0.yaml

For bundles:

acct-crawler-default-bundle-1_0_0.yaml
transition-roi-default-bundle-1_0_0.yaml

ZAR ref stays hash-based:

ZAR:ruleset:acct_crawler_tag_detection@sha256:<hash>
ZAR:ruleset_bundle:acct_crawler_default@sha256:<hash>

Do not put version in the ZAR ref — the hash is the version. The human semantic version stays in crid and/or lifecycle.changelog + filename.

Semver is for humans and release policy (filenames, changelog, CRID); hash is for identity (ZAR refs, immutability, caching).
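The naming rules above can be sketched in a few lines of Python. This is only an illustration: the function names, the validation patterns, and the `-bundle` filename suffix (inferred from the bundle examples) are assumptions, not part of the published contract.

```python
import re

# Hypothetical validation patterns; the real naming policy may be stricter.
NAME_RE = re.compile(r"[a-z0-9]+(?:_[a-z0-9]+)*")
SEMVER_RE = re.compile(r"\d+\.\d+\.\d+")
SHA256_RE = re.compile(r"[0-9a-f]{64}")


def artifact_filename(artifact_name: str, semver: str, kind: str = "ruleset") -> str:
    """Build the human-facing filename; the semantic version lives here, not in the ref."""
    if not NAME_RE.fullmatch(artifact_name):
        raise ValueError(f"bad artifact name: {artifact_name!r}")
    if not SEMVER_RE.fullmatch(semver):
        raise ValueError(f"bad semantic version: {semver!r}")
    stem = artifact_name.replace("_", "-")
    if kind == "ruleset_bundle":
        stem += "-bundle"  # assumption inferred from the bundle filename examples
    return f"{stem}-{semver.replace('.', '_')}.yaml"


def zar_ref(kind: str, artifact_name: str, content_hash: str) -> str:
    """Build the hash-based identity; no version component is allowed."""
    if not SHA256_RE.fullmatch(content_hash):
        raise ValueError("content_hash must be 64 lowercase hex characters")
    return f"ZAR:{kind}:{artifact_name}@sha256:{content_hash}"
```

Keeping the two functions separate mirrors the split between the human version (filename) and the identity (hash).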

3. Design Principles

The ZAR Ruleset Factory is governed by the following design principles:

3.1. Single Source of Truth

The YAML artifact is the executable contract. Registries, hashes, references, and documentation links are derived — never manually authored.

3.2. Determinism Over Magic

Given the same intent and inputs, the Factory must always produce the same canonical artifact, hash, and ChangeSet. No hidden state. No implicit mutations.

3.3. Review Before Activation

Every change must pass through a PLAN phase that generates a complete, replayable ChangeSet. Nothing is applied directly.

3.4. Separation of Intent and Execution

Intent expresses what is desired. The Factory determines how it is rendered, validated, canonicalized, and registered.

3.5. Governance by Construction

Schema validation, canonicalization, idempotent SQL, and activation gating are built into the pipeline — not added as optional checks.

4. Core Guarantees

The ZAR Ruleset Factory provides the following non-negotiable guarantees:

Immutability

Every registered artifact is identified by a canonical SHA-256 hash. The ZAR reference (ZAR:ruleset:<name>@sha256:<hash>) is immutable. Changing content produces a new artifact — never a mutation.

Auditability

Every APPLY operation:

references a specific changeset.yaml
records approval identity and timestamp
produces a deterministic SQL plan
can be replayed or reconstructed from artifacts and logs

No registry state exists without traceable origin.

Reproducibility

Running PLAN with the same inputs must produce:

identical canonical artifacts
identical hashes
identical SQL preview
identical ChangeSet manifest

Neither the environment nor the execution order may alter results.

Idempotency

All generated SQL follows an UPSERT/merge pattern. Re-applying the same ChangeSet does not create duplicate rows or inconsistent state.

Transactional Safety

APPLY executes within database transactions. Activation pointer updates occur last. Any failure results in full rollback — no partial activation is possible.

5. Threat Model & Failure Modes

The ZAR Ruleset Factory is designed under the assumption that configuration systems fail in predictable ways. The following threat model defines what must never happen and how the architecture mitigates those risks.

5.1. Drift Between Artifact and Registry

Threat: Registry rows diverge from the YAML artifact (manual DB edits, partial updates, inconsistent writes).

Mitigation:

Registry rows are deterministically derived from canonical artifact content.
APPLY operations are transactional.
No manual registry writes are permitted outside the Factory pipeline.
content_hash is always stored and traceable to the source YAML.

Result: Registry state is always reconstructible from artifact state.

5.2. Partial Activation

Threat: A ruleset bundle is partially applied (registry updated but pointers not updated, or vice versa).

Mitigation:

APPLY executes inside a database transaction.
Activation pointers update last.
Any failure triggers full rollback.

Result: No partially active rulesets can exist.

5.3. Non-Deterministic Output

Threat: Running PLAN twice with the same intent produces different artifacts, hashes, or SQL plans.

Mitigation:

Canonicalization specification enforces stable field ordering and normalization.
Hashes are computed only on canonical form.
No environment-dependent logic in artifact generation.
No hidden timestamps injected into artifact bodies.

Result: PLAN output is reproducible and verifiable.

5.4. Unauthorized Production Changes

Threat: Rulesets are activated in production without review or proper approval.

Mitigation:

PLAN and APPLY are strictly separated.
APPLY requires explicit approval identity and timestamp.
Environment gating (e.g., prod requires approved lifecycle status).
Optional PR-based workflow for production activation.

Result: No unreviewed production mutation is possible by design.

5.5. Versioning Ambiguity

Threat: Two artifacts share the same human version but differ in content.

Mitigation:

ZAR references are hash-based, not version-based.
Semantic version is informational (CRID, filename).
Content hash is authoritative identity.

Result: Artifact identity is cryptographically unambiguous.

5.6. Silent Breaking Changes

Threat: A ruleset update changes enforcement mode, strictness, or compatibility bounds without visibility.

Mitigation:

PLAN produces diff report including:
enforcement_mode changes
strict_mode changes
compatibility bound changes
execution order changes
Preflight agent flags risk-level changes.
Approval checklist highlights high-impact modifications.

Result: Breaking changes cannot be silently introduced.
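As an illustration of the diff report, the sketch below compares two artifact versions on the high-impact fields listed above. The flat dictionary layout and the function name are assumptions; the real artifacts may nest these fields differently.

```python
# Fields that the diff report must always surface (from the mitigation list).
HIGH_IMPACT_FIELDS = ("enforcement_mode", "strict_mode", "compatibility", "execution_order")


def high_impact_changes(old: dict, new: dict) -> dict:
    """Return {field: (old_value, new_value)} for every high-impact field that changed."""
    return {
        field: (old.get(field), new.get(field))
        for field in HIGH_IMPACT_FIELDS
        if old.get(field) != new.get(field)
    }
```

An empty result means no high-impact field changed; any entry should be highlighted in the approval checklist.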

5.7. Schema Evolution Risk

Threat: Schema updates invalidate previously valid artifacts or introduce inconsistent interpretation.

Mitigation:

Schema versions are explicit (ruleset.v1, ruleset_bundle.v1, etc.).
Artifacts reference schema version explicitly.
Backward compatibility policy enforced by governance gate.

Result: Schema evolution is controlled and explicit.

6. Formal Invariants

We treat these as “mathematical-style” guarantees the Factory must uphold. If any invariant is violated, PLAN must fail (or APPLY must refuse).

Let:

A be an artifact YAML (ruleset or bundle)
$C(A)$ be the canonical JSON form produced by the canonicalization spec
$H(x)$ be $sha256(x)$ over bytes of $x$ (stable encoding)
$Ref(A)$ be the ZAR reference derived from $H(C(A))$

I1 — Canonical Hash Determinism

For any artifact $A$, canonicalization is deterministic:

$C(A)$ is uniquely determined by $A$ and the canonicalization spec version.
Therefore: $H(C(A))$ is stable across machines and runs.

Formally:

If $A_1 = A_2$ (same semantic artifact fields), then $C(A_1) = C(A_2)$ and $H(C(A_1)) = H(C(A_2))$.

I2 — Content Addressability

The ZAR reference is a pure function of canonical content:

$Ref(A)$ = "ZAR:<kind>:<artifact_name>@sha256:" + $H(C(A))$

No timestamps, filenames, or environment values may affect $Ref(A)$.
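Invariants I1 and I2 can be illustrated with a minimal sketch. The actual canonicalization rules live in the canonicalization spec, which is not reproduced here; the sketch assumes the artifact is already parsed into a dictionary and that the canonical form is compact, key-sorted UTF-8 JSON. All function names are hypothetical.

```python
import hashlib
import json


def canonicalize(artifact: dict) -> bytes:
    """C(A): assumed canonical form (sorted keys, compact separators, UTF-8)."""
    text = json.dumps(artifact, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def content_hash(artifact: dict) -> str:
    """H(C(A)): SHA-256 over the canonical bytes."""
    return hashlib.sha256(canonicalize(artifact)).hexdigest()


def ref(kind: str, artifact_name: str, artifact: dict) -> str:
    """Ref(A): a pure function of kind, name, and canonical content."""
    return f"ZAR:{kind}:{artifact_name}@sha256:{content_hash(artifact)}"
```

Because key order is normalized before hashing, two files that differ only in field order resolve to the same ref, while any change in content yields a new one.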

I3 — Referential Integrity of PLAN

A PLAN output must be replayable:

Given plan.json (and access to the referenced input intent + canonicalization spec), APPLY must reproduce the same:
file bodies
hashes
ZAR refs
registry writes (modulo “already exists” idempotency)

I4 — No “Write Without Plan”

APPLY must not introduce new actions that were not present in PLAN:

$Actions(APPLY) ⊆ Actions(PLAN)$

If APPLY would write a file or registry row not listed in the plan, APPLY must fail.
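One way to picture the subset check in I4 is shown below. It assumes each action can be reduced to a hashable `(kind, target)` pair; that representation and the function names are illustrative only.

```python
def unplanned_actions(planned, attempted):
    """Return the actions APPLY wants to perform that PLAN never listed."""
    return sorted(set(attempted) - set(planned))


def assert_apply_within_plan(planned, attempted):
    """Refuse to proceed unless Actions(APPLY) is a subset of Actions(PLAN)."""
    extra = unplanned_actions(planned, attempted)
    if extra:
        raise RuntimeError(f"APPLY refused, unplanned actions: {extra}")
```

APPLY may perform fewer actions than planned (for example when a row already exists), but never more.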

I5 — Idempotent Writes

If APPLY is executed twice with the same plan_ref, the second execution leaves the system state exactly as it was after the first successful apply.

Formally:

$Apply(Apply(S, plan), plan) = Apply(S, plan)$
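The idempotency property can be demonstrated with an UPSERT against an in-memory SQLite database. This is a sketch, not the Factory's SQL: the production registry is Postgres, and the table layout and column names here are assumptions chosen for brevity.

```python
import sqlite3

# Hypothetical registry columns; the UPSERT pattern is what matters.
UPSERT_SQL = """
INSERT INTO ruleset_registry (zar_ref, artifact_name, enforcement_mode)
VALUES (?, ?, ?)
ON CONFLICT(zar_ref) DO UPDATE SET
    artifact_name = excluded.artifact_name,
    enforcement_mode = excluded.enforcement_mode
"""


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ruleset_registry ("
        "zar_ref TEXT PRIMARY KEY, artifact_name TEXT NOT NULL, enforcement_mode TEXT NOT NULL)"
    )
    return conn


def apply_plan(conn: sqlite3.Connection, rows) -> None:
    """Apply all registry writes of a plan in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(UPSERT_SQL, rows)


def snapshot(conn: sqlite3.Connection):
    return conn.execute(
        "SELECT zar_ref, artifact_name, enforcement_mode FROM ruleset_registry ORDER BY zar_ref"
    ).fetchall()
```

Calling `apply_plan` twice with the same rows leaves `snapshot` unchanged after the first call, which is the behavior I5 requires.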

I6 — Pointer Safety

Activation pointers (global/tenant/entity) are updated only after:

all artifacts validate
canonicalization/hashes match plan
all registry upserts succeeded

If any prior step fails, pointers must remain unchanged.
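Pointer safety can be sketched with a single transaction in which the pointer update is the final statement. The example uses in-memory SQLite rather than Postgres, and the table names, columns, and the `apply` helper are hypothetical.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ruleset_bundle_registry (zar_ref TEXT PRIMARY KEY NOT NULL)")
    conn.execute("CREATE TABLE bundle_pointer (scope TEXT PRIMARY KEY, zar_ref TEXT NOT NULL)")
    conn.execute("INSERT INTO bundle_pointer VALUES ('global', 'old-ref')")
    conn.commit()
    return conn


def current_pointer(conn: sqlite3.Connection) -> str:
    return conn.execute("SELECT zar_ref FROM bundle_pointer WHERE scope = 'global'").fetchone()[0]


def apply(conn: sqlite3.Connection, registry_refs, new_pointer_ref) -> bool:
    """Upsert registry rows first and move the pointer last, all in one transaction."""
    try:
        with conn:  # any exception rolls back every statement in this block
            for ref in registry_refs:
                conn.execute(
                    "INSERT INTO ruleset_bundle_registry (zar_ref) VALUES (?) "
                    "ON CONFLICT(zar_ref) DO NOTHING",
                    (ref,),
                )
            conn.execute(
                "UPDATE bundle_pointer SET zar_ref = ? WHERE scope = 'global'",
                (new_pointer_ref,),
            )
        return True
    except sqlite3.Error:
        return False
```

If any registry write fails (here, a `None` ref violates the NOT NULL constraint), the pointer keeps its previous value and no registry row from the failed attempt survives.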

7. Lifecycle State Machine (Rulesets & Bundles)

We model lifecycle as a finite state machine. This applies to both ruleset and ruleset_bundle.

States

draft: editable, not production-eligible
approved: reviewed and eligible for production activation
frozen: immutable “blessed” state; content cannot change (only supersede)
superseded: replaced by a newer artifact (lineage link required)
deprecated: not recommended for new use; may remain for historical replay

Allowed Transitions

draft → approved (review + approval)
approved → frozen (optional governance hardening)
draft → deprecated (abandon)
approved → deprecated (policy change)
frozen → deprecated (sunset)
draft → superseded (rare; only if you allow draft replacement)
approved → superseded
frozen → superseded

Disallowed Transitions (examples)

deprecated → approved (must create a new artifact instead)
superseded → approved (must create a new artifact instead)
Any transition that changes artifact body while keeping same zar_ref (impossible by invariant)
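The state machine can be written down as a small transition table. The sketch below assumes that any transition not listed under Allowed Transitions is rejected; the names `ALLOWED` and `can_transition` are illustrative.

```python
# Transition table copied from the Allowed Transitions list above.
ALLOWED = {
    "draft": {"approved", "deprecated", "superseded"},
    "approved": {"frozen", "deprecated", "superseded"},
    "frozen": {"deprecated", "superseded"},
    "superseded": set(),  # terminal: create a new artifact instead
    "deprecated": set(),  # terminal: create a new artifact instead
}


def can_transition(current: str, target: str) -> bool:
    """Return True only for transitions listed in the table."""
    return target in ALLOWED.get(current, set())
```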

Supersedence Rules

If an artifact becomes superseded, it must declare:

superseded_by: <zar_ref of successor>

and the successor must declare:

supersedes: <zar_ref of predecessor>

This makes lineage a verified graph, not a best-effort convention.
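A lineage check along these lines could verify that both halves of every link are present. The sketch assumes an in-memory mapping from ZAR ref to its lifecycle metadata; the function name and data shape are hypothetical.

```python
def lineage_errors(artifacts: dict) -> list:
    """Check that every superseded_by / supersedes link is declared on both sides."""
    errors = []
    for ref, meta in artifacts.items():
        successor = meta.get("superseded_by")
        if successor is not None:
            if successor not in artifacts:
                errors.append(f"{ref}: successor {successor} is not registered")
            elif artifacts[successor].get("supersedes") != ref:
                errors.append(f"{successor}: missing supersedes link to {ref}")
        predecessor = meta.get("supersedes")
        if predecessor is not None:
            if predecessor not in artifacts:
                errors.append(f"{ref}: predecessor {predecessor} is not registered")
            elif artifacts[predecessor].get("superseded_by") != ref:
                errors.append(f"{predecessor}: missing superseded_by link to {ref}")
    return errors
```

An empty list means the lineage graph is consistent; PLAN could fail on any non-empty result.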

System Diagram (Conceptual)

One-page conceptual view of the system (control plane vs deterministic execution vs storage):

Actors

Human (reviewer / engineer)
Docusaurus UI (Ruleset Generator)
Factory API (Cloudflare Worker / backend)
Storage (Git + S3)
Registries (Postgres)

Conceptual Blocks

UI / Control Plane

Collects high-level intent
Displays PLAN outputs (files, diffs, SQL preview, checklist)
Requires explicit approval to run APPLY

Factory Orchestrator

Validates intent
Resolves deterministic output paths from source_file / slug
Calls deterministic executors (canonicalize, hash, render, diff)
Produces plan + preflight report

Deterministic Executors

Schema validator
Canonicalizer
Hasher
Renderer (YAML output)
Diff generator
SQL planner (idempotent)

Persistence

Artifact bodies: Git repo (PR workflow) and/or S3
Registries/pointers: Postgres

Governance Gate

Enforces prod policies (approved required, audit required, etc.)
Blocks APPLY if policy fails

[Human]
  |
  v
[Docusaurus RulesetGenerator UI]
  |  (intent_input + ui_context)
  v
[Factory API / Orchestrator]
  |--> [Schema Validate]
  |--> [Canonicalize] --> [Hash] --> (ZAR refs)
  |--> [Render YAMLs + Diff]
  |--> [SQL Planner (idempotent)]
  |--> [Preflight Checklist]
  |
  v
[PLAN Output: plan.json + previews]
  |
  |  (approval)
  v
[APPLY]
  |--> Write artifacts (Git/S3)
  |--> Upsert registries (Postgres)
  |--> Update pointers LAST (Postgres)
  v
[Live state]

8. Component Interaction Sequence

PLAN flow (concise sequence)

UI sends POST /api/zar/factory/plan with:
- ui_context (source_file/slug/hub, env, actor)
- intent_input (meid, family, rulesets list, bundle config)
Factory:
- Validates request shape
- Resolves output folder deterministically from source_file
- Loads catalog/policy defaults (if applicable)
- Generates candidate artifact bodies (templates + intent merge)
- Validates against schemas
- Canonicalizes each artifact, computes sha256
- Produces proposed ZAR refs
- Produces SQL preview (UPSERTs)
- Produces preflight checklist (pass/warn/fail)
- Returns { ok, plan_ref, plan }
UI renders:
- Summary
- Files preview
- Diff
- SQL preview
- Checklist

APPLY flow (concise sequence)

UI requires the approval toggle to be set, and preflight must be clean (or satisfy the policy-defined threshold).
UI sends POST /api/zar/factory/apply with:
- plan_ref
- approval block (who/when/comment)
Factory:
- Re-loads the plan
- Re-validates invariants (no drift)
- Enforces governance gates (prod rules)
- Writes artifacts (Git/S3)
- Executes registry UPSERTs (transaction)
- Updates pointers last (transaction)
- Emits result { ok, written_files, registry_updates }

9. The ZAR Ruleset Factory “production line”

The ZAR Ruleset Factory never performs irreversible changes without first generating a fully reviewable ChangeSet.

PLAN produces a complete, replayable output (artifacts + hashes + SQL preview).
APPLY is explicit, gated, and auditable.

Core workflow

Phase A — PLAN (safe, reviewable)

Generate deterministic YAML artifacts (rulesets/bundles) into a staging workspace
Validate each artifact against schema (ruleset.v1, ruleset_bundle.v1, etc.)
Canonicalize each artifact using canonicalization_spec (JSON canonical form)
Compute content_hash (sha256) and show it
Produce a deterministic “registration plan” (SQL + JSON preview) that would:

insert/update ruleset_registry
insert/update ruleset_bundle_registry
update pointer tables (global/tenant/entity bundle pointers)
create zar_artifact_index entries (optional but recommended)

Output (PLAN artifacts):

changeset.yaml — structured manifest (machine-readable)
changeset.sql — idempotent SQL preview (UPSERT-based)
changeset.report.md — human-readable review report
YAML artifacts written to staging or associated-files hierarchy

Phase B — APPLY (go-live)

Require explicit approval (flag + user identity)
Execute SQL inside transaction(s)
Persist artifacts to the registered store (ZAR storage path) and/or commit via Git/PR workflow
Emit:

ZAR:RegisterArtifact event(s)
optional DAL entry

10. Factory Command Surface (simple, future-proof)

The Factory builds on existing ZAR contract artifacts (YAML definitions), such as:

ruleset-register-1_0_0.yaml
register-any_artifact-1_0_0.yaml
canonicalization default

The ZAR Ruleset Factory exposes these “commands”:

zar.ruleset.plan Inputs:

ruleset “intent” (what we want: tag detection, reconciliation, thresholds, etc.)
target MEID
version bump policy (PATCH/MINOR/MAJOR)
desired strict_mode or enforcement defaults

Outputs:

YAML ruleset file(s)
computed canonical hash(es)
proposed ZAR refs
SQL plan file(s)
a review report

zar.bundle.plan Inputs:

bundle name
applies_to_meid
strict_mode
ruleset list (by ref or by local YAML paths)

Outputs:

YAML bundle file
computed hash, ZAR ref
SQL plan + review report

zar.apply Inputs:

changeset.yaml (immutable PLAN output)
explicit approval identity (approved_by, timestamp, comment)

Outputs:

DB updates applied
registry rows inserted/updated
“activation pointers” updated (global/tenant/entity)
go-live report

11. Governance Guardrails (Non-Negotiable)

These are the non-negotiable guardrails:

Review gates (must pass before apply)

✅ Schema validation (JSON Schema draft 2020-12)
✅ Canonicalization (using canonicalization_spec)
✅ Hash computed matches the canonical form
✅ SQL plan is idempotent (UPSERT / merge pattern)
✅ Target tables match table_registry/signal_registry expectations
✅ “diff report” includes:
- new vs supersedes
- execution order changes
- enforcement_mode changes
- strict_mode changes
- compatibility bounds changes

Operational safety

APPLY runs inside a transaction
Activation pointer updates execute last
Any failure triggers full rollback (no partial activation)

12. Artifact Storage Model (Docs-Aligned, Deterministic)

Ruleset and bundle YAML artifacts are stored beside their corresponding MDX documentation via the associated-files hierarchy.

Example:

code/associated-files/<doc-path>/rulesets/acct-crawler-tag-detection-1_0_0.yaml

This ensures:

Documentation and executable artifacts remain structurally aligned
Snippet components render the exact YAML that is registered
The Factory resolves output paths deterministically from source_file
No manual folder selection is required in the UI

The Factory derives the output folder from the MDX source_file or slug, guaranteeing that artifact location is predictable and reproducible.

Documentation and execution remain in sync by design.

13. Deterministic Registry Auto-Generation

Registry rows are not authored manually.

They are deterministically derived from the canonicalized artifact content.

Because the ruleset YAML already contains:

zar.artifact_name
zar.applies_to_meid
zrr.crid
zrr.domain
zrr.rule_type
zrr.severity
zrr.linked_signal_ids
zrr.linked_frameworks
zrr.enforcement_mode
zrr.fallback_logic
zrr.ontology_binding
lifecycle.created_by
lifecycle.created_at
content_hash
schema_ref

The ZAR Ruleset Factory can deterministically construct:

ruleset_registry row
ruleset_bundle_registry row
zar_artifact_index row(s)

No registry data is handwritten.
The YAML artifact is the single source of truth.

Because filenames are deterministic, the Factory can also generate stable documentation links and reverse references automatically.
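To make the derivation concrete, the following sketch maps a parsed artifact onto a registry row. The nested layout is inferred from the dotted field names above, and the flat row keys are assumptions; the real registry columns may differ.

```python
def ruleset_registry_row(artifact: dict) -> dict:
    """Derive a registry row purely from artifact content (no hand-written values)."""
    zar, zrr, lifecycle = artifact["zar"], artifact["zrr"], artifact["lifecycle"]
    return {
        "artifact_name": zar["artifact_name"],
        "applies_to_meid": zar["applies_to_meid"],
        "crid": zrr["crid"],
        "domain": zrr["domain"],
        "rule_type": zrr["rule_type"],
        "severity": zrr["severity"],
        "enforcement_mode": zrr["enforcement_mode"],
        "created_by": lifecycle["created_by"],
        "created_at": lifecycle["created_at"],
        "content_hash": artifact["content_hash"],
        "schema_ref": artifact["schema_ref"],
    }
```

Because the function reads only from the artifact, running it twice on the same content always yields the same row.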

14. Extending the Production Line Pattern (FOGE, Zara, Registries)

The Ruleset Factory establishes a reusable production-line pattern:

Intent → Canonical Dataset
Deterministic Rendering
PLAN (reviewable ChangeSet)
APPLY (transactional activation)

This same pattern applies to:

table_registry
signal_registry
framework metric datasets (FOGE)
Zara requirement mappings
validation rule catalogs

The scalable approach is:

Agents populate a canonical JSON dataset
A deterministic renderer produces structured outputs (JSON, YAML, Excel, etc.)
Registry updaters generate:
- SQL preview plans
- diff reports
- idempotent upsert scripts
- activation manifests

The core principle remains the same:

Canonical data → deterministic artifacts → reviewable ChangeSet → gated activation.

We build the pipeline once and reuse it across all governed registries.

15. Required Capabilities of the Ruleset Factory

To function as a governed production system, the Ruleset Factory must provide:

Strict PLAN / APPLY separation
Deterministic artifact rendering into associated-files
Canonicalization + hash-based identity
Idempotent, reviewable SQL generation
Support for:
- creating new rulesets
- superseding existing rulesets (lineage-aware)
- creating and updating bundles

Lineage management is first-class: superseding a ruleset must preserve traceability and historical integrity.

The Factory is not a YAML editor.
It is a deterministic artifact production system.

16. Intent-Driven Authoring (Not Hand-Written YAML)

Inputs to the Factory must be high-level intent — not fully authored YAML artifacts.

Example intent:

example-intent.yaml
intent:
  engine: MEID_ACCT_CRAWLER
  ruleset_kind: reconciliation_policy
  tolerances:
    abs_tol: 5.0
    rel_tol: 0.0001
  enforcement_mode: blocking
  created_by: governance

The Factory renders the complete YAML artifact using the standard contract structure, including:

canonical header
CRID generation
schema reference
lifecycle metadata
compatibility bounds

This prevents:

manual drift
inconsistent header fields
CRID/version policy violations
schema non-compliance

Intent expresses what the rule should do. The Factory determines how it is structured and governed.

17. Operational Flow

17.1. PLAN Review Model

plan.json is a deterministic output of PLAN and must be treated as read-only.

If something is incorrect, the workflow is:

Edit the Intent (or adjust advanced YAML preview if permitted)
Re-run PLAN
Review the new ChangeSet
APPROVE and APPLY

PLAN output is replayable and must never be manually edited.

The UI provides:

“Edit Intent”
“Re-run PLAN”
“Approve”
“Run APPLY”

17.2. Preflight Validation

After PLAN, the Factory executes:

zar.ruleset.factory.preflight.v1

Input:

plan.json

Output:

preflight_report.json

The report includes:

risk flags (e.g., blocking enforcement mode)
strict_mode mismatches
production approval requirements
drift detection against previous versions
policy violations (hard errors)
review focus checklist

The UI renders:

Warnings
Errors
A structured Review Checklist panel

APPLY is blocked if hard errors are present.
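A gate over the preflight report might look like the sketch below. It assumes the report exposes `errors` and `warnings` lists, as in the response contract later in this page; the function name and reason strings are illustrative.

```python
def apply_blocked_reasons(preflight: dict, approved: bool) -> list:
    """Return the reasons APPLY must be refused; an empty list means APPLY may run."""
    reasons = []
    errors = preflight.get("errors") or []
    if errors:
        reasons.append(f"{len(errors)} hard error(s) in preflight report")
    if not approved:
        reasons.append("explicit approval missing")
    return reasons
```

Warnings are deliberately not treated as blocking here; whether they should block is a policy decision the source leaves open.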

17.3. Deterministic Output Resolution

The output folder is derived — never user-specified.

Example:

MDX source_file: /computation-hub-calcs/micro-engines/tagged-accounting-crawler.mdx
Derived ruleset folder: /code/associated-files/computation-hub-calcs/micro-engines/tagged-accounting-crawler/rulesets/

The UI passes only:

slug
source_file
hub

The Factory resolves the artifact path deterministically.
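The mapping in the example above can be sketched as a pure function. The function name and the validation rules are assumptions; only the input and output paths come from the example.

```python
from pathlib import PurePosixPath


def ruleset_folder(source_file: str) -> str:
    """Derive the rulesets folder from an MDX source_file path."""
    path = PurePosixPath(source_file)
    if not path.is_absolute() or path.suffix != ".mdx" or ".." in path.parts:
        raise ValueError(f"unsupported source_file: {source_file!r}")
    # Drop the extension and the leading slash, then nest under associated-files.
    relative = path.with_suffix("").relative_to("/")
    return f"/code/associated-files/{relative.as_posix()}/rulesets/"
```

Since the result depends only on `source_file`, the UI never needs to send an output folder.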

17.4. Docusaurus UI Workflow

The UI follows a controlled sequence:

Intent Form (minimal, structured inputs)
Save Intent (optional)
Run PLAN
Review (multi-tab view):
- Summary
- Files (artifact previews)
- Diffs
- Registry Writes
- SQL Preview
- Warnings / Errors (Preflight)
Explicit Approval
Run APPLY

17.5. Docusaurus UI

The clean UX pattern is:

Intent Form (simple fields)
Save → writes intent YAML
Submit → runs PLAN
Review page with Tabs:

Summary
Files (generated previews)
Diffs
Registry Writes
SQL Preview
Warnings/Errors (and Preflight Checklist)

Approve → runs APPLY

17.6. UI component contract

Below is a practical contract that keeps the UI simple and makes the factory deterministic.

UI → PLAN request contract (JSON)

plan-request-contract.json
{
  "action": "PLAN",
  "ui_context": {
    "source_doc": {
      "slug": "/micro-engines/tagged-accounting-crawler",
      "source_file": "/computation-hub-calcs/micro-engines/tagged-accounting-crawler.mdx",
      "hub": "computation-hub-calcs"
    },
    "actor": "pedersen@viroway.com",
    "env": "dev"
  },
  "intent_input": {
    "intent_id": "INTENT-ACCT-CRAWLER-DEFAULT-2026-02-20",
    "applies_to_meid": "MEID_ACCT_CRAWLER",
    "ruleset_family": "acct_crawler",
    "scope": {
      "target": "global",
      "tenant_id": null,
      "entity_id": null
    },
    "governance": {
      "created_by": "governance",
      "owners": ["cto@viroway.com"],
      "status": "draft",
      "changelog": "Initial default acct crawler rulesets"
    },
    "bundle": {
      "create_bundle": true,
      "bundle_name": "acct_crawler_default",
      "strict_mode": false,
      "allow_tenant_overrides": true,
      "execution_order": [
        "acct_crawler_tag_detection",
        "acct_crawler_classification",
        "acct_crawler_reconciliation_policy"
      ]
    },
    "rulesets": [
      {
        "artifact_name": "acct_crawler_tag_detection",
        "ruleset_kind": "tag_detection",
        "zrr": {
          "rule_type": "tagging",
          "domain": "finance",
          "framework": "global",
          "topic": "transition",
          "profile": "decarb",
          "severity": "WARNING",
          "enforcement_mode": "soft",
          "fallback_logic": "none",
          "linked_frameworks": ["GLOBAL"]
        },
        "compatibility": {
          "min_schema_ref": "ZAR:schema:canonical_gl_entry@v1",
          "max_schema_ref": "ZAR:schema:canonical_gl_entry@v1"
        },
        "rules": {
          "precedence": ["gl_attribute", "project_code", "cost_center"],
          "tag_sources": {
            "gl_attribute": { "field": "gl_attribute", "accepted_values": ["DECARB_CAPEX", "DECARB_OPEX", "DECARB"] },
            "project_code": { "field": "project_code", "patterns": [".*-DECARB-.*", "^ZYZ-DECARB-.*"] },
            "cost_center": { "field": "cost_center", "accepted_values": ["ESG-TRANSITION"] }
          },
          "thresholds": { "min_tagged_fields_present": 1 }
        }
      }
    ]
  }
}

Notes

UI does not send output folder. Factory derives it from source_file.
UI does not need to send CRID. Factory auto-generates if omitted.
UI may optionally send a “suggested_version” but factory computes bump suggestions.

PLAN → UI response contract

ui-response-contract.json
{
  "ok": true,
  "plan_ref": "zar-plan://PLAN-123",
  "plan": {
    "id": "PLAN-123",
    "steps": [],
    "metadata": {}
  },
  "preflight": {
    "ok": true,
    "warnings": [],
    "errors": [],
    "review_checklist": [
      "Verify enforcement_mode for reconciliation is blocking (integrity-layer only).",
      "Verify canonicalization spec ref is correct.",
      "Verify derived output path matches source_file base folder."
    ]
  }
}

Note: plan_ref identifies an immutable PLAN artifact.
The APPLY step must reference this exact identifier.

UI → APPLY request contract

apply-request-contract.json
{
  "action": "APPLY",
  "ui_context": {
    "actor": "...",
    "env": "...",
    "approval": {
      "approved": true,
      "approved_by": "...",
      "approved_at": "...",
      "comment": "...",
      "signature_type": "ui" 
    }
  },
  "plan_ref": "zar-plan://PLAN-123",
  "apply_options": {
    "write_files": true,
    "write_registry_rows": true,
    "write_platform_config": true,
    "dry_run": false
  }
}

APPLY → UI response contract

apply-ui-response-contract.json
{
  "ok": true,
  "result": {
    "written_files": [
      {
        "path": "/workspaces/zayaz-docs/code/associated-files/computation-hub-calcs/micro-engines/tagged-accounting-crawler/rulesets/acct-crawler-tag-detection-1_0_0.yaml",
        "zar_ref": "ZAR:ruleset:acct_crawler_tag_detection@sha256:..."
      }
    ],
    "registry_updates": {
      "ruleset_registry": 3,
      "ruleset_bundle_registry": 1,
      "platform_config": 1
    }
  }
}

17.7. React Docusaurus component: RulesetGenerator.tsx

This is a self-contained Docusaurus page/component that implements:

IntentForm (with auto-filled hidden fields)
Save Intent (optional)
Run PLAN
ReviewTabs (Summary / Files / Registry / SQL / Warnings)
Approve + Run APPLY

It assumes the backend exposes two endpoints (can be renamed later):

POST /api/zar/factory/plan → returns { ok, plan_ref, plan }
POST /api/zar/factory/apply → returns { ok, result }

The UI is orchestration-agnostic. The backend implementation (local engine, MCP tools, or API orchestration) can evolve without changing the UI contract.

18. Factory Capability Roster & Backend Architecture

The ZAR Ruleset Factory consists of:

A thin HTTP backend (control plane for UI and automation)
A deterministic execution layer (MCP tool server or equivalent modular engine)

The backend exposes stable contracts. The execution layer performs governed artifact production.

18.1. Capability Roster

The Factory is composed of clearly separated capability roles.

These roles may be implemented as MCP tools, modules, or internal services — but their responsibilities must remain explicit.

A. Intent & Specification Layer

These components interpret and normalize intent before artifact generation.

1. Intent Interpreter

Input: Structured intent (UI JSON or YAML)

Output: Normalized intent with derived fields

Responsibilities:

Fill deterministic defaults
Infer output paths from source_file
Resolve MEID, family, and ruleset kind
Enforce separation between user intent and governed structure

2. CRID Composer

Generates CRIDs according to policy:

<prefix>.<rule_type>.<domain>.<framework>.<topic>.<profile>.<SEVERITY>.<X_Y_Z>

Responsibilities:

Detect rule_type mismatch
Propose semantic version bump (PATCH / MINOR / MAJOR)
Validate naming consistency

CRID is policy-governed, not user-authored.
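A CRID composer following the pattern above could be sketched as follows. The prefix value is not specified in this document, so it is passed in as a parameter; the function names and the reset-on-bump behavior are assumptions based on common semantic-versioning practice.

```python
def compose_crid(prefix: str, zrr: dict, semver: str) -> str:
    """Join the CRID segments in the documented order; the version uses X_Y_Z form."""
    parts = [
        prefix,
        zrr["rule_type"],
        zrr["domain"],
        zrr["framework"],
        zrr["topic"],
        zrr["profile"],
        zrr["severity"].upper(),
        semver.replace(".", "_"),
    ]
    if any(not part or "." in part for part in parts):
        raise ValueError("CRID segments must be non-empty and must not contain '.'")
    return ".".join(parts)


def bump(semver: str, policy: str) -> str:
    """Propose the next semantic version for a PATCH / MINOR / MAJOR policy."""
    major, minor, patch = (int(x) for x in semver.split("."))
    if policy == "MAJOR":
        return f"{major + 1}.0.0"
    if policy == "MINOR":
        return f"{major}.{minor + 1}.0"
    if policy == "PATCH":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump policy: {policy!r}")
```

With the `zrr` block from the PLAN request example and a placeholder prefix of `example`, the composer produces a value ending in `.WARNING.1_0_0`.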

3. Ruleset Content Generator

Produces the rules section based on:

ruleset_kind
Engine profile (MEID)
Template library

It generates structured rule payloads but does not finalize identity or registry state.

4. Bundle Generator

Constructs ruleset_bundle artifacts:

Ensures execution_order integrity
Applies strict_mode and override policy
Validates bundle completeness

B. Governance & Validation Layer

These components enforce safety and policy.

5. Governance Gate

Enforces environment-specific constraints:

Production requires approved lifecycle state
strict_mode policy validation
Owner / approval requirements
Environment gating rules

APPLY is blocked until governance passes.

6. Preflight Validator (zar.ruleset.factory.preflight.v1)

Consumes plan.json and produces a structured preflight report.

Validations include:

Schema compliance
Canonicalization validity
Naming conventions
Compatibility ranges
CRID policy
Risk-level changes
Drift detection

Output:

Errors (blocking)
Warnings
Review checklist

C. Deterministic Execution Layer

These components must never invent content — they execute deterministically.

7. Canonicalizer / Hasher

Applies canonicalization specification
Computes SHA-256 hash
Ensures stable identity

Hash is computed only on canonical form.

8. Artifact Writer

Writes YAML artifacts into deterministic associated-files path
Produces file diff preview in PLAN
Commits or stages in APPLY

No user-specified paths allowed.

9. Registry Upserter

Derives registry state from canonical artifact content.

PLAN mode:

Generates idempotent SQL preview
Produces “would write” diff

APPLY mode:

Executes transactional upserts
Updates activation pointers last

10. ZAR Registrar

Constructs and validates ZAR references:

ZAR:ruleset:<name>@sha256:<hash>
ZAR:ruleset_bundle:<name>@sha256:<hash>

Ensures uniqueness and immutability guarantees.

11. Query & Listing Service

Supports:

Listing by MEID
Filtering by family / kind
Status filtering (draft / approved / frozen)
Tenant/entity override visibility

This powers the UI browse/search experience.

12. Export & Renderer

Provides formatted outputs:

CSV
XLSX
JSON
Structured previews

Reuses canonical registry rows as source of truth.

18.2. Backend & MCP Separation

Backend (Control Plane)

The HTTP backend exists to:

Authenticate requests
Enforce environment rules
Expose stable API contracts
Orchestrate execution tools

Example routes:

POST /api/zar/factory/plan
POST /api/zar/factory/apply
GET /api/zar/rulesets
GET /api/zar/bundles

The backend must remain thin.

It does not embed business logic.

MCP Tool Server (Execution Plane)

The MCP server hosts deterministic tools:

zar.ruleset.factory.plan.v1
zar.ruleset.factory.preflight.v1
zar.ruleset.factory.apply.v1
zar.ruleset.list.v1
zar.ruleset_bundle.factory.v1

These tools:

May be reused by CLI
May run in scheduled governance jobs
May power other registry pipelines (FOGE, Zara, etc.)

The backend calls MCP tools. MCP tools return structured results.

18.3. Architecture Overview

A. api.zayaz.io

Responsibilities:

Authentication (OIDC / internal tokens)
Request validation
Environment gating
Orchestration

Artifact storage options:

Git-based PR workflow
Object storage (S3)
Hybrid (object storage + registry DB)

B. Persistence Layer

PostgreSQL for:
- ruleset_registry
- ruleset_bundle_registry
- activation pointers
Object storage or Git for artifact bodies

Artifacts and registry rows are separable but traceable via content_hash.

18.4. Deployment Phases

Phase 1 — Local Development Mode

Local backend + proxy
Writes artifacts to repository
Generates PLAN + preflight
APPLY produces SQL preview only

No direct DB mutation.

Phase 2 — PR-Governed Activation

PLAN creates branch + commits artifacts
APPLY opens/updates PR
SQL migration/seed files included
Merge triggers deployment

This ensures review-before-activation discipline.

Phase 3 — Fully Governed Platform

Post-merge automation performs:

Schema validation
Canonicalization verification
ZAR registration
Registry upserts
Manifest regeneration

All operations remain auditable and deterministic.

Appendix — Repository Structure & Core Artifacts

This appendix defines the canonical repository layout for the ZAR Ruleset Factory and its governing schemas.

The structure is intentionally explicit to ensure:

Deterministic artifact discovery
Clear separation between contracts and generated artifacts
Stable schema versioning
Predictable integration with documentation

A.1. Repository Structure

A.1.1. Factory Core (Contract Definitions)

Primary Factory contracts and orchestration artifacts:

/code/zar-core/

ZAR-specific contract artifacts:

/code/zar-core/zar/

These files define:

Intent schema contracts
PLAN/APPLY contracts
Canonicalization specifications
Registry registration artifacts
Integrity reporting contracts

A.1.2. Ruleset & Bundle Artifacts

Executable rulesets and bundles are stored under:

/code/associated-files/<doc-derived-path>/rulesets/

The path is derived deterministically from the MDX source_file.

This guarantees:

Documentation and artifacts remain aligned
Snippet components render the registered YAML
No manual path configuration is required
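
The exact derivation rule is not spelled out in this appendix. The sketch below shows one plausible shape, assuming the doc-derived path is the MDX file's path relative to a docs root with the extension removed. The docs root name and the example source_file are hypothetical.

```python
from pathlib import PurePosixPath

ARTIFACT_ROOT = PurePosixPath("/code/associated-files")


def rulesets_dir(source_file: str, docs_root: str = "docs") -> str:
    """Map an MDX source_file to its rulesets directory (assumed rule)."""
    relative = PurePosixPath(source_file).relative_to(docs_root)
    doc_path = relative.with_suffix("")  # drop the .mdx extension
    return str(ARTIFACT_ROOT / doc_path / "rulesets") + "/"


# Hypothetical source file, used only to show the mapping.
EXAMPLE = rulesets_dir("docs/zar/ruleset-factory.mdx")
```

Whatever the real rule is, keeping it a pure function of source_file is what removes the need for manual path configuration.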

A.2. Core YAML Contracts

Location:

/code/zar-core/zar/

Key artifacts include:

intent-format-1_0_0.yaml
zar-ruleset-factory-plan-1_0_0.yaml
zar-ruleset-factory-apply-1_0_0.yaml

ruleset-plan-1_0_0.yaml
ruleset-apply-1_0_0.yaml

ruleset-bundle-plan-1_0_0.yaml
ruleset-bundle-apply-1_0_0.yaml

register-any_artifact-1_0_0.yaml
zar-canonicalization-default.v1.yaml

integrity_check_report-register-1_0_0.yaml
integrity_exception-register-1_0_0.yaml

Notes:

intent-format-1_0_0.yaml defines the high-level input contract.
zar-ruleset-factory-plan-1_0_0.yaml and zar-ruleset-factory-apply-1_0_0.yaml define the Factory control surface.
The canonicalization specification (zar-canonicalization-default.v1.yaml) ensures deterministic hashing.
Registration artifacts govern registry upserts.

These contracts are versioned and must remain backward-compatible within major versions.
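
A small sketch of how tooling might read the version out of a contract filename and check that two versions sit on the same major line. It assumes the name-MAJOR_MINOR_PATCH.yaml pattern used by most files above; names that do not follow it (such as zar-canonicalization-default.v1.yaml) are simply not parsed. Sharing a major version is only a precondition: it does not prove the contents are backward-compatible.

```python
import re
from typing import NamedTuple, Optional


class ArtifactName(NamedTuple):
    slug: str
    major: int
    minor: int
    patch: int


# slug, then -MAJOR_MINOR_PATCH.yaml; slugs may contain hyphens and underscores.
PATTERN = re.compile(
    r"^(?P<slug>[a-z0-9_]+(?:-[a-z0-9_]+)*)-(?P<major>\d+)_(?P<minor>\d+)_(?P<patch>\d+)\.yaml$"
)


def parse_name(filename: str) -> Optional[ArtifactName]:
    """Return the slug and version, or None if the name does not fit the pattern."""
    match = PATTERN.match(filename)
    if match is None:
        return None
    return ArtifactName(
        match["slug"], int(match["major"]), int(match["minor"]), int(match["patch"])
    )


def same_major_line(old: ArtifactName, new: ArtifactName) -> bool:
    """True when both names refer to the same contract and major version."""
    return old.slug == new.slug and old.major == new.major
```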

GitHub Reference

Core artifacts: https://github.com/Viroway/zayaz-docs/tree/main/code/zar-core/zar

A.3. Schema Definitions

Location:

/schemas/zar/

Schema files:

canonicalization_spec.v1.schema.json
integrity_check_report.v1.schema.json
integrity_exception.v1.schema.json
ruleset.v1.schema.json
ruleset_bundle.v1.schema.json
ruleset_factory_intent.v1.schema.json
ruleset_factory_plan.v1.schema.json
ruleset_factory_preflight_report.v1.schema.json

These schemas enforce:

Structural validity
Canonicalization compliance
PLAN output integrity
Preflight validation guarantees

Schema versions are explicit (v1) and must evolve via additive or versioned changes.

A.4. Intent Contract

intent-format-1_0_0.yaml contains everything required to generate:

One or more rulesets
Optional bundles
Registry rows
ZAR references

It is:

Diffable under review
Stable under canonicalization
Deterministic in output generation

Intent is the only human-authored input required for artifact production.

A.5. Schema References (Rendered)

See the GitHub ZAR core artifacts: https://github.com/Viroway/zayaz-docs/tree/main/code/zar-core/zar

Schemas:

Canonicalization Spec - Schema (v1): zar/canonicalization_spec.v1.schema.json

Integrity Check Report - Schema (v1): zar/integrity_check_report.v1.schema.json

Integrity Exception - Schema (v1): zar/integrity_exception.v1.schema.json

Ruleset - Schema (v1): zar/ruleset.v1.schema.json

Ruleset Bundle - Schema (v1): zar/ruleset_bundle.v1.schema.json

Ruleset Factory Intent - Schema (v1): zar/ruleset_factory_intent.v1.schema.json

The Ruleset Factory Plan schema is for the PLAN output. It’s intentionally “minimal but complete”: it captures everything needed to (a) review, (b) preflight, and (c) apply deterministically.

Ruleset Factory Plan - Schema (v1): zar/ruleset_factory_plan.v1.schema.json

The Ruleset Factory Preflight Report schema supports strict pass/warn/fail, includes fix hints, and renders cleanly in a UI.

Ruleset Factory Preflight Report - Schema (v1): zar/ruleset_factory_preflight_report.v1.schema.json

The Ruleset Factory Plan schema captures everything required for:

Deterministic review
Preflight validation
Replayable APPLY

The Preflight Report schema supports structured pass / warn / fail semantics and UI rendering.
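
The pass / warn / fail semantics can be sketched as a list of check results that rolls up to the most severe status. The field names (check_id, message, fix_hint) are hypothetical and are not taken from the schema file.

```python
from dataclasses import dataclass
from typing import List, Optional

STATUS_ORDER = ["pass", "warn", "fail"]  # least to most severe


@dataclass(frozen=True)
class CheckResult:
    check_id: str
    status: str
    message: str
    fix_hint: Optional[str] = None

    def __post_init__(self):
        # Strict: any status outside the three allowed values is rejected.
        if self.status not in STATUS_ORDER:
            raise ValueError(f"unknown status: {self.status!r}")


def overall_status(results: List[CheckResult]) -> str:
    """Roll up to the most severe status; an empty report counts as pass."""
    if not results:
        return "pass"
    return max((result.status for result in results), key=STATUS_ORDER.index)


def fix_hints(results: List[CheckResult]) -> List[str]:
    """Collect hints for every non-passing check, in report order."""
    return [r.fix_hint for r in results if r.status != "pass" and r.fix_hint]
```

A UI can render each CheckResult as a row and use overall_status to decide whether APPLY should be offered at all.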

AWS Ruleset Signature Setup

GitHub Action
     │
     │ OIDC
     ▼
IAM Role (zape-github-prod-signer)
     │
     │ kms:Sign
     ▼
AWS KMS (ECDSA key)
     │
     │ returns signature
     ▼
signRulesets.mjs
     │
     ▼
rulesets YAML updated with zar.signature
     │
     ▼
aws s3 sync → S3 bucket

The ruleset catalog generator verifies signatures using truststore public keys.

We have four security layers:

Layer	Purpose
IAM role	GitHub identity
OIDC trust policy	restrict repo/branch
KMS signing	artifact integrity
truststore verification	runtime validation
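
To make the "restrict repo/branch" layer concrete, the sketch below builds an IAM trust policy document for GitHub's OIDC provider, pinned to a single repository and branch. The account ID, repository, and branch are placeholders, and the actual policy attached to zape-github-prod-signer may differ.

```python
import json

GITHUB_OIDC_HOST = "token.actions.githubusercontent.com"


def trust_policy(account_id: str, repo: str, branch: str) -> dict:
    """Build a trust policy that lets only one repo/branch assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{GITHUB_OIDC_HOST}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Audience used by the standard AWS credentials action.
                        f"{GITHUB_OIDC_HOST}:aud": "sts.amazonaws.com",
                        # Subject claim pins the workflow to one repo and branch.
                        f"{GITHUB_OIDC_HOST}:sub": f"repo:{repo}:ref:refs/heads/{branch}",
                    }
                },
            }
        ],
    }


# Placeholder values, for illustration only.
POLICY = trust_policy("123456789012", "example-org/example-repo", "main")
POLICY_JSON = json.dumps(POLICY, indent=2)
```

Using StringEquals on the sub claim, rather than a wildcard match, means a workflow on any other branch or fork cannot obtain the signing role.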

