ZRR
ZAYAZ Ruleset Registry
1. Purpose and Positioning
The ZAYAZ Ruleset Registry (ZRR) formalizes every enforceable logic construct within the ZAYAZ platform into a first-class, versioned, auditable governance object.
ZRR transforms rules from implicit code fragments into explicit, declarative, traceable artifacts governed under the ZAR (ZAYAZ Artifact Registry) framework.
ZRR ensures that:
- Every validation rule is addressable.
- Every compliance rule is traceable to framework obligations.
- Every computation rule is version-controlled.
- Every enforcement action is auditable.
- Every rule lifecycle event is logged.
- Every rule is compatible with CMCB schema governance.
- Every rule execution can be explained to verifiers, regulators, and stakeholders.
ZRR is the enforcement abstraction layer between:
- Signals (SSSR)
- Ontology (USO)
- Engines (MICE)
- Dispatcher (ZSSR)
- Validation Layer (DICE / DaVE / VTE)
- Audit Layer (ALTD / DAL)
- AI Governance Systems
ZRR enables ZAYAZ to operate as a declarative ESG logic platform — not merely a SaaS interface.
2. Canonical Rule Identifier (CRID) Architecture
Every rule registered in ZRR must possess a Canonical Rule Identifier (CRID).
CRID Format
<crid_prefix>.<rule_type>.<domain>.<framework>.<topic>.<profile>.<SEVERITY>.<X_Y_Z>
Validation regex:

^ruleset\.(validation|computation|transformation|aggregation|classification|tagging|governance|risk_mapping|ai_assisted)\.[a-z0-9][a-z0-9_-]{0,39}\.[a-z0-9][a-z0-9_-]{0,39}\.[a-z0-9][a-z0-9_-]{0,79}\.[a-z0-9][a-z0-9_-]{0,79}\.(INFO|WARNING|CRITICAL|BLOCKING)\.(0|[1-9][0-9]*)_(0|[1-9][0-9]*)_(0|[1-9][0-9]*)$
Examples
ruleset.validation.ghg.esrs.e1-6.standard.CRITICAL.1_0_0
ruleset.computation.ghg.global.scope3-category4.standard.BLOCKING.2_1_0
ruleset.governance.gov.csrd.g1-4.policy-required.CRITICAL.1_0_0
ruleset.transformation.meta.global.unit-conversion.standard.INFO.3_0_2
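The normative pattern can be enforced mechanically. A minimal sketch in Python; the regex is the pattern from this section, copied verbatim but split across lines for readability:

```python
import re

# Normative CRID pattern from the ZRR specification (split for readability).
CRID_PATTERN = re.compile(
    r"^ruleset\.(validation|computation|transformation|aggregation|classification"
    r"|tagging|governance|risk_mapping|ai_assisted)"
    r"\.[a-z0-9][a-z0-9_-]{0,39}"          # domain
    r"\.[a-z0-9][a-z0-9_-]{0,39}"          # framework
    r"\.[a-z0-9][a-z0-9_-]{0,79}"          # topic
    r"\.[a-z0-9][a-z0-9_-]{0,79}"          # profile
    r"\.(INFO|WARNING|CRITICAL|BLOCKING)"  # severity (UPPERCASE)
    r"\.(0|[1-9][0-9]*)_(0|[1-9][0-9]*)_(0|[1-9][0-9]*)$"  # X_Y_Z version
)

def is_valid_crid(crid: str) -> bool:
    """Return True if the string is a well-formed CRID."""
    return CRID_PATTERN.match(crid) is not None
```

All four examples above pass this check; a lowercase severity or a zero-padded version segment is rejected.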
CRID Components
| Component | Description |
|---|---|
| crid_prefix | Fixed prefix identifying the CRID namespace. Must be `ruleset` for ruleset artifacts. |
| rule_type | One of: validation, computation, transformation, aggregation, classification, tagging, governance, risk_mapping, ai_assisted. Must match zrr.rule_type. |
| domain | Domain code (lowercase) such as ghg, finance, gov, meta, waste, etc. |
| framework | Framework namespace such as esrs, csrd, gri, global, iso14001. |
| topic | The subject identifier: a clause, metric family, or logical object such as e1-6, scope3-category4, g1-4, unit-conversion. |
| profile | Variant/profile of the ruleset logic (e.g., standard, policy-required, strict, tenant-acme, sector-mining). |
| SEVERITY | Enforcement severity enum: INFO, WARNING, CRITICAL, BLOCKING (UPPERCASE). |
| X_Y_Z | Ruleset semantic version encoded as major_minor_patch (e.g., 1_0_0). |
CRIDs are immutable.
A new version always generates a new CRID.
3. Rule Classification Model
Every rule must be classified into one of the following categories:
| Ruleset Type | Description |
|---|---|
| Validation | Range checks, completeness checks, structural validation |
| Computation | Mathematical transformations, derived metrics |
| Transformation | Unit conversion, structural normalization |
| Aggregation | Multi-signal summarization |
| Classification | Categorical assignment of records or signals (e.g., capex/opex) |
| Tagging | Tag detection and tag assignment rules |
| Governance | Policy requirement enforcement |
| Risk Mapping | Risk categorization and threshold enforcement |
| AI-Assisted | AI-based inference or extrapolation rules |
Each ruleset must define:
- Enforcement Mode
- Severity Level
- Execution Engine Binding
- Fallback Logic
- Audit Requirements
4. ZRR Core Schema (Logical Model)
ZRR is implemented as a formal registry object under ZAR.
ruleset_registry
| Column | Type | Description |
|---|---|---|
| zar_ruleset_ref | text (PK) | `ZAR:ruleset:<artifact_name>@sha256:<hash>` immutable id |
| rule_id | text | CRID (canonical rule identifier), from zrr.crid |
| rule_name | text | Human title; recommend adding zrr.title |
| scope | text | `global` / tenant / entity scoping |
| tenant_id | text null | tenant scope id |
| entity_id | text null | entity scope id |
| domain | text | from zrr.domain |
| ruleset_kind | text | e.g. `tag_detection` / classification / reconciliation |
| rule_type | text | from zrr.rule_type (validation/computation/…) |
| linked_signal_ids | jsonb | array of signal ids |
| linked_frameworks | jsonb | array of frameworks |
| severity_level | text | INFO/WARNING/CRITICAL/BLOCKING |
| enforcement_mode | text | advisory/soft/hard/blocking/audit_only |
| execution_engine | text | MEID that executes it (zrr.execution_engine) |
| fallback_logic | text | none/manual_escalation/… |
| ontology_binding | text null | reference (USO etc.) |
| schema_min_ref | text null | from compatibility.min_schema_ref |
| schema_max_ref | text null | from compatibility.max_schema_ref |
| version | text | semver (explicit), or derived from CRID tail |
| lifecycle_status | text | draft/approved/frozen/superseded/deprecated |
| audit_required | boolean | from policy/constraints |
| zar_content_hash | text | sha256 hash (redundant but useful) |
| created_by | text | system/governance/admin |
| created_at | timestamptz | artifact lifecycle created time |
| updated_at | timestamptz | registry update timestamp |
ZRR is part of the ZAR Artifact Layer and inherits:
- Version lineage tracking
- DAL registration
- Federation readiness
- Assurance cloud propagation
Field mapping: YAML → ruleset_registry
| ruleset_registry field | Source |
|---|---|
| rule_id | zrr.crid (`ruleset.…`) |
| rule_name | zar.artifact_name, or add zrr.title (recommended) |
| domain | zrr.domain |
| rule_type | zrr.rule_type (validation/computation/classification/tagging/…) |
| linked_signal_ids | zrr.linked_signal_ids |
| linked_frameworks | zrr.linked_frameworks |
| severity_level | zrr.severity |
| execution_engine | zrr.execution_engine |
| schema_min_ref / schema_max_ref | compatibility.min_schema_ref + compatibility.max_schema_ref (store both) |
| version | derived from CRID tail X_Y_Z (or stored explicitly as zrr.version if preferred) |
| lifecycle_status | lifecycle.status (if a legacy active_status view is needed: approved→active, draft→experimental, deprecated→deprecated) |
| created_at | lifecycle.created_at |
| updated_at | set by registry service on ingest/update |
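The mapping above can be implemented as a small projection function. A sketch, assuming the ruleset YAML has already been parsed into a dict with `zar`, `zrr`, `compatibility`, and `lifecycle` blocks; `zrr.title` is the recommended optional field, and the helper name is illustrative:

```python
def project_ruleset_row(doc: dict) -> dict:
    """Project a parsed ruleset YAML document onto ruleset_registry columns."""
    crid = doc["zrr"]["crid"]
    compat = doc.get("compatibility", {})
    lifecycle = doc.get("lifecycle", {})
    return {
        "rule_id": crid,
        "rule_name": doc["zrr"].get("title") or doc["zar"]["artifact_name"],
        "domain": doc["zrr"]["domain"],
        "rule_type": doc["zrr"]["rule_type"],
        "linked_signal_ids": doc["zrr"].get("linked_signal_ids", []),
        "linked_frameworks": doc["zrr"].get("linked_frameworks", []),
        "severity_level": doc["zrr"]["severity"],
        "execution_engine": doc["zrr"]["execution_engine"],
        "schema_min_ref": compat.get("min_schema_ref"),
        "schema_max_ref": compat.get("max_schema_ref"),
        # version derived from the CRID tail X_Y_Z -> X.Y.Z
        "version": crid.rsplit(".", 1)[-1].replace("_", "."),
        "lifecycle_status": lifecycle.get("status", "draft"),
        "created_at": lifecycle.get("created_at"),
    }
```

The registry service would set `updated_at` itself on ingest, per the table above.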
5. Rule Binding Architecture
Every rule must bind to at least one of the following:
- Signal (`signal_id`)
- Framework obligation
- Ontology node (USO)
- Micro Engine (`MEID`)
- Validation engine (DICE / DaVE)
- Agent profile (optional)
- AI risk tier (if applicable)
This creates full traceability across:
Signal → Rule → Engine → Enforcement → Audit Log → Assurance Ledger
No rule may execute outside this binding model.
6. Rule Execution Logging
All rule executions must generate a Rule Execution Event.
rule_execution_log (event table)
| Column | Type | Description |
|---|---|---|
| execution_id | uuid (PK) | unique execution event |
| zar_ruleset_ref | text | exact artifact executed |
| rule_id | text | CRID |
| meid | text | engine id that executed |
| signal_id | text null | if single-signal event |
| signal_ids | jsonb null | if multi-signal event |
| execution_timestamp | timestamptz | UTC |
| execution_result | text | pass/fail/warn |
| enforcement_action_taken | text | none/notify/block/escalate |
| confidence_score | double precision null | 0–1 for AI-assisted |
| override_flag | boolean | whether an override applied |
| overridden_by | text null | identity |
| audit_hash | text null | integrity anchor |
Execution logs are stored in:
- ALTD (Audit Logging & Tamper Detection)
- DAL (Digital Assurance Ledger)
- Federation export (if applicable)
Rule logs are immutable.
7. Integrity Governance Model
7.1. Scope and Object of Status
Integrity status applies at three levels (all three can exist):
- Job Integrity (`job_integrity_status`)
- Dataset Integrity (`dataset_integrity_status`) — e.g., canonical GL set for a period/entity
- Artifact Integrity (`artifact_integrity_status`) — e.g., `transition_project_cost_profile`
Normative Rule:
Only dataset_integrity_status controls regulatory reporting eligibility.
Artifact-level integrity may be stricter but cannot override dataset eligibility.
7.2. Canonical Status Enum
7.2.1. Primary states (normative)
| State | Meaning | Reporting eligible? |
|---|---|---|
| PENDING | Job started; integrity not assessed yet | ❌ |
| IN_PROGRESS | Canonicalization or checks running | ❌ |
| PASSED | All blocking and hard checks satisfied | ✅ |
| PASSED_WITH_WARNINGS | Passed but has warnings (soft/hard warnings below thresholds) | ✅ (flagged) |
| FAILED | Integrity checks failed beyond tolerance / blocking rules triggered | ❌ |
| PASSED_WITH_EXCEPTION | Failed originally, but a governed exception is approved and attached | ✅ (flagged + exception) |
| REVOKED | Previously passed/exceptioned status invalidated (ruleset change, exception expiry, data change) | ❌ |
7.2.2. Optional operational states (if useful)
- CANCELLED (job cancelled)
- ERROR (system/runtime error; not a data-integrity conclusion)
7.3. Deterministic Inputs to State Transitions
Integrity status is computed from a set of check results produced by rulesets:
- Tag detection checks (`MEID_ACCT_CRAWLER` tag ruleset)
- Classification checks (`MEID_ACCT_CRAWLER` classification ruleset)
- Reconciliation checks (`MEID_ACCT_CRAWLER` reconciliation ruleset)
Each check yields:
- `result`: PASS | WARN | FAIL
- `enforcement_mode`: advisory | soft | hard | blocking
- `metrics`: diffs, counts, ratios
- `crid`
- `ruleset_ref`
- `timestamp`
7.4. Transition Rules
7.4.1. Start / execution
- `PENDING → IN_PROGRESS`. Trigger: JobStarted.
- `IN_PROGRESS → FAILED`. Trigger: any blocking check FAIL and no valid exception exists.
- `IN_PROGRESS → PASSED_WITH_WARNINGS`. Trigger: no blocking FAILs, but one or more soft/hard warnings exist.
- `IN_PROGRESS → PASSED`. Trigger: all checks PASS (or only advisory warnings).
7.4.2. Exception handling (governed override)
`FAILED → PASSED_WITH_EXCEPTION`. Trigger: IntegrityExceptionApproved event with:
- matching `job_id`
- matching `failed_rule_crid`
- matching `ruleset_ref` (or compatible lineage rule)
- exception not expired

`PASSED_WITH_WARNINGS → PASSED_WITH_EXCEPTION` (rare).
Trigger: exception applied to a warning that is treated as blocking for reporting in a specific context (e.g., an EU filing requiring strict mode). Optional; only relevant if/when we support "strict compliance mode."
7.4.3. Revocation (critical for replay + audit)
`PASSED → REVOKED`. Triggers (any):
- dataset content hash changed (re-ingest, re-canonicalize)
- `ruleset_ref` changed for integrity-critical rulesets (reconciliation/classification)
- schema compatibility boundary violated
- manual tamper detection / DAL violation
Revocation does not mutate historical artifacts. It creates a new integrity evaluation event referencing the same dataset with updated governance context.
`PASSED_WITH_WARNINGS → REVOKED`. Same triggers as above.

`PASSED_WITH_EXCEPTION → REVOKED`. Triggers (any):
- exception expired
- exception deprecated/revoked
- job/dataset hash changed
- referenced `ruleset_ref` changed
- exception evidence invalidated (optional governance action)
7.4.4. Re-run / recomputation
`REVOKED → IN_PROGRESS`. Trigger: job replay/recompute requested using a specific `ruleset_ref` set.
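The transition rules in this section can be collapsed into an explicit transition table. A minimal sketch; the guard conditions (blocking FAILs, valid exceptions, hash changes) are evaluated by the caller before requesting a transition:

```python
# Allowed integrity-status transitions per §7.4 (target states per source state).
ALLOWED_TRANSITIONS: dict[str, set[str]] = {
    "PENDING": {"IN_PROGRESS"},
    "IN_PROGRESS": {"FAILED", "PASSED_WITH_WARNINGS", "PASSED"},
    "FAILED": {"PASSED_WITH_EXCEPTION"},
    "PASSED_WITH_WARNINGS": {"PASSED_WITH_EXCEPTION", "REVOKED"},
    "PASSED": {"REVOKED"},
    "PASSED_WITH_EXCEPTION": {"REVOKED"},
    "REVOKED": {"IN_PROGRESS"},  # replay / recompute
}

def transition(current: str, target: str) -> str:
    """Return the new state, or raise if the transition is not governed."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {target}")
    return target
```

Encoding the table explicitly keeps ungoverned shortcuts (e.g., `FAILED → PASSED`) structurally impossible.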
7.5. enforcement_mode
7.5.1. What “enforcement_mode” Actually Controls
enforcement_mode defines what happens when a ruleset determines a failure condition. The key question is: what exactly gets blocked?
There are three different “layers” where blocking can apply:
| Layer | What gets blocked |
|---|---|
| Engine layer | The microservice job execution |
| Artifact layer | Publication of output artifacts |
| Integrity layer | “Integrity = PASSED” status |
These are very different in terms of operational impact.
The Four Modes — Operational Meaning
🟢 advisory
- Rule failure logged
- No flags
- No status change
- No UI impact
Use when:
- Low-risk signals
- Informational rules
🟡 soft
- Rule failure logged
- Integrity flag attached
- Warning surfaced in UI/API
- Artifacts still published
Use when:
- Data quality issue
- Not materially misleading
- CFO review recommended
🟠 hard
- Rule failure logged
- Integrity status downgraded
- Must acknowledge or override
- Artifacts published but marked non-compliant
Use when:
- Financial classification uncertainty
- High unknown ratio
- Partial reconciliation
This is typically the sweet spot for most finance policy rules.
🔴 blocking
- Rule failure logged
- Artifact cannot be marked “valid”
- Either:
- Prevent publishing entirely
- OR publish but status = INVALID and not eligible for downstream reporting
- Requires manual resolution before progressing workflow
Use when:
- Ledger integrity failure
- Double counting risk
- Currency mismatch without FX policy
- Schema incompatibility
- Reconciliation beyond tolerance
| Ruleset | enforcement_mode | Interpretation |
|---|---|---|
| Tag detection | soft | Flag if tagging weak |
| Classification | hard | Must review if unknown high |
| Reconciliation | blocking | Block integrity_passed status |
Blocking in the ZRR means:
“Prevents dataset from achieving integrity_passed status and from being used in compliance-critical outputs until resolved or formally overridden.”
7.5.2. Integrity Exception Framework
We treat overrides as Governance Events. An override must:
- Never delete the original failure
- Never change the ruleset
- Never mutate the underlying data
- Create a separate, immutable governance artifact
Override = “accept deviation with documented justification”
An override does not convert a FAIL into PASS. It produces a distinct state:
PASSED_WITH_EXCEPTION.
The following artifact is a first-class governance object.
ZAR:integrity_exception:<entity>:<job_id>@sha256:<hash>
YAML name:
integrity-exception--<job_id>--<ruleset_family>--1_0_0.yaml
Artifact Index Table: zar_artifact_index
-- ZAR Artifact Index (minimal)
-- Purpose: fast lookup of ANY ZAR artifact by ref, and filtering by tenant/entity/job/type.
CREATE TABLE zar_artifact_index (
zar_ref TEXT PRIMARY KEY, -- e.g. ZAR:ruleset:acct_crawler_tag_detection@sha256:...
artifact_type TEXT NOT NULL, -- ruleset | schema | integrity_check_report | integrity_exception | dataset | event | ...
artifact_name TEXT NOT NULL, -- human/stable name, not necessarily unique globally
content_hash TEXT NOT NULL, -- sha256:<64hex>
schema_ref TEXT NOT NULL, -- e.g. ZAR:schema:integrity_check_report@v1
tenant_id TEXT NULL, -- null => global artifact
entity_id TEXT NULL, -- null => not entity-scoped
job_id TEXT NULL, -- populated for execution artifacts
run_id TEXT NULL, -- retry/attempt id if applicable
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
created_by TEXT NULL, -- system|governance|admin|user OR user id/email
lifecycle_status TEXT NULL, -- draft|approved|deprecated (mainly for rulesets/schemas); null for immutable outputs
storage_uri TEXT NOT NULL, -- canonical storage location (S3/GCS/file/DB blob ref)
size_bytes BIGINT NULL,
mime_type TEXT NULL, -- application/yaml, application/json, etc.
metadata JSONB NOT NULL DEFAULT '{}'::jsonb -- optional indexing hints, non-authoritative
);
-- Useful indexes
CREATE INDEX idx_zar_artifact_type ON zar_artifact_index (artifact_type);
CREATE INDEX idx_zar_tenant_entity ON zar_artifact_index (tenant_id, entity_id);
CREATE INDEX idx_zar_job_run ON zar_artifact_index (job_id, run_id);
CREATE INDEX idx_zar_hash ON zar_artifact_index (content_hash);
7.5.3. Minimum required fields for an override:
zar:
artifact_type: integrity_exception
artifact_name: integrity_exception_JOB-XYZ-123
applies_to_meid: MEID_ACCT_CRAWLER
zrr:
crid: ruleset.governance.finance.global.reconciliation-override.standard.CRITICAL.1_0_0
rule_type: governance
domain: finance
severity: CRITICAL
execution_engine: MEID_ACCT_CRAWLER
enforcement_mode: blocking
audit_required: true
lifecycle:
status: approved
created_by: governance
owners:
- finance.controller@client.com
approved_by:
- finance.controller@client.com
created_at: 2026-03-02T12:34:00Z
supersedes: null
deprecated_by: null
changelog: "Approved reconciliation exception for timing difference."
exception_scope:
level: job
reporting_eligibility_restored: true
context:
exception_id: INT-EXC-2026-0001
job_id: JOB-XYZ-123
ruleset_ref: ZAR:ruleset:acct_crawler_reconciliation_policy@sha256:abc
failed_rule_crid: ruleset.validation.finance.global.reconciliation.standard.critical.1_0_0
failure_snapshot:
entity_diff_abs: 12.45
entity_diff_rel: 0.0008
failed_accounts:
- "6100"
- "6200"
justification: >
Timing difference due to late journal posting.
Verified against closing Trial Balance extract.
supporting_evidence_refs:
- ledger_export_2026_01_v2.pdf
- tb_reconciliation_workpaper.xlsx
risk_assessment: immaterial
expiry_policy:
scope: reporting_year
valid_for: 2026
This ensures:
- Full traceability
- Snapshot of failure context
- Evidence attachment
- Role-based approval
- Time-bound validity
7.5.4. Runtime Logic After Override
If exception exists and is valid:
- `integrity_status = PASSED_WITH_EXCEPTION`
- `reporting_eligible = true`
- `audit_flag = true`
- `override_flag = true`
Downstream artifacts must include:
"integrity_status": "PASSED_WITH_EXCEPTION",
"exception_ref": "ZAR:integrity_exception:entity@sha256:xyz"
Auditor can then inspect the exception object.
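The validity conditions from §7.4.2 can be sketched as a predicate over the integrity_exception artifact. Field paths follow the YAML example above; treating expiry as a reporting-year match is an assumption based on the `expiry_policy` shown:

```python
def exception_is_valid(exception: dict, job_id: str, failed_rule_crid: str,
                       ruleset_ref: str, reporting_year: int) -> bool:
    """Check an integrity_exception artifact against the §7.4.2 conditions."""
    ctx = exception["context"]
    expiry = exception["expiry_policy"]
    return (
        exception["lifecycle"]["status"] == "approved"
        and ctx["job_id"] == job_id
        and ctx["failed_rule_crid"] == failed_rule_crid
        and ctx["ruleset_ref"] == ruleset_ref
        # expiry_policy with scope=reporting_year: valid only for that year
        and expiry["scope"] == "reporting_year"
        and expiry["valid_for"] == reporting_year
    )
```

Only if this predicate holds does the runtime set `PASSED_WITH_EXCEPTION`; otherwise the original FAILED status stands.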
7.5.5. Auditor-Safe Design Principles
To satisfy assurance requirements:
- Override Must Be Additive
Original failure event remains immutable.
- Override Must Be Linked to Specific Job + Ruleset
If ruleset changes → override invalid.
- Override Must Expire
Cannot carry forward silently into next year.
- Override Must Be Role-Restricted
Only:
- Controller
- CFO
- Designated governance role
- Override Must Be Logged in DAL
Every override creates:
- RuleExecutionEvent (override=true)
- DAL hash entry
- Governance event log
7.5.6. Safe “Fix” vs “Override”
There are two types of resolution:
A. Data Fix (Preferred)
- Correct connector config
- Re-run extraction
- Re-run reconciliation
- Pass legitimately
No override needed.
B. Accepted Deviation (Controlled Exception)
Used only when:
- Timing differences
- FX rounding immaterial
- Known ERP quirks
- Minor TB extraction limitations
This is where exception artifact applies.
7.5.7. UX Design (Important)
When reconciliation fails, UI must show:
Integrity Panel
- Difference amount
- Threshold
- Failed accounts
- Root cause category suggestion
Options:
- Fix data and re-run
- Adjust ruleset (requires governance approval)
- Create exception (requires controller approval)
This ensures:
✔ Structured workflow ✔ No shadow overrides ✔ No hidden toggles
7.5.8. Strategically Powerful
With this system, ZAYAZ can say:
“All integrity deviations are explicitly governed, time-bound, evidence-backed, and auditable.”
- That is exactly what auditors want.
- We are not suppressing issues.
- We are documenting them formally.
7.5.9. Should Overrides Modify Tolerances?
No.
Never mutate ruleset automatically.
If tolerance is wrong:
- Create new ruleset version
- Activate new version
- Re-run job
That maintains replay safety.
7.6. State Machine Diagram (textual)
PENDING
└──(JobStarted)──> IN_PROGRESS
├──(IntegrityEvaluated: Blocking FAIL, no exception)──> FAILED
│ └──(IntegrityExceptionApproved & valid)──> PASSED_WITH_EXCEPTION
├──(IntegrityEvaluated: No blocking FAIL, warnings exist)──> PASSED_WITH_WARNINGS
└──(IntegrityEvaluated: All PASS)──> PASSED
PASSED ──(dataset_hash change OR bundle_ref/ruleset_ref change OR schema_ref incompatibility)──> REVOKED
PASSED_WITH_WARNINGS ──(dataset_hash change OR bundle_ref/ruleset_ref change OR schema_ref incompatibility)──> REVOKED
PASSED_WITH_EXCEPTION ──(exception expiry OR exception revoked OR dataset_hash/bundle_ref change)──> REVOKED
REVOKED ──(Replay/Recompute requested)──> IN_PROGRESS
Pipeline alignment:
JobStarted
↓
Canonicalized
↓
Reconciled
↓
IntegrityEvaluated
↓
Published (publish_allowed may be true even if FAILED)
> Key: we don’t “block artifact generation”; we block **reporting eligibility** (integrity_passed/reporting_eligible).
7.7. Integrity “Blocking” Semantics
Blocking prevents a dataset from reaching PASSED status and from being marked reporting_eligible = true. Blocking does not prevent artifact generation unless a system-level ERROR occurs.
The formal rule is:
- Artifact generation: always allowed (unless system error)
- Reporting eligibility: depends on integrity status
Reporting eligibility mapping
| Integrity status | reporting_eligible | publish_allowed |
|---|---|---|
| PASSED | true | true |
| PASSED_WITH_WARNINGS | true (flagged) | true |
| PASSED_WITH_EXCEPTION | true (flagged + exception_ref required) | true |
| FAILED | false | true |
| REVOKED | false | false |
| PENDING/IN_PROGRESS | false | false |
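The mapping is small enough to encode directly. A minimal sketch:

```python
# §7.7 mapping: integrity_status -> (reporting_eligible, publish_allowed)
ELIGIBILITY: dict[str, tuple[bool, bool]] = {
    "PASSED": (True, True),
    "PASSED_WITH_WARNINGS": (True, True),   # flagged
    "PASSED_WITH_EXCEPTION": (True, True),  # flagged; exception_ref required
    "FAILED": (False, True),                # artifacts publish, reporting blocked
    "REVOKED": (False, False),
    "PENDING": (False, False),
    "IN_PROGRESS": (False, False),
}

def eligibility(integrity_status: str) -> tuple[bool, bool]:
    """Return (reporting_eligible, publish_allowed) for a canonical status."""
    return ELIGIBILITY[integrity_status]
```

Keeping this as one table avoids scattering the semantics across services; §7.13's derivations reuse the same mapping.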
7.8. Required Metadata Fields (for audit + replay)
Every dataset/artifact must include:
- `integrity_status`
- `integrity_ruleset_bundle_refs` (all ruleset refs used)
- `integrity_check_results_ref` (pointer to check report artifact)
- `exception_ref` (nullable; required if `PASSED_WITH_EXCEPTION`)
- `input_dataset_hash`
- `job_id`
- `timestamp`
7.9. Events (align with the pipeline)
You already have: JobStarted, Canonicalized, Reconciled, Published.
Add two governance events:
- `IntegrityEvaluated` (emits computed status + check summary ref)
- `IntegrityExceptionApproved` (links exception artifact ref)
Optional:
- IntegrityRevoked (when status becomes REVOKED)
7.10. Practical guardrails for auditors
To make overrides “safe and sound”:
- `PASSED_WITH_EXCEPTION` must always carry:
  - `exception_ref`
  - `failed_rule_crid`
  - `ruleset_ref`
  - `expiry_policy`
  - evidence refs
- Any change to dataset hash or ruleset ref forces REVOKED
- Recompute with original refs reproduces original status
7.11. Implementation notes
- Model as a pure function:
integrity_status = f(check_results, exception?, hashes, ruleset_refs, schema_refs, now)
So it’s deterministic and replayable.
- Store “check results” as a separate artifact:
ZAR:integrity_check_report:<job_id>@sha256:<hash>
This makes the evaluation explainable.
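A minimal sketch of that pure function, reduced to the inputs that drive §7.4 (check results plus a pre-validated exception flag). How a non-blocking hard FAIL downgrades status is an interpretation of §7.5 and should be confirmed against the normative rules:

```python
def evaluate_integrity(check_results: list[dict], has_valid_exception: bool) -> str:
    """Compute the canonical integrity_status from ruleset check results.

    Each check result carries (per §7.3):
      result: "PASS" | "WARN" | "FAIL"
      enforcement_mode: "advisory" | "soft" | "hard" | "blocking"
    """
    blocking_fail = any(
        c["result"] == "FAIL" and c["enforcement_mode"] == "blocking"
        for c in check_results
    )
    if blocking_fail:
        return "PASSED_WITH_EXCEPTION" if has_valid_exception else "FAILED"
    # Soft/hard warnings (or non-blocking failures) downgrade but do not fail.
    non_advisory_warn = any(
        c["result"] in ("WARN", "FAIL") and c["enforcement_mode"] != "advisory"
        for c in check_results
    )
    return "PASSED_WITH_WARNINGS" if non_advisory_warn else "PASSED"
```

Because the function takes only explicit inputs and has no side effects, replaying it over a stored check report reproduces the original conclusion.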
7.12. ZAR Integrity Check Report Artifact
7.12.1. Artifact definition
Artifact type
integrity_check_report
Normative invariants
- Must be content-addressable and stored in ZAR: `ZAR:integrity_check_report:<job_id>@sha256:<hash>`
- Must include:
  - `job_id`
  - `applies_to_meid`
  - `dataset_hash` (hash of the canonicalized dataset being assessed)
  - all ruleset refs used
  - all check results (including PASS/WARN/FAIL)
  - computed `integrity_status` (one of the canonical states)
- Must be replay-safe:
  - same inputs + same rulesets + same `dataset_hash` ⇒ identical report content (except timestamps if excluded from the hash; see §7.13.2)
7.13. Schemas
7.13.1. Schema & Report Example
(Schema: `zar/integrity_check_report.v1.schema.json`; full JSON Schema omitted here.)
Notes:
- `zar_ref` is the canonical ID. Everything else is query convenience.
- metadata is non-authoritative (the artifact file is authoritative). It is for search/filter performance and UI summaries.
Example integrity_check_report instance
{
"zar": {
"artifact_type": "integrity_check_report",
"artifact_name": "integrity_check_report_JOB-XYZ-123",
"applies_to_meid": "MEID_ACCT_CRAWLER",
"content_hash": "sha256:1111111111111111111111111111111111111111111111111111111111111111",
"schema_ref": "ZAR:schema:integrity_check_report@v1"
},
"scope": {
"level": "dataset",
"object_ref": "<ZAR:dataset:...@sha256:...>"
},
"context": {
"job_id": "JOB-XYZ-123",
"run_id": "RUN-XYZ-123-1",
"tenant_id": "TENANT-ACME",
"entity_id": "ENTITY-ACME-DE",
"generated_at": "2026-02-17T10:12:00Z",
"initiated_by": "system",
"mode": "standard"
},
"dataset": {
"dataset_type": "canonical_trial_balance",
"dataset_hash": "sha256:2222222222222222222222222222222222222222222222222222222222222222",
"schema_ref": "ZAR:schema:canonical_trial_balance@v1",
"period": { "start": "2026-01-01", "end": "2026-01-31", "reporting_year": 2026 },
"source_systems": ["netsuite"],
"record_counts": { "lines_total": 12045, "accounts_total": 412, "projects_total": 18 }
},
"rulesets": {
"bundle_ref": "ZAR:ruleset_bundle:acct_crawler_default@sha256:3333333333333333333333333333333333333333333333333333333333333333",
"resolved": [
{
"ruleset_ref": "ZAR:ruleset:acct_crawler_tag_detection@sha256:aaaa",
"artifact_name": "acct_crawler_tag_detection",
"crid": "ruleset.tag_detection.finance.global.transition.standard.warning.1_0_0",
"enforcement_mode": "soft"
},
{
"ruleset_ref": "ZAR:ruleset:acct_crawler_classification@sha256:bbbb",
"artifact_name": "acct_crawler_classification",
"crid": "ruleset.classification.finance.global.capex-opex.standard.warning.1_0_0",
"enforcement_mode": "hard"
},
{
"ruleset_ref": "ZAR:ruleset:acct_crawler_reconciliation_policy@sha256:cccc",
"artifact_name": "acct_crawler_reconciliation_policy",
"crid": "ruleset.validation.finance.global.reconciliation.standard.critical.1_0_0",
"enforcement_mode": "blocking"
}
]
},
"checks": [
{
"check_id": "TB_ENTITY_DIFF",
"crid": "ruleset.validation.finance.global.reconciliation.standard.critical.1_0_0",
"ruleset_ref": "ZAR:ruleset:acct_crawler_reconciliation_policy@sha256:cccc",
"category": "reconciliation",
"result": "FAIL",
"severity": "BLOCKING",
"enforcement_layer": "integrity",
"message": "Entity-level TB reconciliation diff exceeds absolute tolerance.",
"metrics": { "entity_diff_abs": 12.45, "entity_diff_rel": 0.0008, "abs_tol": 5.0, "rel_tol": 0.0001 }
},
{
"check_id": "UNKNOWN_CLASSIFICATION_RATIO",
"crid": "ruleset.classification.finance.global.capex-opex.standard.warning.1_0_0",
"ruleset_ref": "ZAR:ruleset:acct_crawler_classification@sha256:bbbb",
"category": "classification",
"result": "WARN",
"severity": "WARNING",
"message": "Unknown classification ratio exceeds preferred threshold but below hard fail threshold.",
"metrics": { "unknown_ratio": 0.012, "warn_threshold": 0.01, "fail_threshold": 0.02 }
}
],
"summary": {
"integrity_status": "FAILED",
"reporting_eligible": false,
"publish_allowed": true,
"exception_ref": null,
"counts": { "pass": 0, "warn": 1, "fail": 1 },
"failed_rule_crids": [
"ruleset.validation.finance.global.reconciliation.standard.critical.1_0_0"
],
"warnings": [
"Classification unknown ratio above preferred threshold."
]
}
}
Tie to IntegrityEvaluated event
When IntegrityEvaluated is emitted, include:
- `integrity_check_report_ref` (ZAR ref)
- `integrity_status`
- `failed_rule_crids`
- `dataset_hash`
- `ruleset_bundle_ref`
This makes the event log minimal but fully traceable.
7.13.2. Canonicalization + hashing rules (so ZAR refs are stable)
To ensure deterministic content_hash:
- Canonicalize JSON before hashing:
- UTF-8
- sorted object keys recursively
- arrays preserve order
- remove insignificant whitespace
- Either:
- include generated_at in hash (strict immutability), OR
- move timestamps into a separate non-hashed envelope and hash only a hash_basis object.
Recommendation: For audit trails, keep timestamp in the artifact, but compute hash over everything except generated_at if you want stable reproduction across replays. If you want strict immutability per run, keep it included.
Given the replay goal (“same inputs, same outputs”), we'll do:
- `generated_at` included in the artifact
- `content_hash_basis` excludes it
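The canonicalization rules plus the hash-basis choice can be sketched as follows; `json.dumps` with recursively sorted keys and compact separators approximates the canonical form described above, and fields named in `exclude` are dropped from the hash basis at any depth:

```python
import hashlib
import json

def content_hash(artifact: dict, exclude: tuple[str, ...] = ("generated_at",)) -> str:
    """sha256 over canonical JSON: UTF-8, recursively sorted object keys,
    arrays in original order, no insignificant whitespace. Timestamp fields
    listed in `exclude` are removed from the hash basis (kept in the artifact)."""
    def strip(node):
        if isinstance(node, dict):
            return {k: strip(v) for k, v in node.items() if k not in exclude}
        if isinstance(node, list):
            return [strip(v) for v in node]
        return node
    canonical = json.dumps(strip(artifact), sort_keys=True,
                           separators=(",", ":"), ensure_ascii=False)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

With this basis, two replays that differ only in `generated_at` yield the same `content_hash`, satisfying the "same inputs, same outputs" goal.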
7.13.3. integrity_report_registry (SSSR table spec)
Purpose
A query-optimized registry of integrity outcomes produced by integrity evaluation runs. Each row represents the integrity conclusion for a specific (scope, tenant, entity, period, dataset_hash, ruleset_bundle_ref) and links to the authoritative ZAR artifacts:
- `integrity_check_report_ref` (full check details)
- optional `exception_ref` (governed override)
- optional `revocation_ref` (revocation event/artifact)
Table
Postgres SQL DDL — integrity_report_registry
DO $$ BEGIN
CREATE TYPE integrity_mode AS ENUM ('standard', 'strict_compliance');
EXCEPTION WHEN duplicate_object THEN NULL; END $$;
DO $$ BEGIN
CREATE TYPE integrity_scope_level AS ENUM ('job','dataset','artifact');
EXCEPTION WHEN duplicate_object THEN NULL; END $$;
DO $$ BEGIN
CREATE TYPE integrity_status_enum AS ENUM (
'PENDING','IN_PROGRESS','PASSED','PASSED_WITH_WARNINGS','FAILED','PASSED_WITH_EXCEPTION','REVOKED'
);
EXCEPTION WHEN duplicate_object THEN NULL; END $$;
CREATE TABLE IF NOT EXISTS integrity_report_registry (
integrity_id TEXT PRIMARY KEY,
-- ZAR binding (if you have the ZAR ref at write-time)
artifact_ref TEXT UNIQUE NULL, -- recommended to populate; can be backfilled from zar_artifact_index
artifact_name TEXT NOT NULL,
artifact_type TEXT NOT NULL DEFAULT 'integrity_check_report',
applies_to_meid TEXT NOT NULL,
content_hash TEXT NOT NULL,
schema_ref TEXT NOT NULL,
-- scope
scope_level integrity_scope_level NOT NULL DEFAULT 'dataset',
scope_object_ref TEXT NOT NULL,
scope_hash TEXT NOT NULL,
-- context
job_id TEXT NOT NULL,
run_id TEXT NULL,
tenant_id TEXT NOT NULL,
entity_id TEXT NOT NULL,
generated_at TIMESTAMPTZ NOT NULL,
initiated_by TEXT NOT NULL DEFAULT 'system',
mode integrity_mode NOT NULL DEFAULT 'standard',
-- dataset
dataset_type TEXT NOT NULL,
dataset_hash TEXT NOT NULL,
dataset_schema_ref TEXT NOT NULL,
period_start DATE NOT NULL,
period_end DATE NOT NULL,
reporting_year INT NOT NULL,
source_systems JSONB NOT NULL DEFAULT '[]'::jsonb,
record_counts JSONB NOT NULL DEFAULT '{}'::jsonb,
-- rulesets
ruleset_bundle_ref TEXT NOT NULL,
ruleset_resolved JSONB NOT NULL DEFAULT '[]'::jsonb,
ruleset_refs JSONB NOT NULL DEFAULT '[]'::jsonb,
ruleset_crids JSONB NOT NULL DEFAULT '[]'::jsonb,
-- summary
integrity_status integrity_status_enum NOT NULL,
reporting_eligible BOOLEAN NOT NULL DEFAULT FALSE,
publish_allowed BOOLEAN NOT NULL DEFAULT FALSE,
integrity_passed BOOLEAN NOT NULL DEFAULT FALSE,
exception_ref TEXT NULL,
revocation_ref TEXT NULL,
counts_pass INT NOT NULL DEFAULT 0,
counts_warn INT NOT NULL DEFAULT 0,
counts_fail INT NOT NULL DEFAULT 0,
failed_rule_crids JSONB NOT NULL DEFAULT '[]'::jsonb,
warnings JSONB NOT NULL DEFAULT '[]'::jsonb,
-- checks (optional denormalized cache)
check_results JSONB NOT NULL DEFAULT '[]'::jsonb,
check_ids_failed JSONB NOT NULL DEFAULT '[]'::jsonb,
check_ids_warn JSONB NOT NULL DEFAULT '[]'::jsonb,
top_failure_category TEXT NULL,
flags JSONB NOT NULL DEFAULT '[]'::jsonb,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- constraints
ALTER TABLE integrity_report_registry
ADD CONSTRAINT chk_dataset_hash_format
CHECK (dataset_hash ~ '^sha256:[0-9a-f]{64}$');
ALTER TABLE integrity_report_registry
ADD CONSTRAINT chk_content_hash_format
CHECK (content_hash ~ '^sha256:[0-9a-f]{64}$');
ALTER TABLE integrity_report_registry
ADD CONSTRAINT chk_exception_ref_required
CHECK (
(integrity_status <> 'PASSED_WITH_EXCEPTION')
OR (exception_ref IS NOT NULL)
);
-- indexes
CREATE INDEX IF NOT EXISTS idx_ir_tenant_entity_year
ON integrity_report_registry (tenant_id, entity_id, reporting_year);
CREATE INDEX IF NOT EXISTS idx_ir_status_eligible
ON integrity_report_registry (integrity_status, reporting_eligible);
CREATE INDEX IF NOT EXISTS idx_ir_dataset_hash
ON integrity_report_registry (dataset_hash);
CREATE INDEX IF NOT EXISTS idx_ir_job
ON integrity_report_registry (job_id);
CREATE INDEX IF NOT EXISTS idx_ir_ruleset_bundle
ON integrity_report_registry (ruleset_bundle_ref);
CREATE INDEX IF NOT EXISTS gin_ir_failed_rule_crids
ON integrity_report_registry USING GIN (failed_rule_crids);
CREATE INDEX IF NOT EXISTS gin_ir_ruleset_refs
ON integrity_report_registry USING GIN (ruleset_refs);
CREATE INDEX IF NOT EXISTS gin_ir_flags
ON integrity_report_registry USING GIN (flags);
CREATE INDEX IF NOT EXISTS idx_integrity_scope_hash
ON integrity_report_registry (scope_hash);
Note: a “revocation_ref required on REVOKED” constraint (analogous to chk_exception_ref_required) would be strict. If we want to allow REVOKED without an explicit revocation artifact, omit that constraint and treat revocation_ref as recommended.
7.13.4. Projection mapping — integrity_check_report → integrity_report_registry
This assumes the integrity_check_report JSON schema includes these (or similar) top-level structures:
- meta (`job_id`, `tenant_id`, `entity_id`, `meid`, timestamps)
- scope (level + object ref)
- dataset (`type`, `hash`, `schema_ref`, `period`)
- rulesets (bundle ref + array refs + mode)
- summary (`integrity_status`, `reporting_eligible`, `publish_allowed`, counts)
- checks (array of check objects with `result`, `crid`, `enforcement_mode`, `category`, flags)
If the actual report differs, keep the same mapping approach—just adjust paths.
Deterministic derivations (normative)
A) integrity_id (deterministic)
Recommended formula (stable, unique enough, replay-safe):
integrity_id =
"INT-" + tenant_id + "-" + entity_id + "-" + scope_level + "-" +
reporting_year + "-" + period_label + "-" +
substring(dataset_hash, 8, 12) + "-" +
substring(sha256(ruleset_bundle_ref), 1, 8) + "-" +
mode
- Use dataset_hash + ruleset_bundle_ref + period as anchor inputs.
- It is also possible to store supersedes_integrity_id if lineage chains are needed.
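The derivation above can be sketched in Python; the helper name and argument order are illustrative, not part of the spec:

```python
import hashlib

def derive_integrity_id(tenant_id: str, entity_id: str, scope_level: str,
                        reporting_year: int, period_label: str,
                        dataset_hash: str, ruleset_bundle_ref: str,
                        mode: str) -> str:
    """Deterministic integrity_id per the recommended formula above."""
    # dataset_hash is "sha256:<64 hex>"; skip the 7-char "sha256:" prefix
    # and take the first 12 hex chars as the dataset anchor.
    dataset_part = dataset_hash[7:19]
    # Hash the bundle ref so the ID changes whenever the bundle changes.
    bundle_part = hashlib.sha256(ruleset_bundle_ref.encode("utf-8")).hexdigest()[:8]
    return "-".join([
        "INT", tenant_id, entity_id, scope_level,
        str(reporting_year), period_label, dataset_part, bundle_part, mode,
    ])
```

Because every input is either content-addressed or an immutable identifier, replaying the same report always yields the same ID.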
B) integrity_passed (canonical)
integrity_passed = integrity_status IN (
'PASSED', 'PASSED_WITH_WARNINGS', 'PASSED_WITH_EXCEPTION'
)
C) reporting_eligible and publish_allowed (canonical)
From the governance model:
| integrity_status | reporting_eligible | publish_allowed |
|---|---|---|
PASSED | true | true |
PASSED_WITH_WARNINGS | true | true |
PASSED_WITH_EXCEPTION | true | true |
FAILED | false | true |
REVOKED | false | false |
PENDING/IN_PROGRESS | false | false |
So:
- If the report contains these booleans in summary, trust them.
- Otherwise compute them deterministically from integrity_status.
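The fallback computation can be sketched as a small pure function; field names follow the governance table above:

```python
# Statuses the spec treats as "passed" for integrity purposes.
PASSING = {"PASSED", "PASSED_WITH_WARNINGS", "PASSED_WITH_EXCEPTION"}

def derive_eligibility(integrity_status: str) -> dict:
    """Fallback derivation when summary lacks the booleans."""
    passed = integrity_status in PASSING
    return {
        "integrity_passed": passed,
        "reporting_eligible": passed,
        # Per the governance table, FAILED still allows publishing;
        # only REVOKED and PENDING/IN_PROGRESS block it.
        "publish_allowed": passed or integrity_status == "FAILED",
    }
```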
Field-by-field mapping table
| Registry Column | Source in integrity_check_report | Rule |
|---|---|---|
artifact_type | zar.artifact_type | Must equal integrity_check_report. |
artifact_name | zar.artifact_name | Required. |
applies_to_meid | zar.applies_to_meid | Required. |
content_hash | zar.content_hash | Required, format sha256:<hash>. |
schema_ref | zar.schema_ref | Required. |
scope_level | scope.level | Must be one of job / dataset / artifact. |
scope_object_ref | scope.object_ref | Preferred: ZAR ref for the scoped object (e.g., ZAR:dataset:...@sha256:...). Required for deterministic identity. |
job_id | context.job_id | Required. |
run_id | context.run_id | Optional. |
tenant_id | context.tenant_id | Required. |
entity_id | context.entity_id | Required. |
generated_at | context.generated_at | Required. |
initiated_by | context.initiated_by | Optional; default system. |
mode | context.mode | Default standard. |
dataset_type | dataset.dataset_type | Required. |
dataset_hash | dataset.dataset_hash | Required, format sha256:<hash>. |
dataset_schema_ref | dataset.schema_ref | Required. |
period_start | dataset.period.start | Required. |
period_end | dataset.period.end | Required. |
reporting_year | dataset.period.reporting_year | Required. |
source_systems | dataset.source_systems | Optional array. |
record_counts | dataset.record_counts | Optional object. |
ruleset_bundle_ref | rulesets.bundle_ref | Required. |
ruleset_resolved | rulesets.resolved | Full resolved ruleset list (good to store). |
ruleset_refs | derived from rulesets.resolved[].ruleset_ref | Normalize + unique. |
ruleset_crids | derived from rulesets.resolved[].crid | Normalize + unique. |
integrity_status | summary.integrity_status | Required enum. |
reporting_eligible | summary.reporting_eligible | If missing, compute from status mapping. |
publish_allowed | summary.publish_allowed | If missing, compute from status mapping. |
integrity_passed | derived from summary.integrity_status | true if status in {PASSED, PASSED_WITH_WARNINGS, PASSED_WITH_EXCEPTION}. |
exception_ref | summary.exception_ref | Nullable. Can later be updated via IntegrityExceptionApproved. |
counts_pass | summary.counts.pass | If missing, count checks where result=PASS. |
counts_warn | summary.counts.warn | If missing, count checks where result=WARN. |
counts_fail | summary.counts.fail | If missing, count checks where result=FAIL. |
failed_rule_crids | summary.failed_rule_crids OR derive from checks[] | Prefer summary; else all FAIL checks[].crid. |
warning_rule_crids | derive from checks[] | All WARN checks[].crid (this is currently not stored in summary). |
warnings | summary.warnings | Optional list of warning strings. |
top_failure_category | derive from checks[] | First FAIL’s category (or null). |
check_results | checks | Store full checks[] if you want fast UI. |
check_ids_failed | derive from checks[] | FAIL checks[].check_id. |
check_ids_warn | derive from checks[] | WARN checks[].check_id. |
flags | derive (future) | If you later add checks[].flags or summary.flags, normalize + unique. |
artifact_ref | known at persistence OR from ZAR index | Not present in JSON; comes from ZAR when stored (recommended). |
integrity_id | derived | Deterministic ID based on (tenant_id, entity_id, mode, dataset_hash, period) (see below). |
revocation_ref | from revocation event | Not present in report; updated on IntegrityRevoked. |
exception_status | derived from exception artifact | Not present in report; projection field from exception lifecycle. |
created_by | context.initiated_by | The report uses initiated_by, not created_by; map it here. |
supersedes_integrity_id | optional (registry-only) | Not in report; used when we explicitly supersede an old record. |
notes | optional (registry-only) | Not in report. |
Exact mapping from the JSON → table columns
Given the payload:
Direct mappings
- artifact_name ← zar.artifact_name
- applies_to_meid ← zar.applies_to_meid
- content_hash ← zar.content_hash
- schema_ref ← zar.schema_ref
- job_id ← context.job_id
- run_id ← context.run_id
- tenant_id ← context.tenant_id
- entity_id ← context.entity_id
- generated_at ← context.generated_at
- initiated_by ← context.initiated_by
- mode ← context.mode
- dataset_type ← dataset.dataset_type
- dataset_hash ← dataset.dataset_hash
- dataset_schema_ref ← dataset.schema_ref
- period_start ← dataset.period.start
- period_end ← dataset.period.end
- reporting_year ← dataset.period.reporting_year
- source_systems ← dataset.source_systems
- record_counts ← dataset.record_counts
- ruleset_bundle_ref ← rulesets.bundle_ref
- ruleset_resolved ← rulesets.resolved
- integrity_status ← summary.integrity_status
- reporting_eligible ← summary.reporting_eligible
- publish_allowed ← summary.publish_allowed
- exception_ref ← summary.exception_ref
- counts_pass ← summary.counts.pass
- counts_warn ← summary.counts.warn
- counts_fail ← summary.counts.fail
- failed_rule_crids ← summary.failed_rule_crids
- warnings ← summary.warnings
- check_results ← checks (full array)
Derived mappings (deterministic)
- artifact_type ← constant integrity_check_report
- scope_level ← default dataset (until scope is explicitly added to the report schema)
- scope_object_ref ← dataset.dataset_hash (or the dataset artifact ref if available)
- ruleset_refs ← rulesets.resolved[].ruleset_ref
- ruleset_crids ← rulesets.resolved[].crid
- check_ids_failed ← checks[].check_id where result == "FAIL"
- check_ids_warn ← checks[].check_id where result == "WARN"
- top_failure_category ← category of the first check where result == "FAIL", else null
- integrity_passed ← integrity_status IN ("PASSED", "PASSED_WITH_WARNINGS", "PASSED_WITH_EXCEPTION")
- flags ← empty [] for now (unless/until flags are added per check or summary)
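The checks[]-derived columns can be sketched as one pure function; the field names (check_id, result, crid, category) follow the report shape assumed above:

```python
def derive_check_fields(checks: list) -> dict:
    """Derive registry columns from the checks[] array of a report."""
    failed = [c for c in checks if c.get("result") == "FAIL"]
    warned = [c for c in checks if c.get("result") == "WARN"]
    return {
        "check_ids_failed": [c["check_id"] for c in failed],
        "check_ids_warn": [c["check_id"] for c in warned],
        # Normalize + unique, sorted for deterministic storage.
        "failed_rule_crids": sorted({c["crid"] for c in failed}),
        "warning_rule_crids": sorted({c["crid"] for c in warned}),
        # First FAIL's category, else null.
        "top_failure_category": failed[0].get("category") if failed else None,
    }
```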
integrity_id formula
Use a deterministic stable ID:
integrity_id =
"INT-" + tenant_id + "-" + entity_id + "-" +
scope_level + "-" +
slug(scope_object_ref) + "-" +
mode
Where slug(scope_object_ref) is a deterministic shortening function, e.g.:
- extract the sha256 tail if present and take the first 12 chars (...@sha256:abcdef... → abcdef123456)
- else fall back to the first 12 chars of md5(scope_object_ref)
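A minimal Python sketch of slug(), assuming refs carry an @sha256:<64 hex> suffix when content-addressed:

```python
import hashlib
import re

def slug(scope_object_ref: str) -> str:
    """Deterministic 12-char shortening of a scope object ref."""
    m = re.search(r"@sha256:([0-9a-f]{64})", scope_object_ref)
    if m:
        # Content-addressed ref: use the first 12 chars of the hash.
        return m.group(1)[:12]
    # Fallback for refs without a hash suffix: md5 of the whole ref.
    return hashlib.md5(scope_object_ref.encode("utf-8")).hexdigest()[:12]
```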
Practical example
If:
- tenant_id = TENANT-ACME
- entity_id = ENTITY-ACME-DE
- scope.level = dataset
- scope.object_ref = ZAR:dataset:canonical_trial_balance:ENTITY-ACME-DE:2026M01@sha256:2222...
- mode = standard
Then:
INT-TENANT-ACME-ENTITY-ACME-DE-dataset-222222222222-standard
This is:
- stable across replays
- changes if dataset_hash or bundle_ref changes (correct)
- human-readable enough to debug
Update semantics (what changes a row)
Insert
On IntegrityEvaluated:
- insert new row (preferred), OR upsert by unique anchor
Update
On IntegrityExceptionApproved:
- set exception_ref, exception_status='active'
- set integrity_status='PASSED_WITH_EXCEPTION'
- set reporting_eligible=true, publish_allowed=true, integrity_passed=true
- bump updated_at
On IntegrityRevoked:
- set integrity_status='REVOKED'
- set reporting_eligible=false, publish_allowed=false, integrity_passed=false
- set revocation_ref
- bump updated_at
Implementation hint for the team
Treat integrity_report_registry as a projection written by an event consumer that listens to:
- IntegrityEvaluated
- IntegrityExceptionApproved
- IntegrityRevoked
This keeps it consistent with ZSSR and avoids ad hoc writes.
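A sketch of such a consumer's fold step, applying the update semantics above to an in-memory row (event field names like occurred_at are assumptions, not part of the spec):

```python
def apply_event(row: dict, event: dict) -> dict:
    """Projection consumer: fold one lifecycle event into a registry row."""
    kind = event["type"]
    if kind == "IntegrityExceptionApproved":
        row.update(exception_ref=event["exception_ref"],
                   exception_status="active",
                   integrity_status="PASSED_WITH_EXCEPTION",
                   reporting_eligible=True, publish_allowed=True,
                   integrity_passed=True)
    elif kind == "IntegrityRevoked":
        row.update(integrity_status="REVOKED",
                   revocation_ref=event["revocation_ref"],
                   reporting_eligible=False, publish_allowed=False,
                   integrity_passed=False)
    # Bump updated_at on every handled event.
    row["updated_at"] = event["occurred_at"]
    return row
```

IntegrityEvaluated would map to the insert path (or an upsert by the unique anchor) rather than a row update.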
8. Rule Lifecycle Management (CMCB Integration)
ZRR integrates directly with the Change Management & Compatibility (CMCB) Framework.
Change Types
| Change Type | Description |
|---|---|
PATCH | Metadata adjustment (non-breaking) |
MINOR | Threshold adjustment |
MAJOR | Structural logic change |
DEPRECATION | Rule replacement required |
MIGRATION | Automatic remapping to new CRID |
MAJOR changes require:
- Deprecation window
- Canary execution simulation
- ZSSR impact classification
- AI risk re-evaluation (if applicable)
- Governance sign-off
All lifecycle changes must be logged in:
- rule_change_log
- DAL
- AI Governance Register (if high risk)
9. Formal CRID semantic versioning policy (PATCH/MINOR/MAJOR)
9.1. CRID Version Format
CRID ends with X_Y_Z where:
- X = MAJOR
- Y = MINOR
- Z = PATCH
Example:
ruleset.validation.finance.global.reconciliation.standard.CRITICAL.1_2_3
Definitions
- CRID version = governance-visible semantic version of the ruleset’s behavior
- ZAR hash = execution-identity of the exact ruleset content
- A given CRID version may map to multiple historical hashes (draft iterations), but only one approved + active hash per scope (global/tenant/entity) at a time.
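Extracting and comparing the X_Y_Z suffix can be sketched as follows; the regex mirrors the trailing version segment of the CRID grammar:

```python
import re

# Trailing "<MAJOR>_<MINOR>_<PATCH>" segment of a CRID.
CRID_VERSION = re.compile(r"\.(\d+)_(\d+)_(\d+)$")

def crid_version(crid: str):
    """Extract (MAJOR, MINOR, PATCH) from a CRID's X_Y_Z suffix."""
    m = CRID_VERSION.search(crid)
    if not m:
        raise ValueError(f"CRID has no X_Y_Z version suffix: {crid}")
    return tuple(int(g) for g in m.groups())
```

Because the result is a tuple of ints, Python's native tuple ordering gives correct semantic-version comparison (e.g. (2, 1, 0) > (1, 9, 9)).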
9.2. Change Classifications
PATCH (Z increment)
Intent: Non-behavioral or behavior-neutral changes. Compatibility: Must remain fully backward compatible in output meaning.
Typical PATCH changes:
- Metadata-only changes: owners, approved_by, created_at, changelog text edits (if not altering meaning)
- Documentation / comments updates
- Reordering rules without changing precedence or results
- Tightening validations that only improve error messaging (no new rejects)
- Expanding explicit allowlists in a way that does not change classification of existing inputs (rare; see below)
PATCH must NOT:
- Change any computed numeric result given identical inputs
- Change pass/warn/fail outcomes for previously valid inputs
- Change precedence rules
- Change default assumptions used in computations
Required tests for PATCH:
- Schema validation
- Static lint checks
- “Golden replay” on canonical sample set: outputs must be identical byte-for-byte (except metadata timestamps)
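The golden-replay gate can be sketched as a canonical-JSON comparison; the timestamp key names are assumptions about where metadata lives:

```python
import json

def golden_replay_identical(old_output: dict, new_output: dict,
                            ignore_keys=("generated_at", "updated_at")) -> bool:
    """PATCH gate: outputs must match byte-for-byte after stripping
    metadata timestamps (key names are illustrative)."""
    def strip(obj):
        # Recursively drop ignored keys so only behavior-bearing
        # content is compared.
        if isinstance(obj, dict):
            return {k: strip(v) for k, v in obj.items() if k not in ignore_keys}
        if isinstance(obj, list):
            return [strip(v) for v in obj]
        return obj
    # sort_keys gives a canonical serialization for both sides.
    return (json.dumps(strip(old_output), sort_keys=True)
            == json.dumps(strip(new_output), sort_keys=True))
```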
MINOR (Y increment)
Intent: Backward-compatible behavior extension. Compatibility: Existing consumers remain valid; outputs may expand but not break contracts.
Typical MINOR changes:
- Adding new optional rule branches that apply only when new fields exist
- Expanding mappings to cover new accounts/dimensions without reclassifying existing mapped values
- Introducing new flags/warnings (non-blocking) for edge cases
- Adding new optional output fields (must be additive and documented)
- Increasing granularity in rollups (new grouping keys allowed) while keeping defaults unchanged
- New thresholds that only affect previously “unknown/unclassified” items (doesn’t change already-classified items)
MINOR must NOT:
- Change meaning of existing output fields
- Change default calculation assumptions (unless gated behind an explicit input param defaulted off)
- Change reject/blocking behavior for requests that previously passed validation
Required tests for MINOR:
- Schema validation + contract tests
- Golden replay:
- For baseline input set: existing outputs unchanged
- New scenarios: new behavior demonstrated
- Compatibility check: min_schema_ref / max_schema_ref must remain compatible
MAJOR (X increment)
Intent: Breaking behavioral change. Compatibility: May change outputs, validations, or semantics; requires controlled migration.
Typical MAJOR changes:
- Any change that can alter numeric outputs for the same inputs
  - e.g., PV discounting default changes (t0 vs spread)
  - abatement cost formula changes
- Precedence changes
  - tag detection precedence order changes
  - classification precedence changes (attribute vs account)
- Threshold changes that flip pass/warn/fail outcomes
- Changing default inclusion rules (e.g., include R=0 projects in ratio rollups)
- Changing field semantics or units
- Removing fields or renaming outputs
- Changing interpretation of “reconciliation passed_with_warnings” behavior
- Any change requiring downstream migration work or re-baselining
Required tests for MAJOR:
- Schema validation + full contract suite
- Golden replay with documented diffs (expected deltas)
- Canary execution on representative tenant extracts (if available)
- Governance sign-off (CMCB approval)
- Migration notes required in changelog with:
- “what changed”
- “why”
- “what breaks”
- “how to migrate”
- “recommended rollout plan”
9.3. Decision Matrix
| Change | Patch | Minor | Major |
|---|---|---|---|
| Metadata changes only | ✅ | | |
| Doc/comment changes | ✅ | | |
| Add new optional output field | | ✅ | |
| Add new flag/warning only | | ✅ | |
| New mapping for previously UNKNOWN items only | | ✅ | |
| Change precedence order | | | ✅ |
| Change tolerance thresholds affecting pass/fail | | | ✅ |
| Change formula or default assumption | | | ✅ |
| Remove/rename output fields | | | ✅ |
| Tighten validation to reject previously accepted inputs | | | ✅ |
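The matrix can be encoded as a small classifier for CI gating; the flag names are illustrative, not part of the spec:

```python
def classify_change(flags: set) -> str:
    """Decision-matrix sketch: map change flags to a version bump.
    MAJOR dominates MINOR, which dominates PATCH."""
    MAJOR = {"precedence_change", "threshold_pass_fail", "formula_change",
             "remove_or_rename_output", "tighten_validation"}
    MINOR = {"new_optional_output", "new_flag_or_warning",
             "new_unknown_mapping"}
    if flags & MAJOR:
        return "MAJOR"
    if flags & MINOR:
        return "MINOR"
    # Metadata-only / doc-only changes fall through to PATCH.
    return "PATCH"
```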