ZAR - ZAYAZ Registry
1. Purpose and Scope
The ZAYAZ Artifact Registry (ZAR) is the single system-of-record for all governed artifacts used by the ZAYAZ platform, including:
- executable engines (calculators, validators, routers, orchestrators),
- declarative contracts (schemas, manifests, rule sets, taxonomies),
- supporting structural artifacts (connectors, models, documentation).
ZAR guarantees that:
- every artifact is uniquely identified, versioned, and traceable;
- every reported metric, signal, or decision can be traced to:
  - the exact executable that produced it, and
  - the exact contract that constrained it;
- audit, assurance, and regulatory review can deterministically reproduce outcomes.
To achieve this, ZAR defines and manages Canonical Managed Identifiers (CMIs), their compact runtime aliases, and their human-readable addresses.
ZAR Artifacts use a dual-identity model
Every ZAR artifact has:
- A canonical opaque identifier (e.g. TG3K7M2A), used for lineage, audit, federation, and cryptographic integrity.
- A human-readable ZAR address (e.g. schema.compute.ghg.common.options.1_0_0), used for configuration, documentation, and registry lookup.
ZAR addresses are dot-separated, human-readable artifact locators that encode kind, domain, logical namespace, role, and concrete version. They are aliases, not identities.
ZAR addresses are resolved via the ZAR registry to their canonical artifact identifiers at runtime.
This mirrors:
- DNS name → IP
- Git tag → commit hash
- OCI image tag → digest
2. Unified ZAR Registry Table (zar_registry)
ZAR uses one unified registry table for both executable and non-executable artifacts. The artifact’s nature is distinguished by its kind.
Core Fields
| Field | Description |
|---|---|
| cmi (PK) | Canonical Managed Identifier. Stable, semantic identity of the artifact. |
| cmi_short_code (unique) | Compact, sequence-based runtime identifier used in lineage stamps (e.g. 000042). |
| zar_code (unique) | Compact alias (8 chars Base32 Crockford) used in audit logs and federation. |
| zar_address (unique, nullable) | Human-readable artifact address (alias), e.g. schema.compute.GHG.common.options.1_0_0. |
| domain | Top-level system domain (e.g. MICE, COMPUTE, SIGNAL). |
| component | Module or artifact family. |
| kind | Artifact kind (ENGINE, SCHEMA, MANIFEST, RULESET, TAXONOMY, DOC, etc.). |
| name | Artifact name. |
| version | Concrete semantic version (SemVer, e.g. 1_0_0). |
| git_sha | Source commit hash at build time. |
| build_hash | Hash of the built artifact content. |
| build_timestamp | Time the artifact was registered. |
| execution_ref | How the artifact is executed (container image, class path, function ARN). Nullable. |
| storage_uri | Where the artifact content is stored (S3, CloudFront, etc.). Nullable. |
| content_type | MIME type of stored content (e.g. application/schema+json). |
| status | Lifecycle state (active, deprecated, archived, draft). |
| owner_team | Responsible team. |
| description | Functional description. |
| capabilities | Declared capabilities (JSON). |
| dependencies | CMIs of the artifacts this artifact depends on (JSON array). |
| repo_url | Source repository URL. |
| ci_pipeline_id | CI/CD job reference. |
| last_verified_at | Last audit or verification timestamp. |
Example 1 — ENGINE artifact (GHG absolute calculator)
| Field | Example value |
|---|---|
| cmi | MICE.InvoiceEmissions.ENGINE.AbsCalculator.1_0_0 |
| cmi_short_code | 000237 |
| zar_code | 7K3Q9D2P |
| zar_address | engine.compute.ghg.abs_calculator.1_0_0 |
| domain | MICE |
| component | InvoiceEmissions |
| kind | ENGINE |
| name | AbsCalculator |
| version | 1_0_0 |
| git_sha | 7d3c0a9b6f1e4c2d8b0f9f3a1c2e7a9d0b3c4e5f |
| build_hash | sha256:3c7a...9f12 |
| build_timestamp | 2026-01-27T10:12:03Z |
| execution_ref | oci://ghcr.io/viroway/mice-ghg-abs-calculator:1.0.0 |
| storage_uri | (null) |
| content_type | (null) |
| status | active |
| owner_team | MICE-Core |
| description | Computes absolute GHG emissions aggregation across scopes per compute_method_registry. |
| capabilities | ["CALCULATION","GHG","AGGREGATION"] |
| dependencies | ["DAVE.TrustGate.ENGINE.Core.1_0_0"] |
| repo_url | github.com/viroway/zayaz-mice-engines |
| ci_pipeline_id | gh-actions:9821736 |
| last_verified_at | 2026-01-27T10:30:00Z |
A partner running the same artifact identity (cmi) will also compute the same zar_code = 7K3Q9D2P (even though their cmi_short_code won’t match).
Example 2 — SCHEMA artifact (GHG common options schema)
| Field | Example value |
|---|---|
| cmi | MICE.InvoiceEmissions.SCHEMA.CommonOptions.1_0_0 |
| cmi_short_code | 000041 |
| zar_code | TG3K7M2A |
| zar_address | schema.compute.ghg.common.options.1_0_0 |
| domain | MICE |
| component | InvoiceEmissions |
| kind | SCHEMA |
| name | CommonOptions |
| version | 1_0_0 |
| git_sha | a21b9c7d8e0f1a2b3c4d5e6f7a8b9c0d1e2f3a4b |
| build_hash | sha256:b19e...2aa7 |
| build_timestamp | 2026-01-27T09:55:11Z |
| execution_ref | (null) |
| storage_uri | s3://zar-artifacts/schemas/compute/ghg/common/options/1.0.0/GHG_common_options.json |
| content_type | application/schema+json |
| status | active |
| owner_team | Compute-Contracts |
| description | Shared compute options schema for GHG methods (jurisdiction, sector, boundary hints, etc.). |
| capabilities | ["SCHEMA","OPTIONS_CONTRACT"] |
| dependencies | [] |
| repo_url | github.com/viroway/zayaz-contracts |
| ci_pipeline_id | gh-actions:9821601 |
| last_verified_at | 2026-01-27T10:02:00Z |
This schema’s zar_code is the same in every environment that registers the same cmi.
For more information, see Section 9, Federation-Safe Identifier Policy.
Design Principle
CMI is the canonical identity. Short codes and addresses are aliases.
3. Alias and Capability Handling
3.1. Aliases
For v1, aliases are handled directly in the main table:
- cmi → canonical identity
- cmi_short_code → runtime lineage alias
- zar_code → compact audit/federation alias
- zar_address → human-readable reference
No separate alias table is required unless multiple aliases per artifact become necessary in the future (e.g. legacy or partner mappings).
3.2. Capabilities
Capabilities describe what an artifact can do, not what it is.
They are stored as JSONB in the main table, for example:
["VALIDATION","SCHEMA_ENFORCEMENT","SCENARIO_ALIGNMENT"]
This allows:
- fast lookup via JSONB + GIN indexing,
- flexible extension without schema churn,
- sufficient performance for registry-scale queries.
A separate capability join table is not required for v1.
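The lookup semantics can be illustrated in memory. The sketch below assumes registry rows are plain dictionaries and uses a hypothetical helper, `find_by_capabilities`, that returns every row declaring all requested capabilities. In a PostgreSQL-backed registry the same filter would typically be a JSONB containment test served by the GIN index; the source does not name the database, so that mapping is an assumption.

```python
from typing import Dict, Iterable, List, Set


def find_by_capabilities(rows: Iterable[Dict], required: Set[str]) -> List[Dict]:
    """Return rows whose declared capabilities include every required one."""
    return [row for row in rows if required <= set(row.get("capabilities", []))]


# Illustrative rows, using the capability lists from Examples 1 and 2.
rows = [
    {
        "cmi": "MICE.InvoiceEmissions.ENGINE.AbsCalculator.1_0_0",
        "capabilities": ["CALCULATION", "GHG", "AGGREGATION"],
    },
    {
        "cmi": "MICE.InvoiceEmissions.SCHEMA.CommonOptions.1_0_0",
        "capabilities": ["SCHEMA", "OPTIONS_CONTRACT"],
    },
]

matches = find_by_capabilities(rows, {"GHG", "CALCULATION"})
```

Because capabilities are a flat list of strings, adding a new capability needs no schema change; only the set passed as `required` changes.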
4. Kind-Based Required Field Rules
ZAR enforces minimal integrity constraints based on kind.
Executable Artifacts (kind = ENGINE)
- Required: execution_ref, build_hash, version
- Optional: storage_uri (only if auxiliary content exists)
Declarative Artifacts (SCHEMA, MANIFEST, RULESET, TAXONOMY, DOC)
- Required: storage_uri, content_type, build_hash, version
- Must NOT require: execution_ref
This distinction is critical:
Engines execute. Schemas constrain. Both are governed artifacts.
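A minimal sketch of how these rules might be checked before a row is inserted. The function name `missing_required_fields` and the dictionary row shape are assumptions for illustration; rejecting kinds outside the two groups listed above is also an assumption, since the source leaves other kinds unspecified.

```python
from typing import Dict, List, Optional

ENGINE_REQUIRED = ("execution_ref", "build_hash", "version")
DECLARATIVE_KINDS = {"SCHEMA", "MANIFEST", "RULESET", "TAXONOMY", "DOC"}
DECLARATIVE_REQUIRED = ("storage_uri", "content_type", "build_hash", "version")


def missing_required_fields(row: Dict[str, Optional[str]]) -> List[str]:
    """Return the required fields that are absent or empty for the row's kind."""
    kind = row.get("kind")
    if kind == "ENGINE":
        required = ENGINE_REQUIRED
    elif kind in DECLARATIVE_KINDS:
        required = DECLARATIVE_REQUIRED
    else:
        # Assumption: kinds not covered by the rules are rejected outright.
        raise ValueError(f"no field rules defined for kind {kind!r}")
    return [field for field in required if not row.get(field)]


engine_row = {
    "kind": "ENGINE",
    "execution_ref": "oci://ghcr.io/viroway/mice-ghg-abs-calculator:1.0.0",
    "build_hash": "sha256:3c7a...9f12",
    "version": "1_0_0",
    "storage_uri": None,  # optional for engines
}
schema_row = {
    "kind": "SCHEMA",
    "storage_uri": "s3://zar-artifacts/schemas/compute/ghg/common/options/1.0.0/GHG_common_options.json",
    "content_type": None,  # deliberately missing to show a violation
    "build_hash": "sha256:b19e...2aa7",
    "version": "1_0_0",
}
```

Here `missing_required_fields(engine_row)` reports nothing, while the schema row is flagged for its missing `content_type`. Note that `execution_ref` is never demanded of a declarative artifact.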
5. Address vs Code vs CMI (Resolution Logic)
To eliminate ambiguity, ZAR uses three distinct identifier layers, each with a specific role.
5.1. Canonical Managed Identifier (CMI)
- Example: COMPUTE.GHG.SCHEMA.CommonOptions.1_0_0
- Purpose:
- primary key
- governance identity
- audit reproducibility
- Never reused or reassigned.
5.2. Compact Aliases
- cmi_short_code (e.g. 000042)
  - used in runtime lineage (USO tails, telemetry)
  - optimized for size and speed
- zar_code (e.g. TG3K7M2A)
  - used in audit logs, signatures, federation
5.3. ZAR-Native Rule Artifact Pattern
5.3.1. Address Pattern
ZAR addresses typically follow the pattern:
[layer].[domain].[subdomain].[object].[context].X_Y_Z
The first segment defines artifact type:
| Layer | Meaning |
|---|---|
| schema | schema artifact |
| signal | signal artifact |
| engine | engine artifact |
| agent | assistant artifact |
| ruleset | ruleset artifact |
The second segment depends on the layer. For ruleset artifacts, for example, it takes one of the following values:
| Value | Meaning |
|---|---|
| validation | structural/data integrity rule |
| compute | computational logic |
| transform | conversion logic |
| aggregate | summarization logic |
| governance | policy enforcement |
| risk | risk categorization |
| ai | AI-assisted inference |
5.3.2. ZAR Address (Human Alias)
- Example:
schema.compute.GHG.common.options.1_0_0
- Purpose:
- configuration
- documentation
- developer ergonomics
- Not an identity; resolves to a registry row.
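The address pattern from 5.3.1 can be split mechanically. The sketch below is an illustration only: `ZarAddress` and `parse_zar_address` are hypothetical names, and it assumes the first segment is the layer, the second the domain, the last an `X_Y_Z` version, and everything in between a variable-length path (the examples in this document have either two or three middle segments).

```python
import re
from typing import NamedTuple, Tuple

_VERSION_RE = re.compile(r"^(\d+)_(\d+)_(\d+)$")


class ZarAddress(NamedTuple):
    layer: str                 # e.g. "schema", "engine", "ruleset"
    domain: str                # e.g. "compute"
    path: Tuple[str, ...]      # remaining name segments
    version: Tuple[int, ...]   # (major, minor, patch)


def parse_zar_address(address: str) -> ZarAddress:
    """Split a concrete ZAR address into layer, domain, path and version."""
    segments = address.split(".")
    # Assumption: at least layer, domain, one name segment and a version.
    if len(segments) < 4:
        raise ValueError(f"too few segments in {address!r}")
    match = _VERSION_RE.match(segments[-1])
    if match is None:
        raise ValueError(f"last segment of {address!r} is not X_Y_Z")
    version = tuple(int(part) for part in match.groups())
    return ZarAddress(segments[0], segments[1], tuple(segments[2:-1]), version)


parsed = parse_zar_address("schema.compute.GHG.common.options.1_0_0")
```

Parsing is for tooling convenience only; the address remains an alias and must still be resolved through the registry.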
5.4. Resolution Flow
zar_address
→ registry lookup
→ cmi
→ execution_ref OR storage_uri
→ build_hash, version, status
At runtime:
- only cmi_short_code is required in lineage;
- resolution to full metadata happens post hoc via the registry.
This is analogous to:
- image tag → image digest,
- DNS name → IP address,
- Git tag → commit hash.
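The resolution flow can be sketched against an in-memory registry. The row values come from Examples 1 and 2 (elided hashes are kept as written); the function name `resolve_address` and the `locator` key are assumptions made for this illustration.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]

REGISTRY: List[Row] = [
    {
        "cmi": "MICE.InvoiceEmissions.ENGINE.AbsCalculator.1_0_0",
        "zar_address": "engine.compute.ghg.abs_calculator.1_0_0",
        "execution_ref": "oci://ghcr.io/viroway/mice-ghg-abs-calculator:1.0.0",
        "storage_uri": None,
        "build_hash": "sha256:3c7a...9f12",
        "version": "1_0_0",
        "status": "active",
    },
    {
        "cmi": "MICE.InvoiceEmissions.SCHEMA.CommonOptions.1_0_0",
        "zar_address": "schema.compute.ghg.common.options.1_0_0",
        "execution_ref": None,
        "storage_uri": "s3://zar-artifacts/schemas/compute/ghg/common/options/1.0.0/GHG_common_options.json",
        "build_hash": "sha256:b19e...2aa7",
        "version": "1_0_0",
        "status": "active",
    },
]


def resolve_address(address: str, registry: List[Row]) -> Row:
    """zar_address -> registry row -> cmi plus execution_ref OR storage_uri."""
    for row in registry:
        if row.get("zar_address") == address:
            return {
                "cmi": row["cmi"],
                # Engines carry an execution_ref; declarative artifacts a storage_uri.
                "locator": row["execution_ref"] or row["storage_uri"],
                "build_hash": row["build_hash"],
                "version": row["version"],
                "status": row["status"],
            }
    raise KeyError(f"unknown zar_address: {address}")


resolved = resolve_address("schema.compute.ghg.common.options.1_0_0", REGISTRY)
```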
6. Versioning and Immutability
- All released artifacts use full semantic versions (MAJOR.MINOR.PATCH, e.g. 1_0_0).
- Artifact content is immutable once registered.
- Patch changes increment PATCH.
- zar_address may have alias lanes (e.g. v1) that resolve to the latest compatible concrete version.
- Audit and replay always resolve to a concrete version, never an alias.
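A sketch of how an alias lane might be pinned to a concrete version. It assumes that "compatible" means "same MAJOR version" and that versions are compared numerically; neither is stated in the source, and `resolve_lane` is a hypothetical helper.

```python
from typing import Iterable, Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Turn '1_10_1' into (1, 10, 1) so versions compare numerically."""
    major, minor, patch = (int(part) for part in text.split("_"))
    return major, minor, patch


def resolve_lane(lane: str, versions: Iterable[str]) -> str:
    """Return the highest registered version inside the lane (e.g. 'v1')."""
    if not (lane.startswith("v") and lane[1:].isdigit()):
        raise ValueError(f"not an alias lane: {lane!r}")
    major = int(lane[1:])
    # Assumption: a lane covers every version sharing its MAJOR number.
    candidates = [v for v in versions if parse_version(v)[0] == major]
    if not candidates:
        raise LookupError(f"no concrete version registered for lane {lane}")
    return max(candidates, key=parse_version)


registered = ["1_0_0", "1_2_0", "1_10_1", "2_0_0"]
pinned = resolve_lane("v1", registered)
```

Comparing parsed tuples rather than raw strings matters here: as strings, `1_2_0` would sort after `1_10_1`. The concrete version returned is what audit and replay records should store, never the lane itself.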
7. Relationship to ZSSR and Lineage
The ZAYAZ Smart System Router (ZSSR) uses:
- CSI (signal type),
- producer cmi_short_code,
- contract CMIs (schemas),
- trust and regulatory context,
to route signals and computations deterministically.
Every output can be traced back to:
- which engine ran,
- which contract constrained it,
- which exact build produced it.
8. Summary (Normative)
- ZAR uses one unified registry for all artifacts.
- CMI is the canonical identity.
- Short codes are for runtime lineage.
- Addresses are for humans.
- Executable and declarative artifacts are first-class, but distinct by kind.
- Nothing moves, nothing overwrites, everything is traceable.
ZAR is not just a registry. It is the memory of the system.
9. Federation-Safe Identifier Policy (ZAR v1)
This policy ensures identifiers remain portable across environments and partners, while still supporting compact runtime lineage and audit-grade reproducibility.
9.1. zar_code generation
Goal: Same artifact identity ⇒ same zar_code in every environment.
Inputs
- cmi (canonical managed identifier, PK in zar_registry)
Canonical string
- CANON = "cmi:" + <CMI>
- Example: cmi:MICE.InvoiceEmissions.SCHEMA.CommonOptions.1_0_0
Hash
- H = SHA-256( UTF8(CANON) )
Encode
- zar_code = Base32Crockford( H )[0:8]
- Base32 Crockford alphabet: 0123456789ABCDEFGHJKMNPQRSTVWXYZ
- Output is 8 characters, uppercase, no separators.
Normative rule
- zar_code MUST be generated in CI/CD from the cmi using the above steps.
- zar_code MUST NOT be hand-edited.
9.2. Required federation export bundle contents
When exchanging provenance, audit evidence, or running federated verification, systems MUST export an artifact bundle containing:
- Registry snapshot (minimal)
  A machine-readable list of all artifacts referenced by the exported USO chains and/or report computations, including at least:
  - cmi
  - zar_code
  - zar_address (if present)
  - kind
  - version
  - build_hash
  - git_sha (if available)
  - execution_ref (ENGINE only)
  - storage_uri + content_type (non-ENGINE artifacts)
  - status
  - dependencies
- Integrity proofs
  - For each non-ENGINE artifact (schemas, manifests, rulesets, taxonomies):
    - the artifact bytes OR a resolvable storage_uri
    - build_hash (sha256 digest) matching the bytes
  - For each ENGINE artifact:
    - execution_ref must resolve to an immutable executable reference (preferred: OCI digest form), e.g. oci://…@sha256:<digest>
    - plus build_hash to bind the build identity to the registry record
- Resolution rules
  - A small metadata file stating:
    - zar_code generation policy version (this document version)
    - hash and encoding rules used
    - any permitted alias behavior (e.g., zar_address lanes such as v1)
Why: Partners/verifiers must be able to validate “what ran / what constrained” without needing access to your internal DB sequences.
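On the receiving side, the integrity proof for a declarative artifact reduces to re-hashing the bytes. The sketch below assumes that build_hash has the form `sha256:` followed by the lowercase hex digest of the artifact content, which matches the examples in Section 2 but is not spelled out as a rule; `build_hash_of` and `verify_artifact` are hypothetical helper names.

```python
import hashlib
from typing import Dict


def build_hash_of(content: bytes) -> str:
    """Assumed build_hash format: 'sha256:' + lowercase hex digest of the bytes."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def verify_artifact(entry: Dict[str, str], content: bytes) -> bool:
    """Check that exported bytes match the build_hash in the registry snapshot."""
    return entry["build_hash"] == build_hash_of(content)


# A toy schema body stands in for the real artifact bytes.
schema_bytes = b'{"type": "object"}'
snapshot_entry = {
    "cmi": "MICE.InvoiceEmissions.SCHEMA.CommonOptions.1_0_0",
    "kind": "SCHEMA",
    "build_hash": build_hash_of(schema_bytes),
}

ok = verify_artifact(snapshot_entry, schema_bytes)
tampered = verify_artifact(snapshot_entry, schema_bytes + b" ")
```

A verifier needs only the snapshot entry and the bytes (or a resolvable storage_uri) to run this check; no access to the exporter's database is involved.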
9.3. Collision handling rules (8-char Base32)
An 8-character Base32 code carries 40 bits of the hash, so collisions are unlikely but become possible as the registry grows. ZAR MUST handle them deterministically.
Detection
- On registry insertion, if generated zar_code already exists for a different cmi, this is a collision.
Resolution
- Recompute using longer prefix:
- zar_code = Base32Crockford(H)[0:10]
- If still colliding (extremely unlikely), extend again:
- 12, then 16 characters
Normative rule
- Length extension MUST preserve the original hash H and encoding, only increasing prefix length.
- Once an artifact has a zar_code, it MUST NEVER change.
Operational recommendation
- Default to 8 chars; enable automatic extension in CI and registry tooling.
9.4. Which IDs are allowed in ZSSR rulesets (and why)
ZSSR rulesets define routing decisions. In a federation model, routing identifiers must be:
- compact,
- stable across partner environments,
- non-leaking of internal naming where possible,
- resolvable via the ZAR registry snapshot.
Allowed identifiers in ZSSR rulesets
- ✅ zar_code (preferred)
- Example: "next_zar_code": "TG3K7M2A"
- Why: portable, compact, federation-safe, resolves via registry snapshot.
- ✅ cmi (allowed, internal/debug)
- Example: "next_cmi": "DAVE.TrustGate.ENGINE.Core.1_0_0"
- Why: unambiguous and deterministic, but longer and may expose internal structure.
- ⚠️ cmi_short_code (NOT allowed for federation routing)
- Why: sequence-based and environment-local, guaranteed to differ across partners.
- ✅ zar_address (allowed for configuration/docs, not recommended for routing)
- Why: human-readable alias; may include alias lanes (e.g., v1) that require resolution and can introduce ambiguity if not pinned to a concrete version.
Normative rule
- Federated routing MUST use zar_code for next_* fields.
- cmi_short_code MUST be used only in runtime lineage tails inside a single environment.
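A ruleset linter could enforce the normative rule before a ruleset is published to partners. The sketch assumes each rule is a flat dictionary whose routing targets live in keys beginning with `next_`, as in the examples above; the real ruleset structure is not defined in this section, and `federation_violations` is a hypothetical name.

```python
from typing import Dict, List

CROCKFORD = set("0123456789ABCDEFGHJKMNPQRSTVWXYZ")
ALLOWED_LENGTHS = {8, 10, 12, 16}  # default length plus collision extensions


def is_zar_code(value: object) -> bool:
    """True for an uppercase Crockford Base32 string of a permitted length."""
    return (
        isinstance(value, str)
        and len(value) in ALLOWED_LENGTHS
        and set(value) <= CROCKFORD
    )


def federation_violations(rule: Dict[str, object]) -> List[str]:
    """List the reasons a rule is unsafe for federated routing."""
    problems = []
    for key, value in rule.items():
        if not key.startswith("next_"):
            continue
        if key != "next_zar_code":
            problems.append(f"{key}: federated routing must use next_zar_code")
        elif not is_zar_code(value):
            problems.append(f"{key}: {value!r} is not a well-formed zar_code")
    return problems


good = federation_violations({"next_zar_code": "TG3K7M2A"})
bad = federation_violations({"next_cmi": "DAVE.TrustGate.ENGINE.Core.1_0_0"})
```

The check is purely syntactic; confirming that a code actually resolves still requires the registry snapshot from the export bundle.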
Summary (implementation checklist)
- Generate zar_code in CI from cmi using SHA-256 + Base32 Crockford prefix.
- Export bundles must include registry snapshot + integrity proofs.
- Detect collisions and extend code length deterministically.
- ZSSR rulesets route by zar_code (portable), never by cmi_short_code.
10. ZAR Identifier Generation
# ZAR Identifier Generation (Language-Agnostic Pseudocode) — v1
# Purpose: Deterministic, federation-safe IDs derived from CMI.
# Notes:
# - All hashing is SHA-256 over UTF-8 bytes.
# - Base32 uses Crockford alphabet: 0123456789ABCDEFGHJKMNPQRSTVWXYZ
# - All outputs MUST be uppercase.
ALPHABET_CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
function UTF8_BYTES(s: string) -> bytes
    # returns UTF-8 encoded byte array
    ...

function SHA256(data: bytes) -> bytes
    # returns 32-byte digest
    ...

function HEX_LOWER(data: bytes) -> string
    # returns lowercase hex string of bytes, length = 2*len(data)
    ...

function BASE32_CROCKFORD(data: bytes) -> string
    # encodes bytes into Base32 using Crockford alphabet, no padding
    bitBuffer = 0
    bitCount = 0
    out = ""
    for each byte in data:
        bitBuffer = (bitBuffer << 8) OR byte
        bitCount += 8
        while bitCount >= 5:
            bitCount -= 5
            index = (bitBuffer >> bitCount) AND 31
            out += ALPHABET_CROCKFORD[index]
            # drop the bits just emitted so bitBuffer never exceeds 12 bits
            # (avoids overflow in languages with fixed-width integers)
            bitBuffer = bitBuffer AND ((1 << bitCount) - 1)
    if bitCount > 0:
        # left-align the remaining bits in a final 5-bit group
        index = (bitBuffer << (5 - bitCount)) AND 31
        out += ALPHABET_CROCKFORD[index]
    return out

function NORMALIZE_CMI(cmi: string) -> string
    # MUST be consistent across systems; at minimum:
    # - trim leading/trailing whitespace
    # - remove CR/LF
    # - replace NBSP with space
    cmi = replace(cmi, NBSP, " ")
    cmi = replace(cmi, "\r", "")
    cmi = replace(cmi, "\n", "")
    cmi = trim(cmi)
    return cmi

# --- Draft placeholders (Excel/local convenience; CI replaces with real values later) ---
function BUILD_HASH_PLACEHOLDER(cmi: string) -> string
    cmiN = NORMALIZE_CMI(cmi)
    canon = "build:" + cmiN
    digest = SHA256(UTF8_BYTES(canon))
    return "sha256:" + HEX_LOWER(digest)

function GIT_SHA_PLACEHOLDER(cmi: string) -> string
    cmiN = NORMALIZE_CMI(cmi)
    canon = "git:" + cmiN
    digest = SHA256(UTF8_BYTES(canon))
    # Git SHA-like placeholder: first 40 hex chars
    return substring(HEX_LOWER(digest), 0, 40)

# --- Federation-safe portable alias (authoritative; MUST match across environments) ---
function ZAR_CODE_FROM_CMI(cmi: string, length: int = 8) -> string
    cmiN = NORMALIZE_CMI(cmi)
    canon = "cmi:" + cmiN
    digest = SHA256(UTF8_BYTES(canon))
    b32 = BASE32_CROCKFORD(digest)          # no padding
    zarCode = substring(b32, 0, length)     # default 8 chars
    return zarCode
# --- Collision policy (registry insertion) ---
# If generated zar_code already exists for a different CMI:
# increase length: 8 -> 10 -> 12 -> 16 (same digest, longer prefix)
# once assigned, zar_code MUST NEVER change for that artifact.
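For teams that want something executable, the pseudocode above ports to Python almost line for line. This is a sketch and not an authoritative reference implementation; the function names are Python-style renderings of the pseudocode names, and no expected zar_code values are shown because they depend on the actual SHA-256 digest.

```python
import hashlib

ALPHABET_CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
NBSP = "\u00a0"


def base32_crockford(data: bytes) -> str:
    """Encode bytes as unpadded Crockford Base32, 5 bits per character."""
    bit_buffer = 0
    bit_count = 0
    out = []
    for byte in data:
        bit_buffer = (bit_buffer << 8) | byte
        bit_count += 8
        while bit_count >= 5:
            bit_count -= 5
            out.append(ALPHABET_CROCKFORD[(bit_buffer >> bit_count) & 31])
            bit_buffer &= (1 << bit_count) - 1  # keep only unconsumed bits
    if bit_count > 0:
        # Left-align the leftover bits in a final 5-bit group.
        out.append(ALPHABET_CROCKFORD[(bit_buffer << (5 - bit_count)) & 31])
    return "".join(out)


def normalize_cmi(cmi: str) -> str:
    """Replace NBSP with space, remove CR/LF, then trim surrounding whitespace."""
    return cmi.replace(NBSP, " ").replace("\r", "").replace("\n", "").strip()


def zar_code_from_cmi(cmi: str, length: int = 8) -> str:
    """Prefix of Crockford Base32(SHA-256('cmi:' + normalized CMI))."""
    canon = "cmi:" + normalize_cmi(cmi)
    digest = hashlib.sha256(canon.encode("utf-8")).digest()
    return base32_crockford(digest)[:length]


code = zar_code_from_cmi("MICE.InvoiceEmissions.SCHEMA.CommonOptions.1_0_0")
longer = zar_code_from_cmi("MICE.InvoiceEmissions.SCHEMA.CommonOptions.1_0_0", 10)
```

The 10-character form always begins with the 8-character form, which is what makes the collision extension in 9.3 deterministic.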