CIA
Canonical Identifier Architecture
The Canonical Identifier Architecture (CIA) defines the unified identity system used across the ZAYAZ platform to ensure traceability, interoperability, and auditability of all ESG data, computations, and artifacts.
It establishes a deterministic and immutable naming framework that connects:
- What a signal represents (semantic identity)
- Which component produced or processed it (artifact identity)
- Which specific instance is being observed (runtime identity)
Together, these identifiers form the ZAYAZ Identifier Trinity, enabling full lifecycle traceability from raw input to verified disclosure.
1. The Identifier Trinity
ZAYAZ distinguishes between three distinct but interlinked identity layers:
| Layer | Identifier | Registry | Purpose |
|---|---|---|---|
| Instance | USO ID | USO (runtime) | Identifies a unique occurrence of a signal |
| Type | CSI (Canonical Signal Identifier) | SSSR | Defines the semantic type of the signal |
| Artifact | CMI (Canonical Managed Identifier) | ZAR | Identifies the component (engine/module) that produced or processed it |
Key Principle
Only the USO ID is created at runtime. CSI and CMI are pre-defined, immutable identifiers reused across all instances.
This separation ensures:
- deterministic lineage
- replayable audit trails
- strict separation of data, semantics, and execution
2. Canonical Signal Identifier (CSI)
The CSI defines the semantic identity of a signal. It is stored and governed within the SSSR (Smart Searchable Signal Registry) and assigned at the column or schema level.
Purpose
- Define what the signal is
- Provide stable semantic meaning across the platform
- Enable discoverability, mapping, and regulatory alignment
NOTE: module_code is the primary routing namespace
This enables:
- ZSSR routing
- scaling across clusters
- federation (EGFS)
- permission scoping
- billing segmentation
Example:
comp.* → Computation cluster
vera.* → Verification pipelines
netz.* → Climate modeling services
CSI Format Specification
<module_code>.<COMPONENT_ID>.<KIND>.<NAME>.v<MAJOR>_<MINOR>
Segment Definitions
| Segment | Description |
|---|---|
| MODULE_CODE | Top-level ZAYAZ module (e.g. comp, vera, inpt, netz, risk) |
| COMPONENT_ID | Unique frontmatter ID of the module/component (e.g. PEF-ME, ZAR-FW, TG-CORE) |
| KIND | Role of the signal (INPUT, OUTPUT, SIGNAL, METRIC, FEATURE, SCHEMA, CONFIG, EVENT, VIEW) |
| NAME | Canonical semantic identifier (uppercase, underscore-separated) |
| VERSION | Semantic version of the signal definition (major_minor), prefixed with "v" |
Examples
comp.PEF-ME.OUTPUT.CO2E.v1_0
vera.TG-CORE.OUTPUT.TRUST_SCORE.v1_0
inpt.FOGE-FORM.INPUT.WATER_USE.v1_0
netz.DECARB-MODEL.METRIC.ABATEMENT_COST.v2_0
risk.RIF-ENGINE.EVENT.RISK_ALERT.v1_1
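The format and examples above are regular enough to be validated mechanically. The following is a minimal sketch, not the platform's parser: the names `CSI_PATTERN`, `CSI`, and `parse_csi` are hypothetical, and the character classes (lowercase module code, hyphen-separated component ID, underscore-separated name) are assumptions inferred from the examples rather than stated rules.

```python
import re
from typing import NamedTuple

# Kinds listed in the Segment Definitions table.
CSI_KINDS = {"INPUT", "OUTPUT", "SIGNAL", "METRIC", "FEATURE",
             "SCHEMA", "CONFIG", "EVENT", "VIEW"}

# Assumption: character classes are inferred from the documented examples.
CSI_PATTERN = re.compile(
    r"^(?P<module_code>[a-z]+)\."
    r"(?P<component_id>[A-Z0-9]+(?:-[A-Z0-9]+)*)\."
    r"(?P<kind>[A-Z]+)\."
    r"(?P<name>[A-Z0-9]+(?:_[A-Z0-9]+)*)\."
    r"v(?P<major>\d+)_(?P<minor>\d+)$"
)


class CSI(NamedTuple):
    module_code: str
    component_id: str
    kind: str
    name: str
    major: int
    minor: int


def parse_csi(text: str) -> CSI:
    """Split a CSI string into its segments, rejecting malformed input."""
    m = CSI_PATTERN.match(text)
    if m is None:
        raise ValueError(f"not a well-formed CSI: {text!r}")
    if m["kind"] not in CSI_KINDS:
        raise ValueError(f"unknown CSI kind: {m['kind']!r}")
    return CSI(m["module_code"], m["component_id"], m["kind"], m["name"],
               int(m["major"]), int(m["minor"]))


csi = parse_csi("risk.RIF-ENGINE.EVENT.RISK_ALERT.v1_1")
print(csi.component_id, csi.kind, csi.major, csi.minor)
```

Because the pattern is anchored at both ends, a string with an extra or missing segment is rejected rather than partially parsed.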
Design Rules
- CSI is immutable once published
- Semantic changes require a major version increment
- Minor metadata changes increment minor version only
- COMPONENT_ID must match the frontmatter ID in the ZAYAZ manual
- All segments must be machine-parseable and globally unique in combination
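The versioning rules above imply that a change never edits a published CSI; it yields a new identifier. A minimal sketch of that rule follows. The function name `bump_csi_version` is hypothetical, and resetting the minor number to 0 on a major increment is an assumption that the rules do not state.

```python
def bump_csi_version(csi: str, semantic_change: bool) -> str:
    """Return a NEW CSI string; the published identifier is left untouched."""
    base, _, version = csi.rpartition(".v")
    major, minor = (int(part) for part in version.split("_"))
    if semantic_change:
        # Semantic changes require a major increment.
        # Assumption: the minor number restarts at 0.
        major, minor = major + 1, 0
    else:
        # Minor metadata changes increment the minor number only.
        minor += 1
    return f"{base}.v{major}_{minor}"


print(bump_csi_version("comp.PEF-ME.OUTPUT.CO2E.v1_0", semantic_change=True))
print(bump_csi_version("comp.PEF-ME.OUTPUT.CO2E.v1_0", semantic_change=False))
```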
Important Constraint
CSI does not have its own registry. It is defined and stored within the SSSR signal registry, where each signal field is bound to a specific CSI.
3. Canonical Managed Identifier (CMI)
The CMI defines the identity of a code artifact (engine, schema, model, ruleset, etc.) and is governed within the ZAR (ZAYAZ Artifact Registry).
Purpose
- Identify who produced or processed the signal
- Enable reproducibility of computations
- Support lineage tracking and auditability
CMI Format Specification
<module_code>.<COMPONENT_ID>.<CMI_KIND>.<ARTIFACT_NAME>.<MAJOR>_<MINOR>_<PATCH>
Segment Definitions
| Segment | Description |
|---|---|
| module_code | Same as CSI |
| COMPONENT_ID | Same frontmatter ID as CSI (shared reference point) |
| CMI_KIND | Artifact category (ENGINE, SCHEMA, MODEL, RULESET, CONNECTOR, UI, JOB, LIB, TEST) |
| ARTIFACT_NAME | Artifact or sub-function name |
| VERSION | Semantic version (major_minor_patch) or date-based version |
Examples
comp.PEF-ME.ENGINE.CORE.1_1_0
vera.TG-CORE.ENGINE.VALIDATOR.1_0_0
siss.ROUTER.RULESET.INVOICE_LINES.2025_10_01
inpt.FOGE-FORM.UI.TRUST_REVIEW.1_2_0
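A CMI can be parsed the same way as a CSI, with one extra wrinkle: semantic and date-based versions share the same three-number shape. The sketch below is illustrative only. The names `CMI_PATTERN`, `CMI`, and `parse_cmi` are hypothetical, the character classes are inferred from the examples, and treating a four-digit first number as a year is a heuristic assumption, not a documented rule.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Artifact categories listed in the Segment Definitions table.
CMI_KINDS = {"ENGINE", "SCHEMA", "MODEL", "RULESET", "CONNECTOR",
             "UI", "JOB", "LIB", "TEST"}

# Assumption: character classes are inferred from the documented examples.
CMI_PATTERN = re.compile(
    r"^(?P<module_code>[a-z]+)\."
    r"(?P<component_id>[A-Z0-9]+(?:-[A-Z0-9]+)*)\."
    r"(?P<kind>[A-Z]+)\."
    r"(?P<name>[A-Z0-9]+(?:_[A-Z0-9]+)*)\."
    r"(?P<version>\d+_\d+_\d+)$"
)


@dataclass(frozen=True)
class CMI:
    module_code: str
    component_id: str
    kind: str
    name: str
    version: str
    released_on: Optional[date]  # set only for date-based versions


def parse_cmi(text: str) -> CMI:
    """Split a CMI string into segments and classify its version style."""
    m = CMI_PATTERN.match(text)
    if m is None or m["kind"] not in CMI_KINDS:
        raise ValueError(f"not a well-formed CMI: {text!r}")
    first, second, third = m["version"].split("_")
    released_on = None
    if len(first) == 4:
        # Heuristic (assumption): a four-digit first number is a year.
        # date() raises ValueError for an impossible calendar date.
        released_on = date(int(first), int(second), int(third))
    return CMI(m["module_code"], m["component_id"], m["kind"], m["name"],
               m["version"], released_on)


print(parse_cmi("comp.PEF-ME.ENGINE.CORE.1_1_0"))
print(parse_cmi("siss.ROUTER.RULESET.INVOICE_LINES.2025_10_01").released_on)
```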
ZAR Code (Short Identifier)
Each CMI is assigned a ZAR Code:
- Base32 short code (4–6 characters)
- Derived deterministically (e.g. from SHA-256)
Example:
TG3K7, MIE12, DSA9Q
Purpose:
- Efficient lineage tracking
- Compact routing (ZSSR)
- Provenance chain representation
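One possible derivation of a ZAR Code is sketched below, assuming the code is the leading characters of the RFC 4648 Base32 encoding of the SHA-256 digest of the full CMI string. The function name `zar_code` is hypothetical, and both the alphabet and the truncation rule are assumptions: the example codes above contain characters such as `1` and `9` that fall outside the RFC 4648 alphabet, so the real scheme may use a different alphabet.

```python
import base64
import hashlib


def zar_code(cmi_name: str, length: int = 5) -> str:
    """Derive a short, deterministic code from a full CMI string."""
    if not 4 <= length <= 6:
        raise ValueError("ZAR codes are 4 to 6 characters long")
    digest = hashlib.sha256(cmi_name.encode("utf-8")).digest()
    # Assumption: RFC 4648 Base32 (A-Z, 2-7), truncated to the requested length.
    return base64.b32encode(digest).decode("ascii")[:length]


code = zar_code("comp.PEF-ME.ENGINE.CORE.1_1_0")
print(code)
```

A 5-character Base32 code carries only 25 bits, so two different CMIs can map to the same code. Any scheme of this kind needs a uniqueness check at registration time, for example by lengthening the code on collision.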
4. Universal Signal Ontology (USO) Identifier
The USO ID represents a specific runtime instance of a signal.
Purpose
- Provide instance-level traceability
- Enable full replay of ESG data flows
- Anchor provenance chains
Characteristics
- Generated at runtime (ULID / UUIDv7)
- Globally unique
- Immutable
- Linked to:
  - CSI (signal type)
  - CMI (producing artifact)
  - origin_chain (processing history)
Example
uso_id = 01JBF0W8S9Q0R1S2T3U4V5W6X
Core Fields
| Field | Description |
|---|---|
| uso_id | Unique instance identifier |
| csi | Signal type (from SSSR) |
| primary_origin_cmi | Producing artifact (from ZAR) |
| origin_chain | Ordered list of CMIs |
| origin_chain_codes | ZAR short codes |
| born_at | Timestamp |
| context.eco_numbers | Optional entity references |
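The Characteristics above name ULID as one option for the USO ID. The sketch below shows how a ULID-style identifier can be generated with the standard library only, following the public ULID layout (48-bit millisecond timestamp, 80 random bits, 26 Crockford Base32 characters). The function name `new_ulid` is hypothetical, and nothing here is confirmed as the generator ZAYAZ uses.

```python
import os
import time

# Crockford Base32 alphabet: digits and uppercase letters without I, L, O, U.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def new_ulid() -> str:
    """Return a 26-character ULID-style identifier."""
    timestamp_ms = int(time.time() * 1000)              # 48-bit time component
    randomness = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    value = (timestamp_ms << 80) | randomness           # 128-bit integer
    chars = []
    for _ in range(26):                                 # 26 * 5 bits covers 128 bits
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


uso_id = new_ulid()
print(uso_id)
```

Because the timestamp occupies the most significant bits and the alphabet is in ascending ASCII order, identifiers created in different milliseconds sort by creation time, which is convenient for ordering lineage records.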
5. Identifier Relationships
The three identifiers work together as follows:
USO (instance)
↓
CSI (what it is)
↓
CMI (who produced it)
Illustrative Example
| Layer | Identifier | Example | Description |
|---|---|---|---|
| Artifact (who) | CMI | comp.PEF-ME.ENGINE.CORE.1_1_0 | Engine producing the signal |
| Signal Type (what) | CSI | comp.PEF-ME.OUTPUT.CO2E.v1_0 | Canonical signal definition |
| Instance (which) | USO ID | 01JBF0W8S9Q0R1S2T3U4V5W6X | Unique lineage instance |
| Routing | ZAR Code | MIE12 | Compact artifact reference |
| Provenance | origin_chain_codes | [MIE12, TG3K7] | Processing trail |
6. Canonical Signal Creation Flow
When a signal is generated, the ZAYAZ platform performs the following sequence:
- A module completes a computation or ingestion process
- A new USO ID is created
- The producing component’s CMI is assigned (from ZAR)
- The corresponding CSI is assigned (from SSSR)
- The origin_chain is initialized with the producing CMI
Result
Every signal instance is fully described by:
- USO ID → instance identity
- CSI → semantic meaning
- CMI → execution provenance
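The creation flow can be sketched as a single function that looks up the pre-defined identifiers and mints only the instance ID. This is an illustration under stated assumptions: `SignalRecord` and `create_signal` are hypothetical names, the SSSR is stood in for by a set of CSI strings, the ZAR by a mapping from CMI to ZAR Code, and the ID generator is injected by the caller.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Mapping, Set, Tuple


@dataclass(frozen=True)
class SignalRecord:
    uso_id: str
    csi: str
    primary_origin_cmi: str
    origin_chain: Tuple[str, ...]
    origin_chain_codes: Tuple[str, ...]
    born_at: datetime


def create_signal(csi: str, producing_cmi: str,
                  sssr: Set[str], zar: Mapping[str, str],
                  new_id: Callable[[], str]) -> SignalRecord:
    """Create one signal instance; CSI and CMI are looked up, never minted."""
    if csi not in sssr:
        raise KeyError(f"CSI not registered in SSSR: {csi}")
    if producing_cmi not in zar:
        raise KeyError(f"CMI not registered in ZAR: {producing_cmi}")
    return SignalRecord(
        uso_id=new_id(),                          # the only runtime-created ID
        csi=csi,                                  # semantic meaning
        primary_origin_cmi=producing_cmi,         # execution provenance
        origin_chain=(producing_cmi,),            # chain starts with the producer
        origin_chain_codes=(zar[producing_cmi],),
        born_at=datetime.now(timezone.utc),
    )


record = create_signal(
    "comp.PEF-ME.OUTPUT.CO2E.v1_0",
    "comp.PEF-ME.ENGINE.CORE.1_1_0",
    sssr={"comp.PEF-ME.OUTPUT.CO2E.v1_0"},
    zar={"comp.PEF-ME.ENGINE.CORE.1_1_0": "MIE12"},
    new_id=lambda: "01JBF0W8S9Q0R1S2T3U4V5W6X",
)
print(record.origin_chain_codes)
```

Making the record frozen mirrors the immutability requirement: later processing steps are recorded as new lineage data rather than by editing the instance.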
7. Design Principles
- Immutability of Identifiers
  - CSI, CMI, and USO IDs are never altered once issued
- Separation of Concerns
  - CSI → semantics (SSSR)
  - CMI → execution (ZAR)
  - USO → runtime lineage
- Deterministic Traceability
  - Every data point can be traced through its full processing chain
- Human + Machine Readability
  - Identifiers are structured for both audit interpretation and system parsing
- Documentation-Linked Identity
  - COMPONENT_ID aligns with frontmatter IDs, enabling direct traceability to system specifications
- Compliance by Design
  - Supports:
    - CSRD audit trails
    - ESRS data quality requirements
    - ISO 14064 traceability
8. Strategic Outcome - CIA
The Canonical Identifier Architecture transforms ZAYAZ into a system where:
Every ESG data point is uniquely identifiable, semantically defined, and fully traceable across its entire lifecycle.
This enables:
- deterministic ESG reporting
- verifiable supply chain transparency
- AI-assisted explainability
- regulatory-grade auditability
9. ZAR Data Model & Provenance Architecture
9.1. Overview
The ZAYAZ Artifact Registry (ZAR) Data Model defines how artifacts, signal instances, and their relationships are stored, linked, and governed across the platform.
It provides the structural foundation for:
- artifact registration (CMI)
- signal typing (CSI via SSSR)
- runtime lineage (USO)
- routing validation and policy enforcement (ZSSR)
The model is designed to ensure:
- full traceability
- deterministic replay
- separation of design-time and runtime concerns
- audit-grade data integrity
9.2. Architectural Separation
The ZAR data model is built on three distinct layers:
Layer Responsibilities
| Layer | Responsibility |
|---|---|
| Design-Time | Defines signals (CSI) and artifacts (CMI) |
| Runtime | Tracks signal instances and lineage |
| Policy | Governs allowed processing paths |
9.3. Design-Time: Artifact Registry (ZAR)
The ZAR registry maintains a catalog of all executable and structural artifacts in the system.
9.3.1. zar_cmi_registry
Stores all registered artifacts.
| Field | Type | Description |
|---|---|---|
| cmi_id | PK | Internal identifier |
| cmi_name | text | Full CMI (<module_code>.<COMPONENT_ID>.<CMI_KIND>.<ARTIFACT_NAME>.<MAJOR>_<MINOR>_<PATCH>) |
| module_code | text | e.g. comp, vera, inpt |
| component_id | text | Frontmatter ID (e.g. PEF-ME) |
| kind | text | ENGINE, SCHEMA, MODEL, RULESET, etc. |
| name | text | Artifact name |
| version | text | 1_0_0 or date-based |
| zar_code | text | Short Base32 identifier |
| description | text | Human-readable description |
| owner_team | text | Responsible team |
| runtime_class | text | Execution reference (container/class) |
| status | text | active / deprecated |
9.3.2. zar_cmi_alias
Optional alias mapping for human or legacy references.
| Field | Description |
|---|---|
| alias | Short alternative name |
| cmi_name | FK to registry |
| scope | human / internal / legacy |
9.3.3. zar_cmi_capabilities
Defines capabilities of each artifact.
| Field | Description |
|---|---|
| cmi_name | FK to registry |
| capability | e.g. ocr, validation, merkle_verify |
| value | JSON configuration |
9.4. Runtime Layer: Signal Provenance
Runtime data is modeled at the signal instance level, ensuring that each occurrence of a signal can follow a unique processing path.
9.4.1. signal_instances
Represents each individual signal occurrence (USO layer).
| Field | Type | Description |
|---|---|---|
| signal_instance_id | PK | UUID / ULID |
| csi | text | Canonical Signal Identifier |
| sssr_id | text | Optional SSSR reference |
| uso_name | text | Semantic class |
| born_at | timestamptz | Creation timestamp |
| source_ref | jsonb | External reference (file, tx_id, etc.) |
9.4.2. signal_lineage_events
Tracks all processing steps applied to a signal instance.
| Field | Type | Description |
|---|---|---|
| event_id | PK | Unique event ID |
| signal_instance_id | FK | Linked signal instance |
| cmi_name | FK | Processing artifact |
| occurred_at | timestamptz | Timestamp |
| input_refs | jsonb | Upstream references |
| output_refs | jsonb | Derived outputs |
| params | jsonb | Execution parameters |
| metrics | jsonb | Runtime metrics |
| trust_context_id | text | Validation context |
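Since every processing step is stored as a lineage event, the ordered `origin_chain` of an instance can be recomputed from the events alone. The sketch below assumes events are available as dictionaries keyed by the column names in the table above; the function name `origin_chain_from_events` is hypothetical.

```python
from datetime import datetime, timezone
from typing import Iterable, List, Mapping


def origin_chain_from_events(events: Iterable[Mapping],
                             signal_instance_id: str) -> List[str]:
    """Return the CMIs that touched one instance, oldest first."""
    own = [e for e in events if e["signal_instance_id"] == signal_instance_id]
    own.sort(key=lambda e: e["occurred_at"])
    return [e["cmi_name"] for e in own]


events = [
    {"event_id": "e2", "signal_instance_id": "s1",
     "cmi_name": "vera.TG-CORE.ENGINE.VALIDATOR.1_0_0",
     "occurred_at": datetime(2025, 1, 1, 12, 5, tzinfo=timezone.utc)},
    {"event_id": "e1", "signal_instance_id": "s1",
     "cmi_name": "comp.PEF-ME.ENGINE.CORE.1_1_0",
     "occurred_at": datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)},
    {"event_id": "e3", "signal_instance_id": "s2",
     "cmi_name": "comp.PEF-ME.ENGINE.CORE.1_1_0",
     "occurred_at": datetime(2025, 1, 1, 12, 1, tzinfo=timezone.utc)},
]
print(origin_chain_from_events(events, "s1"))
```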
9.4.3. trust_context_snapshots
Stores validation and scoring context.
| Field | Description |
|---|---|
| trust_context_id | PK |
| effective_weights | JSON |
| thresholds | JSON |
| modifiers | JSON |
| version_tag | text |
| captured_at | timestamp |
9.5. Policy Layer: Routing & Governance
The policy layer defines which artifacts are allowed to process specific signal types.
9.5.1. zar_signal_policy
Defines allowed or restricted processing paths.
| Field | Description |
|---|---|
| uso_name | Semantic class |
| csi_prefix | <module_code>.<COMPONENT_ID>.<KIND> |
| allowed_cmi | Array of permitted artifacts |
| deny_cmi | Array of restricted artifacts |
| notes | Policy explanation |
| version | Policy version |
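A policy row can be evaluated before an artifact is allowed to process a signal. The sketch below is one reading of the table, not a confirmed rule set: `SignalPolicy` and `is_permitted` are hypothetical names, the `uso_name` value `emissions.co2e` is invented for illustration, and three behaviours are assumptions (a missing allow-list permits everything, a deny entry overrides an allow entry, and a policy that does not match the signal imposes no restriction).

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SignalPolicy:
    uso_name: str
    csi_prefix: str
    allowed_cmi: Optional[Sequence[str]] = None  # None: no allow-list (assumption)
    deny_cmi: Sequence[str] = ()


def is_permitted(policy: SignalPolicy, uso_name: str,
                 csi: str, cmi_name: str) -> bool:
    """Decide whether an artifact may process a signal under one policy."""
    applies = (uso_name == policy.uso_name
               and csi.startswith(policy.csi_prefix + "."))
    if not applies:
        return True    # assumption: a non-matching policy does not restrict
    if cmi_name in policy.deny_cmi:
        return False   # assumption: deny takes precedence over allow
    return policy.allowed_cmi is None or cmi_name in policy.allowed_cmi


policy = SignalPolicy(
    uso_name="emissions.co2e",                       # hypothetical semantic class
    csi_prefix="comp.PEF-ME.OUTPUT",
    allowed_cmi=["comp.PEF-ME.ENGINE.CORE.1_1_0"],
    deny_cmi=["vera.TG-CORE.ENGINE.VALIDATOR.1_0_0"],
)
print(is_permitted(policy, "emissions.co2e",
                   "comp.PEF-ME.OUTPUT.CO2E.v1_0",
                   "comp.PEF-ME.ENGINE.CORE.1_1_0"))
```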
9.6. End-to-End Data Flow
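The complete data model supports a deterministic lifecycle: signal types (CSI) and artifacts (CMI) are registered at design time, signal instances and their lineage events are recorded at runtime, and policies govern which artifacts may process which signal types.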
9.7. Key Properties of the Model
- Separation of Concerns
  - SSSR → semantics (CSI)
  - ZAR → artifacts (CMI)
  - Runtime → lineage (USO)
- Instance-Level Lineage
  - Each signal instance has its own processing history.
- Many-to-Many Relationships
  - Signals can pass through multiple artifacts
  - Artifacts can process multiple signals
- Replayability
  - Full reconstruction of any signal path is possible.
- Policy vs History Separation
  - Policies define allowed behavior
  - Runtime stores actual behavior
9.8. Example Query Patterns
Lineage Trace
SELECT e.occurred_at, e.cmi_name, e.params
FROM signal_instances i
JOIN signal_lineage_events e
ON e.signal_instance_id = i.signal_instance_id
WHERE i.source_ref->>'tx_id' = :tx_id
ORDER BY e.occurred_at;
Signal Discovery
SELECT signal_instance_id, csi, born_at
FROM signal_instances
WHERE uso_name = :uso_name;
Policy Violation Detection
SELECT DISTINCT e.cmi_name
FROM signal_instances i
JOIN signal_lineage_events e
ON e.signal_instance_id = i.signal_instance_id
LEFT JOIN zar_signal_policy p
ON p.uso_name = i.uso_name
WHERE i.uso_name = :uso
AND (p.allowed_cmi IS NOT NULL
AND NOT (e.cmi_name = ANY(p.allowed_cmi)));
9.9. Strategic Outcome - Data Model
This data model enables ZAYAZ to operate as a fully traceable ESG computation infrastructure, where:
- every signal instance is uniquely identifiable
- every transformation is recorded
- every artifact is governed
- every decision is auditable
ZAR + SSSR + USO together form a deterministic ESG lineage system capable of supporting regulatory-grade assurance and AI-driven explainability at scale.
10. Graph Model & Knowledge Layer (Derived from ZAR)
While the relational ZAR data model provides the authoritative source of truth for artifacts, signal instances, lineage events, and policies, many of the most valuable ZAYAZ use cases require graph-native traversal rather than table-by-table querying.
Examples include:
- tracing all upstream dependencies of a disclosure
- finding which artifacts influenced a reported KPI
- identifying all signals touched by a deprecated component
- enabling ZARA to explain how a value was derived
- federating lineage across multiple E-C-O entities and assurance domains
For these purposes, ZAYAZ should maintain a derived graph model on top of the relational core.
The relational model remains the canonical persistence layer.
The graph model acts as the traversal, reasoning, and explainability layer.
10.1. Why a Graph Layer Is Needed
Relational tables are ideal for:
- integrity
- transactional writes
- schema governance
- audit logging
- deterministic persistence
Graphs are ideal for:
- multi-hop lineage traversal
- dependency analysis
- impact analysis
- explanation generation
- policy path validation
- cross-entity federation
A graph layer therefore gives ZAYAZ the ability to move from simple recordkeeping to navigable ESG intelligence.
10.2. Graph Design Principle
The graph model must follow one strict rule:
No graph node or edge may exist without a corresponding canonical source in the relational model.
This means:
- CSI nodes derive from SSSR
- CMI nodes derive from ZAR
- signal instance nodes derive from signal_instances
- lineage edges derive from signal_lineage_events
- policy edges derive from zar_signal_policy
- entity context derives from E-C-O references and runtime context
The graph is therefore:
- derived
- rebuildable
- verifiable
- non-authoritative for writes
10.3. Core Graph Entity Types
The graph model should contain the following primary node types.
A) SignalType
Represents a canonical semantic signal definition.
Derived from: SSSR
Primary key: csi
Examples:
comp.PEF-ME.OUTPUT.CO2E.v1_0
vera.TG-CORE.OUTPUT.TRUST_SCORE.v1_0
B) Artifact
Represents a registered executable or structural artifact.
Derived from: zar_cmi_registry
Primary key: cmi_name
Examples:
comp.PEF-ME.ENGINE.CORE.1_0_0
vera.TG-CORE.ENGINE.VALIDATOR.1_0_0
C) SignalInstance
Represents one runtime occurrence of a signal.
Derived from: signal_instances
Primary key: signal_instance_id / uso_id
D) LineageEvent
Represents a processing event in which an artifact touched or transformed a signal instance.
Derived from: signal_lineage_events
Primary key: event_id
E) Policy
Represents an allowed or denied routing/processing rule.
Derived from: zar_signal_policy
F) Entity
Represents a company, supplier, verifier, authority, or other context-bearing actor.
Derived from: E-C-O registry / runtime context
Primary key: eco_number
G) DocumentedComponent
Represents the manual-defined component referenced via frontmatter ID.
Derived from: component documentation / MDX frontmatter
Primary key: component_id
This node type is especially important because it connects the runtime system to the written specification and enables ZARA to explain components directly from the manual.
10.4. Core Relationship Types
The graph should model the following relationships.
| Relationship | From | To | Meaning |
|---|---|---|---|
| INSTANCE_OF | SignalInstance | SignalType | This runtime instance is of this signal type |
| PRODUCED_BY | SignalInstance | Artifact | This signal instance was first produced by this artifact |
| TOUCHED_BY | SignalInstance | Artifact | This artifact processed or evaluated this instance |
| RECORDED_IN | SignalInstance | LineageEvent | This event belongs to this instance |
| USED_ARTIFACT | LineageEvent | Artifact | This event executed this artifact |
| INPUT_TO | SignalInstance | LineageEvent | This signal instance was an input to this event |
| OUTPUT_FROM | SignalInstance | LineageEvent | This signal instance was produced from this event |
| ALLOWED_FOR | Policy | Artifact | This policy permits the artifact |
| DENIED_FOR | Policy | Artifact | This policy denies the artifact |
| APPLIES_TO | Policy | SignalType | This policy applies to this signal type or prefix |
| CONTEXT_OF | SignalInstance | Entity | This signal belongs to this E-C-O context |
| DOCUMENTED_AS | Artifact | DocumentedComponent | This artifact corresponds to this documented component |
| DEFINES | DocumentedComponent | SignalType | This documented component defines or emits this signal type |
| DEPENDS_ON | Artifact | Artifact | This artifact depends on another artifact |
10.5. Conceptual Graph Structure
10.6. Graph Projection from the Relational Model
The graph should be materialized through a controlled projection pipeline.
Source tables
| Relational Source | Graph Output |
|---|---|
| sssr_signals / signal registry | SignalType nodes |
| zar_cmi_registry | Artifact nodes |
| signal_instances | SignalInstance nodes |
| signal_lineage_events | LineageEvent nodes and traversal edges |
| zar_signal_policy | Policy nodes and policy edges |
| component frontmatter / docs index | DocumentedComponent nodes |
| E-C-O registry / runtime context | Entity nodes |
Projection rules
- Every registered CSI becomes one SignalType node.
- Every registered CMI becomes one Artifact node.
- Every runtime signal instance becomes one SignalInstance node.
- Every lineage event becomes one LineageEvent node.
- Every SignalInstance is connected to exactly one SignalType.
- Every producing artifact is linked using PRODUCED_BY.
- Every processing event creates USED_ARTIFACT, INPUT_TO, and OUTPUT_FROM edges where applicable.
- Every documented component becomes a DocumentedComponent node keyed by frontmatter ID.
- Every artifact with a matching component_id is connected to the documented component via DOCUMENTED_AS.
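Most of these projection rules can be expressed as a pure function from relational rows to nodes and edges. The sketch below is illustrative: `project_graph` is a hypothetical name, rows are assumed to be dictionaries keyed by the column names used in section 9, and deriving PRODUCED_BY from the earliest lineage event is an assumption, since the `signal_instances` table shown earlier carries no producing-artifact column. INPUT_TO and OUTPUT_FROM edges are omitted for brevity.

```python
from typing import Dict, Iterable, Mapping, Set, Tuple

Node = Tuple[str, str]         # (node type, primary key)
Edge = Tuple[Node, str, Node]  # (from, relationship, to)


def project_graph(signal_types: Iterable[Mapping], components: Iterable[Mapping],
                  artifacts: Iterable[Mapping], instances: Iterable[Mapping],
                  events: Iterable[Mapping]):
    """Build graph nodes and edges strictly from canonical rows."""
    nodes: Dict[Node, Mapping] = {}
    edges: Set[Edge] = set()

    for row in signal_types:
        nodes[("SignalType", row["csi"])] = row
    for row in components:
        nodes[("DocumentedComponent", row["component_id"])] = row
    for row in artifacts:
        art = ("Artifact", row["cmi_name"])
        nodes[art] = row
        doc = ("DocumentedComponent", row["component_id"])
        if doc in nodes:  # link only when the component is documented
            edges.add((art, "DOCUMENTED_AS", doc))
    for row in instances:
        inst = ("SignalInstance", row["signal_instance_id"])
        sig = ("SignalType", row["csi"])
        if sig not in nodes:  # derived-only rule: no dangling references
            raise ValueError(f"unregistered CSI: {row['csi']}")
        nodes[inst] = row
        edges.add((inst, "INSTANCE_OF", sig))

    produced: Set[Node] = set()
    for row in sorted(events, key=lambda e: e["occurred_at"]):
        ev = ("LineageEvent", row["event_id"])
        inst = ("SignalInstance", row["signal_instance_id"])
        art = ("Artifact", row["cmi_name"])
        if inst not in nodes or art not in nodes:
            raise ValueError(f"event {row['event_id']} has no canonical source")
        nodes[ev] = row
        edges.add((inst, "RECORDED_IN", ev))
        edges.add((ev, "USED_ARTIFACT", art))
        if inst not in produced:  # assumption: earliest event names the producer
            produced.add(inst)
            edges.add((inst, "PRODUCED_BY", art))
    return nodes, edges
```

Because the function takes only canonical rows as input and holds no state, running it again over the same tables reproduces the same graph, which is what makes the projection rebuildable.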
10.7. Why the Documentation Node Matters
The DocumentedComponent node is a strategic advantage for ZAYAZ.
It allows the platform to connect:
- runtime identity
- artifact identity
- semantic signal identity
- architecture documentation
- AI explanation context
This means ZARA can answer questions such as:
- “What does PEF-ME do?”
- “Why did this artifact generate CO2E?”
- “Which documented module is responsible for this trust score?”
- “What policies or assumptions are defined for this component?”
Without this link, documentation remains passive. With this link, the manual becomes an active reasoning surface.
10.8. Example Graph Traversal Use Cases
A) Explain a reported disclosure
Start from a reported KPI and traverse:
SignalType ← SignalInstance ← LineageEvent ← Artifact ← DocumentedComponent
This supports board-ready and verifier-ready explanation flows.
B) Impact analysis for artifact deprecation
Start from a deprecated Artifact node and traverse to:
- all SignalType nodes it emits or transforms
- all SignalInstance nodes affected historically
- all Policy nodes that reference it
- all disclosures downstream
This supports safe migration and governance.
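The deprecation traversal can be sketched over a plain set of edges using the relationship names from section 10.4. This is a simplified illustration: `impact_of` is a hypothetical name, edges are assumed to be `(from, relationship, to)` tuples with nodes written as `(type, key)` pairs, the policy key `p1` is invented, and downstream disclosures are not modelled because the document does not define a node type for them.

```python
from typing import Dict, Iterable, Set, Tuple

Node = Tuple[str, str]
Edge = Tuple[Node, str, Node]


def impact_of(cmi_name: str, edges: Iterable[Edge]) -> Dict[str, Set[Node]]:
    """Collect the instances, signal types, and policies tied to one artifact."""
    edges = list(edges)
    artifact = ("Artifact", cmi_name)
    # Hop 1: lineage events that executed the artifact.
    events = {src for src, rel, dst in edges
              if rel == "USED_ARTIFACT" and dst == artifact}
    # Hop 2: signal instances those events belong to.
    instances = {src for src, rel, dst in edges
                 if rel == "RECORDED_IN" and dst in events}
    # Hop 3: signal types of the affected instances.
    signal_types = {dst for src, rel, dst in edges
                    if rel == "INSTANCE_OF" and src in instances}
    policies = {src for src, rel, dst in edges
                if rel in ("ALLOWED_FOR", "DENIED_FOR") and dst == artifact}
    return {"instances": instances, "signal_types": signal_types,
            "policies": policies}


old = ("Artifact", "comp.PEF-ME.ENGINE.CORE.1_0_0")
new = ("Artifact", "comp.PEF-ME.ENGINE.CORE.1_1_0")
co2e = ("SignalType", "comp.PEF-ME.OUTPUT.CO2E.v1_0")
graph_edges = {
    (("LineageEvent", "e1"), "USED_ARTIFACT", old),
    (("LineageEvent", "e2"), "USED_ARTIFACT", new),
    (("SignalInstance", "s1"), "RECORDED_IN", ("LineageEvent", "e1")),
    (("SignalInstance", "s2"), "RECORDED_IN", ("LineageEvent", "e2")),
    (("SignalInstance", "s1"), "INSTANCE_OF", co2e),
    (("SignalInstance", "s2"), "INSTANCE_OF", co2e),
    (("Policy", "p1"), "ALLOWED_FOR", old),  # "p1" is a hypothetical policy key
}
report = impact_of("comp.PEF-ME.ENGINE.CORE.1_0_0", graph_edges)
print(sorted(report["instances"]))
```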
C) Federated assurance traversal
Start from an Entity node and traverse across:
- all signal instances in context
- all artifacts that processed them
- all trust and assurance events
- all linked verifier outputs
This becomes a foundation for cross-ECO assurance and federation. 
10.9. Suggested Graph Properties
SignalType
csi, module_code, component_id, kind, name, version, status, value_type, unit, framework_tags
Artifact
cmi_name, module_code, component_id, kind, name, version, zar_code, status, owner_team, runtime_class
SignalInstance
signal_instance_id, uso_id, csi, born_at, source_ref, trust_context_id
LineageEvent
event_id, occurred_at, params, metrics, trust_context_id
Policy
policy_key, uso_name, csi_prefix, version, notes
Entity
eco_number, entity_type, jurisdiction, status
DocumentedComponent
component_id, module_code, title, doc_path, status, owner_team
10.10. Recommended Graph Storage Strategy
ZAYAZ should treat the graph as a derived operational knowledge layer, not as a replacement for the relational core.
Recommended pattern:
- Write canonical records to relational stores first.
- Publish projection events when CSI, CMI, signal instance, lineage event, or policy records change.
- Update the graph projection asynchronously.
- Rebuild the graph from canonical stores when required.
This gives us:
- integrity from the relational layer
- speed and flexibility from the graph layer
- safe rebuildability
- deterministic explainability
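The recommended pattern can be illustrated with an in-memory stand-in. This is a toy sketch under stated assumptions: `ProjectionPipeline` is a hypothetical class, dictionaries stand in for the relational store and the graph, and a `queue.Queue` drained on demand stands in for an asynchronous projection worker.

```python
import queue


class ProjectionPipeline:
    def __init__(self):
        self.relational = {}         # canonical store stand-in: key -> row
        self.pending = queue.Queue() # projection events awaiting processing
        self.graph = {}              # derived graph stand-in: key -> node

    def write(self, key, row):
        """Steps 1 and 2: canonical write first, then publish an event."""
        self.relational[key] = dict(row)
        self.pending.put(key)

    def drain(self):
        """Step 3: apply pending events (a worker would do this asynchronously)."""
        while not self.pending.empty():
            key = self.pending.get()
            self.graph[key] = dict(self.relational[key])

    def rebuild(self):
        """Step 4: discard the graph and rebuild it from canonical data."""
        self.graph = {key: dict(row) for key, row in self.relational.items()}


pipeline = ProjectionPipeline()
pipeline.write("comp.PEF-ME.ENGINE.CORE.1_1_0", {"status": "active"})
print(pipeline.graph)   # graph lags behind until the event is applied
pipeline.drain()
print(pipeline.graph)
```

The graph is never written directly: every node enters through `drain` or `rebuild`, both of which read from the canonical store, so the graph remains a rebuildable projection rather than a second source of truth.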
10.11. Governance Rules for the Graph Layer
The graph layer must obey the following rules:
- Derived Only: No manual graph-only nodes or edges.
- Source Traceability: Every node and edge must store source references to relational origin.
- Rebuildability: The graph must be reconstructible from canonical tables.
- Version Awareness: CSI and CMI versions must be preserved as explicit properties.
- Tenant / Entity Isolation: Cross-entity traversals must obey RBAC, verifier permissions, and federation policies.
- AI Explainability Compatibility: Nodes and edges should remain interpretable by ZARA and ZAAM.
10.12. Strategic Outcome
By adding a graph layer on top of the ZAR relational model, ZAYAZ gains a new capability:
Not only storing ESG lineage, but navigating, explaining, and reasoning across it.
This turns ZAR from a registry into a live assurance intelligence fabric.
The relational model guarantees integrity. The graph model unlocks traversal. Together, they form the foundation for:
- explainable ESG intelligence
- cross-module dependency mapping
- advanced traceability
- federated assurance networks
- documentation-aware AI guidance