
CIA

Canonical Identifier Architecture

The Canonical Identifier Architecture (CIA) defines the unified identity system used across the ZAYAZ platform to ensure traceability, interoperability, and auditability of all ESG data, computations, and artifacts.

It establishes a deterministic and immutable naming framework that connects:

  • What a signal represents (semantic identity)
  • Which component produced or processed it (artifact identity)
  • Which specific instance is being observed (runtime identity)

Together, these identifiers form the ZAYAZ Identifier Trinity, enabling full lifecycle traceability from raw input to verified disclosure.


1. The Identifier Trinity

ZAYAZ distinguishes between three distinct but interlinked identity layers:

| Layer | Identifier | Registry | Purpose |
|---|---|---|---|
| Instance | USO ID | USO (runtime) | Identifies a unique occurrence of a signal |
| Type | CSI (Canonical Signal Identifier) | SSSR | Defines the semantic type of the signal |
| Artifact | CMI (Canonical Managed Identifier) | ZAR | Identifies the component (engine/module) that produced or processed it |

Key Principle

Only the USO ID is created at runtime. CSI and CMI are pre-defined, immutable identifiers reused across all instances.

This separation ensures:

  • deterministic lineage
  • replayable audit trails
  • strict separation of data, semantics, and execution

2. Canonical Signal Identifier (CSI)

The CSI defines the semantic identity of a signal. It is stored and governed within the SSSR (Smart Searchable Signal Registry) and assigned at the column or schema level.

Purpose

  • Define what the signal is
  • Provide stable semantic meaning across the platform
  • Enable discoverability, mapping, and regulatory alignment

NOTE: module_code is the primary routing namespace

This enables:

  • ZSSR routing
  • scaling across clusters
  • federation (EGFS)
  • permission scoping
  • billing segmentation

Example:

comp.* → Computation cluster
vera.* → Verification pipelines
netz.* → Climate modeling services
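
As a minimal sketch of prefix-based routing, the mapping above can be expressed as a lookup on the first identifier segment. The `ROUTES` table, the target names, and `route_for` are hypothetical and only illustrate the idea; they are not part of the ZSSR interface.

```python
# Hypothetical routing table: target names are illustrative labels only.
ROUTES = {
    "comp": "computation-cluster",
    "vera": "verification-pipelines",
    "netz": "climate-modeling-services",
}


def route_for(identifier: str) -> str:
    """Return the routing target for a CSI or CMI based on its module_code prefix."""
    module_code, sep, _rest = identifier.partition(".")
    if not sep or module_code not in ROUTES:
        raise KeyError(f"no route for module_code {module_code!r}")
    return ROUTES[module_code]
```

Because the module_code is always the first dot-separated segment, the same lookup works for both CSI and CMI strings.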

CSI Format Specification

<module_code>.<COMPONENT_ID>.<KIND>.<NAME>.v<MAJOR>_<MINOR>


Segment Definitions

| Segment | Description |
|---|---|
| module_code | Top-level ZAYAZ module (e.g. comp, vera, inpt, netz, risk) |
| COMPONENT_ID | Unique frontmatter ID of the module/component (e.g. PEF-ME, ZAR-FW, TG-CORE) |
| KIND | Role of the signal (INPUT, OUTPUT, SIGNAL, METRIC, FEATURE, SCHEMA, CONFIG, EVENT, VIEW) |
| NAME | Canonical semantic identifier (uppercase, underscore-separated) |
| VERSION | Semantic version of the signal definition (major_minor), prefixed with "v" |

Examples

comp.PEF-ME.OUTPUT.CO2E.v1_0
vera.TG-CORE.OUTPUT.TRUST_SCORE.v1_0
inpt.FOGE-FORM.INPUT.WATER_USE.v1_0
netz.DECARB-MODEL.METRIC.ABATEMENT_COST.v2_0
risk.RIF-ENGINE.EVENT.RISK_ALERT.v1_1

Design Rules

  • CSI is immutable once published
  • Semantic changes require a major version increment
  • Minor metadata changes increment minor version only
  • COMPONENT_ID must match the frontmatter ID in the ZAYAZ manual
  • All segments must be machine-parseable and globally unique in combination
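
The "machine-parseable" rule can be illustrated with a small parser. This is a sketch, not the platform's validator: the character classes in `CSI_PATTERN` are assumptions inferred from the examples above, and `parse_csi` is a hypothetical helper name.

```python
import re
from typing import NamedTuple

# KIND values as listed in the segment definitions.
CSI_KINDS = {"INPUT", "OUTPUT", "SIGNAL", "METRIC", "FEATURE",
             "SCHEMA", "CONFIG", "EVENT", "VIEW"}

# Assumption: allowed characters per segment are inferred from the examples.
CSI_PATTERN = re.compile(
    r"(?P<module_code>[a-z]+)\."
    r"(?P<component_id>[A-Z0-9]+(?:-[A-Z0-9]+)*)\."
    r"(?P<kind>[A-Z]+)\."
    r"(?P<name>[A-Z0-9]+(?:_[A-Z0-9]+)*)\."
    r"v(?P<major>\d+)_(?P<minor>\d+)"
)


class CSI(NamedTuple):
    module_code: str
    component_id: str
    kind: str
    name: str
    major: int
    minor: int


def parse_csi(text: str) -> CSI:
    """Split a CSI string into its five segments, rejecting malformed input."""
    m = CSI_PATTERN.fullmatch(text)
    if m is None:
        raise ValueError(f"not a well-formed CSI: {text!r}")
    if m["kind"] not in CSI_KINDS:
        raise ValueError(f"unknown CSI kind: {m['kind']!r}")
    return CSI(m["module_code"], m["component_id"], m["kind"], m["name"],
               int(m["major"]), int(m["minor"]))
```

For example, `parse_csi("risk.RIF-ENGINE.EVENT.RISK_ALERT.v1_1")` yields the module code `risk`, component `RIF-ENGINE`, kind `EVENT`, name `RISK_ALERT`, and version 1.1.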

Important Constraint

CSI does not have its own registry. It is defined and stored within the SSSR signal registry, where each signal field is bound to a specific CSI.


3. Canonical Managed Identifier (CMI)

The CMI defines the identity of a code artifact (engine, schema, model, ruleset, etc.) and is governed within the ZAR (ZAYAZ Artifact Registry).

Purpose

  • Identify who produced or processed the signal
  • Enable reproducibility of computations
  • Support lineage tracking and auditability

CMI Format Specification

<module_code>.<COMPONENT_ID>.<CMI_KIND>.<ARTIFACT_NAME>.<MAJOR>_<MINOR>_<PATCH>


Segment Definitions

| Segment | Description |
|---|---|
| module_code | Same as CSI |
| COMPONENT_ID | Same frontmatter ID as CSI (shared reference point) |
| CMI_KIND | Artifact category (ENGINE, SCHEMA, MODEL, RULESET, CONNECTOR, UI, JOB, LIB, TEST) |
| ARTIFACT_NAME | Artifact or sub-function name |
| VERSION | Semantic version (major_minor_patch) or date-based version |

Examples

comp.PEF-ME.ENGINE.CORE.1_1_0
vera.TG-CORE.ENGINE.VALIDATOR.1_0_0
siss.ROUTER.RULESET.INVOICE_LINES.2025_10_01
inpt.FOGE.UI.FORM_TRUST_REVIEW.1_2_0
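
A CMI can be split the same way as a CSI. The sketch below is illustrative: `parse_cmi` is a hypothetical helper and the character classes are assumptions drawn from the examples. Note that a semantic version (`1_1_0`) and a date-based version (`2025_10_01`) have the same three-part shape, so the parser returns a plain integer tuple and leaves the interpretation to the caller.

```python
import re

# CMI_KIND values as listed in the segment definitions.
CMI_KINDS = {"ENGINE", "SCHEMA", "MODEL", "RULESET", "CONNECTOR",
             "UI", "JOB", "LIB", "TEST"}

# Assumption: allowed characters per segment are inferred from the examples.
CMI_PATTERN = re.compile(
    r"(?P<module_code>[a-z]+)\."
    r"(?P<component_id>[A-Z0-9]+(?:-[A-Z0-9]+)*)\."
    r"(?P<kind>[A-Z]+)\."
    r"(?P<name>[A-Z0-9]+(?:_[A-Z0-9]+)*)\."
    r"(?P<version>\d+_\d+_\d+)"
)


def parse_cmi(text: str) -> dict:
    """Split a CMI string into segments; version becomes a 3-tuple of ints."""
    m = CMI_PATTERN.fullmatch(text)
    if m is None:
        raise ValueError(f"not a well-formed CMI: {text!r}")
    if m["kind"] not in CMI_KINDS:
        raise ValueError(f"unknown CMI kind: {m['kind']!r}")
    parts = m.groupdict()
    # Works for both 1_1_0 and 2025_10_01; tuples compare in version order.
    parts["version"] = tuple(int(p) for p in m["version"].split("_"))
    return parts
```

Since the version is a tuple, two CMIs of the same artifact can be ordered with ordinary comparison, for example `(1, 1, 0) > (1, 0, 0)`.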

ZAR Code (Short Identifier)

Each CMI is assigned a ZAR Code:

  • Base32 short code (4–6 characters)
  • Derived deterministically (e.g. from SHA-256)

Example:

TG3K7, MIE12, DSA9Q

Purpose:

  • Efficient lineage tracking
  • Compact routing (ZSSR)
  • Provenance chain representation
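
One way to derive such a short code deterministically is to hash the full CMI and keep a Base32 prefix. The sketch below is an assumption about the derivation, not the registry's actual algorithm: it uses the RFC 4648 Base32 alphabet from Python's standard library, hashes the UTF-8 CMI string, and truncates to a default length of 5. The function name `zar_code` is hypothetical.

```python
import base64
import hashlib


def zar_code(cmi_name: str, length: int = 5) -> str:
    """Derive a short Base32 code from the SHA-256 digest of a CMI string."""
    if not 4 <= length <= 6:
        raise ValueError("ZAR codes are 4 to 6 characters long")
    digest = hashlib.sha256(cmi_name.encode("utf-8")).digest()
    # b32encode yields uppercase A-Z and 2-7; the prefix is free of padding.
    return base64.b32encode(digest).decode("ascii")[:length]
```

A 5-character Base32 code carries only 25 bits, so distinct CMIs can collide. A real registry would need a collision rule (for example, checking uniqueness at registration time), which the text above does not specify.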

4. Universal Signal Ontology (USO) Identifier

The USO ID represents a specific runtime instance of a signal.

Purpose

  • Provide instance-level traceability
  • Enable full replay of ESG data flows
  • Anchor provenance chains

Characteristics

  • Generated at runtime (ULID / UUIDv7)
  • Globally unique
  • Immutable
  • Linked to:
    • CSI (signal type)
    • CMI (producing artifact)
    • origin_chain (processing history)

Example

uso_id = 01JBF0W8S9Q0R1S2T3U4V5W6X
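
For illustration, a ULID-style identifier can be generated with the standard library alone. This is a sketch under the assumption that USO IDs follow the public ULID layout (48-bit millisecond timestamp, 80 random bits, Crockford Base32, 26 characters); `new_ulid` is a hypothetical helper, and a production system would more likely use a maintained ULID or UUIDv7 library.

```python
import os
import time
from typing import Optional

# Crockford Base32 alphabet (no I, L, O, U).
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def new_ulid(ts_ms: Optional[int] = None) -> str:
    """Return a 26-character ULID: 48-bit timestamp followed by 80 random bits."""
    if ts_ms is None:
        ts_ms = int(time.time() * 1000)
    if not 0 <= ts_ms < 2 ** 48:
        raise ValueError("timestamp does not fit in 48 bits")
    value = (ts_ms << 80) | int.from_bytes(os.urandom(10), "big")
    chars = []
    for _ in range(26):
        # Emit 5 bits at a time, least significant group first.
        chars.append(CROCKFORD[value & 31])
        value >>= 5
    return "".join(reversed(chars))
```

Because the timestamp occupies the most significant bits, identifiers created in different milliseconds sort lexicographically by creation time, which is convenient for lineage tables ordered by `born_at`.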


Core Fields

| Field | Description |
|---|---|
| uso_id | Unique instance identifier |
| csi | Signal type (from SSSR) |
| primary_origin_cmi | Producing artifact (from ZAR) |
| origin_chain | Ordered list of CMIs |
| origin_chain_codes | ZAR short codes |
| born_at | Timestamp |
| context.eco_numbers | Optional entity references |

5. Identifier Relationships

The three identifiers work together as follows:

USO (instance)
  → CSI (what it is)
  → CMI (who produced it)

Illustrative Example

| Layer | Identifier | Example | Description |
|---|---|---|---|
| Artifact (who) | CMI | comp.PEF-ME.ENGINE.CORE.1_1_0 | Engine producing the signal |
| Signal Type (what) | CSI | comp.PEF-ME.OUTPUT.CO2E.v1_0 | Canonical signal definition |
| Instance (which) | USO ID | 01JBF0W8S9Q0R1S2T3U4V5W6X | Unique lineage instance |
| Routing | ZAR Code | MIE12 | Compact artifact reference |
| Provenance | origin_chain_codes | [MIE12, TG3K7] | Processing trail |

6. Canonical Signal Creation Flow

When a signal is generated, the ZAYAZ platform performs the following sequence:

  1. A module completes a computation or ingestion process
  2. A new USO ID is created
  3. The producing component’s CMI is assigned (from ZAR)
  4. The corresponding CSI is assigned (from SSSR)
  5. The origin_chain is initialized with the producing CMI

Result

Every signal instance is fully described by:

  • USO ID → instance identity
  • CSI → semantic meaning
  • CMI → execution provenance
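
The creation sequence can be sketched as a single function. Everything here is illustrative: `SignalInstance` and `create_signal` are hypothetical names, the registries are stand-in sets rather than real SSSR/ZAR lookups, and `uuid.uuid4()` is used only as a placeholder for the ULID / UUIDv7 generator the platform would use.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Tuple


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class SignalInstance:
    uso_id: str
    csi: str
    primary_origin_cmi: str
    origin_chain: Tuple[str, ...]
    born_at: datetime


def create_signal(csi: str, producing_cmi: str,
                  known_csis: set, known_cmis: set) -> SignalInstance:
    """Steps 2-5: mint a USO ID, bind the pre-registered CMI and CSI, seed origin_chain."""
    if producing_cmi not in known_cmis:   # CMI must already exist in ZAR
        raise LookupError(f"unregistered CMI: {producing_cmi}")
    if csi not in known_csis:             # CSI must already exist in SSSR
        raise LookupError(f"unregistered CSI: {csi}")
    return SignalInstance(
        uso_id=uuid.uuid4().hex,          # placeholder for ULID / UUIDv7
        csi=csi,
        primary_origin_cmi=producing_cmi,
        origin_chain=(producing_cmi,),
        born_at=datetime.now(timezone.utc),
    )
```

Only the USO ID and timestamp are produced at call time. The CSI and CMI are looked up, never created, which mirrors the key principle from section 1.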

7. Design Principles

  1. Immutability of Identifiers
    • CSI, CMI, and USO IDs are never altered once issued
  2. Separation of Concerns
    • CSI → semantics (SSSR)
    • CMI → execution (ZAR)
    • USO → runtime lineage
  3. Deterministic Traceability
    • Every data point can be traced through its full processing chain
  4. Human + Machine Readability
    • Identifiers are structured for both audit interpretation and system parsing
  5. Documentation-Linked Identity
    • COMPONENT_ID aligns with frontmatter IDs, enabling direct traceability to system specifications
  6. Compliance by Design
    • Supports:
      • CSRD audit trails
      • ESRS data quality requirements
      • ISO 14064 traceability

8. Strategic Outcome - CIA

The Canonical Identifier Architecture transforms ZAYAZ into a system where:

Every ESG data point is uniquely identifiable, semantically defined, and fully traceable across its entire lifecycle.

This enables:

  • deterministic ESG reporting
  • verifiable supply chain transparency
  • AI-assisted explainability
  • regulatory-grade auditability

9. ZAR Data Model & Provenance Architecture

9.1. Overview

The ZAYAZ Artifact Registry (ZAR) Data Model defines how artifacts, signal instances, and their relationships are stored, linked, and governed across the platform.

It provides the structural foundation for:

  • artifact registration (CMI)
  • signal typing (CSI via SSSR)
  • runtime lineage (USO)
  • routing validation and policy enforcement (ZSSR)

The model is designed to ensure:

  • full traceability
  • deterministic replay
  • separation of design-time and runtime concerns
  • audit-grade data integrity

9.2. Architectural Separation

The ZAR data model is built on three distinct layers:


Layer Responsibilities

| Layer | Responsibility |
|---|---|
| Design-Time | Defines signals (CSI) and artifacts (CMI) |
| Runtime | Tracks signal instances and lineage |
| Policy | Governs allowed processing paths |

9.3. Design-Time: Artifact Registry (ZAR)

The ZAR registry maintains a catalog of all executable and structural artifacts in the system.

9.3.1. zar_cmi_registry

Stores all registered artifacts.

| Field | Type | Description |
|---|---|---|
| cmi_id | PK | Internal identifier |
| cmi_name | text | Full CMI (<module_code>.<COMPONENT_ID>.<CMI_KIND>.<ARTIFACT_NAME>.<MAJOR>_<MINOR>_<PATCH>) |
| module_code | text | e.g. comp, vera, inpt |
| component_id | text | Frontmatter ID (e.g. PEF-ME) |
| kind | text | ENGINE, SCHEMA, MODEL, RULESET, etc. |
| name | text | Artifact name |
| version | text | 1_0_0 or date-based |
| zar_code | text | Short Base32 identifier |
| description | text | Human-readable description |
| owner_team | text | Responsible team |
| runtime_class | text | Execution reference (container/class) |
| status | text | active / deprecated |

9.3.2. zar_cmi_alias

Optional alias mapping for human or legacy references.

| Field | Description |
|---|---|
| alias | Short alternative name |
| cmi_name | FK to registry |
| scope | human / internal / legacy |

9.3.3. zar_cmi_capabilities

Defines capabilities of each artifact.

| Field | Description |
|---|---|
| cmi_name | FK to registry |
| capability | e.g. ocr, validation, merkle_verify |
| value | JSON configuration |


9.4. Runtime Layer: Signal Provenance

Runtime data is modeled at the signal instance level, ensuring that each occurrence of a signal can follow a unique processing path.


9.4.1. signal_instances

Represents each individual signal occurrence (USO layer).

| Field | Type | Description |
|---|---|---|
| signal_instance_id | PK | UUID / ULID |
| csi | text | Canonical Signal Identifier |
| sssr_id | text | Optional SSSR reference |
| uso_name | text | Semantic class |
| born_at | timestamptz | Creation timestamp |
| source_ref | jsonb | External reference (file, tx_id, etc.) |

9.4.2. signal_lineage_events

Tracks all processing steps applied to a signal instance.

| Field | Type | Description |
|---|---|---|
| event_id | PK | Unique event ID |
| signal_instance_id | FK | Linked signal instance |
| cmi_name | FK | Processing artifact |
| occurred_at | timestamptz | Timestamp |
| input_refs | jsonb | Upstream references |
| output_refs | jsonb | Derived outputs |
| params | jsonb | Execution parameters |
| metrics | jsonb | Runtime metrics |
| trust_context_id | text | Validation context |

9.4.3. trust_context_snapshots

Stores validation and scoring context.

| Field | Type |
|---|---|
| trust_context_id | PK |
| effective_weights | JSON |
| thresholds | JSON |
| modifiers | JSON |
| version_tag | text |
| captured_at | timestamp |


9.5. Policy Layer: Routing & Governance

The policy layer defines which artifacts are allowed to process specific signal types.


9.5.1. zar_signal_policy

Defines allowed or restricted processing paths.

| Field | Description |
|---|---|
| uso_name | Semantic class |
| csi_prefix | <module_code>.<COMPONENT_ID>.<KIND> |
| allowed_cmi | Array of permitted artifacts |
| deny_cmi | Array of restricted artifacts |
| notes | Policy explanation |
| version | Policy version |
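
A policy check over these fields might look like the sketch below. The evaluation semantics are assumptions, since the table does not define them: a deny entry always wins, a missing allow-list (`None`) places no restriction, and a signal with no matching policy is permitted. `SignalPolicy` and `is_permitted` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Sequence


@dataclass(frozen=True)
class SignalPolicy:
    csi_prefix: str                              # <module_code>.<COMPONENT_ID>.<KIND>
    allowed_cmi: Optional[Sequence[str]] = None  # None = no allow-list defined
    deny_cmi: Sequence[str] = ()


def is_permitted(csi: str, cmi_name: str, policies: Iterable[SignalPolicy]) -> bool:
    """Return True if the artifact may process signals of this CSI."""
    for policy in policies:
        # Match on whole segments so "comp.PEF" does not match "comp.PEF-ME...".
        if not csi.startswith(policy.csi_prefix + "."):
            continue
        if cmi_name in policy.deny_cmi:          # assumption: deny overrides allow
            return False
        if policy.allowed_cmi is not None and cmi_name not in policy.allowed_cmi:
            return False
    return True                                  # assumption: default is permit
```

A stricter deployment might prefer default-deny when no policy matches; that only requires tracking whether any policy matched before returning.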


9.6. End-to-End Data Flow

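The complete data model supports a deterministic lifecycle: signal types (CSI) and artifacts (CMI) are registered at design time, signal instances and lineage events are recorded at runtime, and processing paths are checked against policy.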


9.7. Key Properties of the Model

  1. Separation of Concerns
    • SSSR → semantics (CSI)
    • ZAR → artifacts (CMI)
    • Runtime → lineage (USO)
  2. Instance-Level Lineage
    • Each signal instance has its own processing history.
  3. Many-to-Many Relationships
    • Signals can pass through multiple artifacts
    • Artifacts can process multiple signals
  4. Replayability
    • Full reconstruction of any signal path is possible.
  5. Policy vs History Separation
    • Policies define allowed behavior
    • Runtime stores actual behavior

9.8. Example Query Patterns

Lineage Trace

lineage-trace.sql
-- Ordered processing history for the signal instance(s) tied to one transaction
SELECT e.occurred_at, e.cmi_name, e.params
FROM signal_instances i
JOIN signal_lineage_events e
  ON e.signal_instance_id = i.signal_instance_id
WHERE i.source_ref->>'tx_id' = :tx_id
ORDER BY e.occurred_at;
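
To try the lineage-trace pattern without a PostgreSQL instance, the sketch below runs a simplified variant against in-memory SQLite. It is an assumption-laden stand-in: the tables carry only a subset of the documented columns, timestamps are stored as ISO-8601 text, the filter is on `signal_instance_id` rather than the JSON `source_ref` field, and the sample rows (`sig-1`, `emissions`) are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE signal_instances (
    signal_instance_id TEXT PRIMARY KEY,
    csi                TEXT NOT NULL,
    uso_name           TEXT,
    born_at            TEXT NOT NULL
);
CREATE TABLE signal_lineage_events (
    event_id           INTEGER PRIMARY KEY,
    signal_instance_id TEXT NOT NULL REFERENCES signal_instances(signal_instance_id),
    cmi_name           TEXT NOT NULL,
    occurred_at        TEXT NOT NULL
);
""")
conn.execute(
    "INSERT INTO signal_instances VALUES (?, ?, ?, ?)",
    ("sig-1", "comp.PEF-ME.OUTPUT.CO2E.v1_0", "emissions", "2025-01-01T00:00:00Z"),
)
# Events are inserted out of order on purpose; the query restores the order.
conn.executemany(
    "INSERT INTO signal_lineage_events (signal_instance_id, cmi_name, occurred_at) "
    "VALUES (?, ?, ?)",
    [
        ("sig-1", "vera.TG-CORE.ENGINE.VALIDATOR.1_0_0", "2025-01-01T00:00:05Z"),
        ("sig-1", "comp.PEF-ME.ENGINE.CORE.1_1_0", "2025-01-01T00:00:01Z"),
    ],
)


def lineage_trace(db: sqlite3.Connection, signal_instance_id: str) -> list:
    """Return the artifacts that touched one instance, oldest event first."""
    rows = db.execute(
        """
        SELECT e.cmi_name
        FROM signal_instances i
        JOIN signal_lineage_events e
          ON e.signal_instance_id = i.signal_instance_id
        WHERE i.signal_instance_id = :sid
        ORDER BY e.occurred_at
        """,
        {"sid": signal_instance_id},
    )
    return [row[0] for row in rows]
```

Calling `lineage_trace(conn, "sig-1")` returns the producing engine followed by the validator, which is the ordered processing trail that `origin_chain` is meant to capture.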

Signal Discovery

signal-discovery.sql
-- All signal instances that belong to one semantic class
SELECT signal_instance_id, csi, born_at
FROM signal_instances
WHERE uso_name = :uso_name;

Policy Violation Detection

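policy-violation-detection.sql
-- Artifacts that processed a signal class outside its policy
SELECT DISTINCT e.cmi_name
FROM signal_instances i
JOIN signal_lineage_events e
  ON e.signal_instance_id = i.signal_instance_id
LEFT JOIN zar_signal_policy p
  ON p.uso_name = i.uso_name
WHERE i.uso_name = :uso
  AND (
    -- artifact is missing from a defined allow-list
    (p.allowed_cmi IS NOT NULL AND NOT (e.cmi_name = ANY(p.allowed_cmi)))
    -- or artifact is explicitly denied
    OR e.cmi_name = ANY(p.deny_cmi)
  );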

9.9. Strategic Outcome - Data Model

This data model enables ZAYAZ to operate as a fully traceable ESG computation infrastructure, where:

  • every signal instance is uniquely identifiable
  • every transformation is recorded
  • every artifact is governed
  • every decision is auditable

ZAR + SSSR + USO together form a deterministic ESG lineage system capable of supporting regulatory-grade assurance and AI-driven explainability at scale.


10. Graph Model & Knowledge Layer (Derived from ZAR)

While the relational ZAR data model provides the authoritative source of truth for artifacts, signal instances, lineage events, and policies, many of the most valuable ZAYAZ use cases require graph-native traversal rather than table-by-table querying.

Examples include:

  • tracing all upstream dependencies of a disclosure
  • finding which artifacts influenced a reported KPI
  • identifying all signals touched by a deprecated component
  • enabling ZARA to explain how a value was derived
  • federating lineage across multiple E-C-O entities and assurance domains

For these purposes, ZAYAZ should maintain a derived graph model on top of the relational core.

The relational model remains the canonical persistence layer.
The graph model acts as the traversal, reasoning, and explainability layer.


10.1. Why a Graph Layer Is Needed

Relational tables are ideal for:

  • integrity
  • transactional writes
  • schema governance
  • audit logging
  • deterministic persistence

Graphs are ideal for:

  • multi-hop lineage traversal
  • dependency analysis
  • impact analysis
  • explanation generation
  • policy path validation
  • cross-entity federation

A graph layer therefore gives ZAYAZ the ability to move from simple recordkeeping to navigable ESG intelligence.


10.2. Graph Design Principle

The graph model must follow one strict rule:

No graph node or edge may exist without a corresponding canonical source in the relational model.

This means:

  • CSI nodes derive from SSSR
  • CMI nodes derive from ZAR
  • signal instance nodes derive from signal_instances
  • lineage edges derive from signal_lineage_events
  • policy edges derive from zar_signal_policy
  • entity context derives from E-C-O references and runtime context

The graph is therefore:

  • derived
  • rebuildable
  • verifiable
  • non-authoritative for writes

10.3. Core Graph Entity Types

The graph model should contain the following primary node types.

A) SignalType

Represents a canonical semantic signal definition.

Derived from: SSSR
Primary key: csi

Examples:

  • comp.PEF-ME.OUTPUT.CO2E.v1_0
  • vera.TG-CORE.OUTPUT.TRUST_SCORE.v1_0

B) Artifact

Represents a registered executable or structural artifact.

Derived from: zar_cmi_registry
Primary key: cmi_name

Examples:

  • comp.PEF-ME.ENGINE.CORE.1_0_0
  • vera.TG-CORE.ENGINE.VALIDATOR.1_0_0

C) SignalInstance

Represents one runtime occurrence of a signal.

Derived from: signal_instances
Primary key: signal_instance_id / uso_id


D) LineageEvent

Represents a processing event in which an artifact touched or transformed a signal instance.

Derived from: signal_lineage_events
Primary key: event_id


E) Policy

Represents an allowed or denied routing/processing rule.

Derived from: zar_signal_policy


F) Entity

Represents a company, supplier, verifier, authority, or other context-bearing actor.

Derived from: E-C-O registry / runtime context
Primary key: eco_number


G) DocumentedComponent

Represents the manual-defined component referenced via frontmatter ID.

Derived from: component documentation / MDX frontmatter
Primary key: component_id

This node type is especially important because it connects the runtime system to the written specification and enables ZARA to explain components directly from the manual.


10.4. Core Relationship Types

The graph should model the following relationships.

| Relationship | From | To | Meaning |
|---|---|---|---|
| INSTANCE_OF | SignalInstance | SignalType | This runtime instance is of this signal type |
| PRODUCED_BY | SignalInstance | Artifact | This signal instance was first produced by this artifact |
| TOUCHED_BY | SignalInstance | Artifact | This artifact processed or evaluated this instance |
| RECORDED_IN | SignalInstance | LineageEvent | This event belongs to this instance |
| USED_ARTIFACT | LineageEvent | Artifact | This event executed this artifact |
| INPUT_TO | SignalInstance | LineageEvent | This signal instance was an input to this event |
| OUTPUT_FROM | SignalInstance | LineageEvent | This signal instance was produced from this event |
| ALLOWED_FOR | Policy | Artifact | This policy permits the artifact |
| DENIED_FOR | Policy | Artifact | This policy denies the artifact |
| APPLIES_TO | Policy | SignalType | This policy applies to this signal type or prefix |
| CONTEXT_OF | SignalInstance | Entity | This signal belongs to this E-C-O context |
| DOCUMENTED_AS | Artifact | DocumentedComponent | This artifact corresponds to this documented component |
| DEFINES | DocumentedComponent | SignalType | This documented component defines or emits this signal type |
| DEPENDS_ON | Artifact | Artifact | This artifact depends on another artifact |

10.5. Conceptual Graph Structure


10.6. Graph Projection from the Relational Model

The graph should be materialized through a controlled projection pipeline.

Source tables

| Relational Source | Graph Output |
|---|---|
| sssr_signals / signal registry | SignalType nodes |
| zar_cmi_registry | Artifact nodes |
| signal_instances | SignalInstance nodes |
| signal_lineage_events | LineageEvent nodes and traversal edges |
| zar_signal_policy | Policy nodes and policy edges |
| component frontmatter / docs index | DocumentedComponent nodes |
| E-C-O registry / runtime context | Entity nodes |

Projection rules

  1. Every registered CSI becomes one SignalType node.
  2. Every registered CMI becomes one Artifact node.
  3. Every runtime signal instance becomes one SignalInstance node.
  4. Every lineage event becomes one LineageEvent node.
  5. Every SignalInstance is connected to exactly one SignalType.
  6. Every producing artifact is linked using PRODUCED_BY.
  7. Every processing event creates USED_ARTIFACT, INPUT_TO, and OUTPUT_FROM edges where applicable.
  8. Every documented component becomes a DocumentedComponent node keyed by frontmatter ID.
  9. Every artifact with a matching component_id is connected to the documented component via DOCUMENTED_AS.
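
The projection rules above can be sketched as a pure function from relational-style rows to node and edge sets. This is a simplified illustration, not the projection pipeline itself: rows are plain dicts, `project` and `Edge` are hypothetical names, the producing artifact is read from a `primary_origin_cmi` key (taken from the USO core fields), and the INPUT_TO / OUTPUT_FROM edges of rule 7 are omitted for brevity.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    rel: str
    src: tuple
    dst: tuple


def project(csis, artifacts, component_ids, instances, events):
    """Derive graph nodes and edges from canonical rows (rules 1-9, simplified)."""
    component_ids = set(component_ids)
    nodes, edges = set(), set()
    nodes.update(("SignalType", c) for c in csis)                    # rule 1
    nodes.update(("DocumentedComponent", c) for c in component_ids)  # rule 8
    for a in artifacts:                                              # rule 2
        art = ("Artifact", a["cmi_name"])
        nodes.add(art)
        if a["component_id"] in component_ids:                       # rule 9
            doc = ("DocumentedComponent", a["component_id"])
            edges.add(Edge("DOCUMENTED_AS", art, doc))
    for i in instances:                                              # rule 3
        inst = ("SignalInstance", i["signal_instance_id"])
        sig_type = ("SignalType", i["csi"])
        if sig_type not in nodes:                                    # rule 5
            raise ValueError(f"instance references unregistered CSI: {i['csi']}")
        nodes.add(inst)
        edges.add(Edge("INSTANCE_OF", inst, sig_type))
        edges.add(Edge("PRODUCED_BY", inst, ("Artifact", i["primary_origin_cmi"])))  # rule 6
    for e in events:                                                 # rule 4
        ev = ("LineageEvent", e["event_id"])
        nodes.add(ev)
        edges.add(Edge("RECORDED_IN", ("SignalInstance", e["signal_instance_id"]), ev))
        edges.add(Edge("USED_ARTIFACT", ev, ("Artifact", e["cmi_name"])))  # rule 7 (partial)
    return nodes, edges
```

Because the function only reads its inputs and builds new sets, running it again over the same canonical rows yields the same graph, which is the rebuildability property required in 10.2.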

10.7. Why the Documentation Node Matters

The DocumentedComponent node is a strategic advantage for ZAYAZ.

It allows the platform to connect:

  • runtime identity
  • artifact identity
  • semantic signal identity
  • architecture documentation
  • AI explanation context

This means ZARA can answer questions such as:

  • “What does PEF-ME do?”
  • “Why did this artifact generate CO2E?”
  • “Which documented module is responsible for this trust score?”
  • “What policies or assumptions are defined for this component?”

Without this link, documentation remains passive. With this link, the manual becomes an active reasoning surface.


10.8. Example Graph Traversal Use Cases

A) Explain a reported disclosure

Start from a reported KPI and traverse:

SignalType ← SignalInstance ← LineageEvent ← Artifact ← DocumentedComponent

This supports board-ready and verifier-ready explanation flows.


B) Impact analysis for artifact deprecation

Start from a deprecated Artifact node and traverse to:

  • all SignalType nodes it emits or transforms
  • all SignalInstance nodes affected historically
  • all Policy nodes that reference it
  • all disclosures downstream

This supports safe migration and governance.
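
The impact analysis in use case B can be sketched over a plain edge list. This is an illustrative traversal, not a graph-database query: edges are `(relationship, from, to)` tuples using the relationship names from 10.4, `impacted_by` is a hypothetical helper, and policy and disclosure lookups are left out.

```python
def impacted_by(artifact: str, edges) -> tuple:
    """Return (signal instances, signal types) historically affected by an artifact."""
    edges = list(edges)
    # Events that executed the artifact.
    events = {src for rel, src, dst in edges
              if rel == "USED_ARTIFACT" and dst == artifact}
    # Instances recorded in those events, plus instances the artifact produced.
    instances = {src for rel, src, dst in edges
                 if rel == "RECORDED_IN" and dst in events}
    instances |= {src for rel, src, dst in edges
                  if rel == "PRODUCED_BY" and dst == artifact}
    # Signal types of every affected instance.
    types = {dst for rel, src, dst in edges
             if rel == "INSTANCE_OF" and src in instances}
    return instances, types
```

For a deprecated validator, the result lists every instance it evaluated and the signal types those instances belong to, which is the starting set for a migration review.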


C) Federated assurance traversal

Start from an Entity node and traverse across:

  • all signal instances in context
  • all artifacts that processed them
  • all trust and assurance events
  • all linked verifier outputs

This becomes a foundation for cross-ECO assurance and federation. 


10.9. Suggested Graph Properties

SignalType

  • csi
  • module_code
  • component_id
  • kind
  • name
  • version
  • status
  • value_type
  • unit
  • framework_tags

Artifact

  • cmi_name
  • module_code
  • component_id
  • kind
  • name
  • version
  • zar_code
  • status
  • owner_team
  • runtime_class

SignalInstance

  • signal_instance_id
  • uso_id
  • csi
  • born_at
  • source_ref
  • trust_context_id

LineageEvent

  • event_id
  • occurred_at
  • params
  • metrics
  • trust_context_id

Policy

  • policy_key
  • uso_name
  • csi_prefix
  • version
  • notes

Entity

  • eco_number
  • entity_type
  • jurisdiction
  • status

DocumentedComponent

  • component_id
  • module_code
  • title
  • doc_path
  • status
  • owner_team

ZAYAZ should treat the graph as a derived operational knowledge layer, not as a replacement for the relational core.

Recommended pattern:

  1. Write canonical records to relational stores first.
  2. Publish projection events when CSI, CMI, signal instance, lineage event, or policy records change.
  3. Update the graph projection asynchronously.
  4. Rebuild the graph from canonical stores when required.

This gives us:

  • integrity from the relational layer
  • speed and flexibility from the graph layer
  • safe rebuildability
  • deterministic explainability
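
The write-first, project-later pattern can be sketched with an in-process outbox. All names here (`CanonicalStore`, `GraphProjection`, `apply_pending`, `rebuild`) are hypothetical, and the dict and deque stand in for the relational store and the projection event stream.

```python
from collections import deque


class CanonicalStore:
    """Stand-in for the relational layer: rows plus an outbox of change events."""

    def __init__(self):
        self.rows = {}
        self.outbox = deque()

    def write(self, key, row):
        self.rows[key] = dict(row)   # step 1: canonical write happens first
        self.outbox.append(key)      # step 2: publish a projection event


class GraphProjection:
    """Stand-in for the derived graph: never written to directly by producers."""

    def __init__(self):
        self.nodes = {}

    def apply_pending(self, store):
        # Step 3: drain projection events (asynchronously, in a real system).
        while store.outbox:
            key = store.outbox.popleft()
            self.nodes[key] = dict(store.rows[key])

    def rebuild(self, store):
        # Step 4: discard the projection and derive it again from canonical rows.
        self.nodes = {key: dict(row) for key, row in store.rows.items()}
```

Between a write and the next `apply_pending` call the graph lags behind the store, which is acceptable precisely because the graph is non-authoritative and can always be rebuilt.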

10.11. Governance Rules for the Graph Layer

The graph layer must obey the following rules:

  1. Derived Only: No manual graph-only nodes or edges.

  2. Source Traceability: Every node and edge must store source references to relational origin.

  3. Rebuildability: The graph must be reconstructible from canonical tables.

  4. Version Awareness: CSI and CMI versions must be preserved as explicit properties.

  5. Tenant / Entity Isolation: Cross-entity traversals must obey RBAC, verifier permissions, and federation policies.

  6. AI Explainability Compatibility: Nodes and edges should remain interpretable by ZARA and ZAAM.


10.12. Strategic Outcome

By adding a graph layer on top of the ZAR relational model, ZAYAZ gains a new capability:

Not only storing ESG lineage, but navigating, explaining, and reasoning across it.

This turns ZAR from a registry into a live assurance intelligence fabric.

The relational model guarantees integrity. The graph model unlocks traversal. Together, they form the foundation for:

  • explainable ESG intelligence
  • cross-module dependency mapping
  • advanced traceability
  • federated assurance networks
  • documentation-aware AI guidance


