CIA

Canonical Identifier Architecture

The Canonical Identifier Architecture (CIA) defines the unified identity system used across the ZAYAZ platform to ensure traceability, interoperability, and auditability of all ESG data, computations, and artifacts.

It establishes a deterministic and immutable naming framework that connects:

What a signal represents (semantic identity)
Which component produced or processed it (artifact identity)
Which specific instance is being observed (runtime identity)

Together, these identifiers form the ZAYAZ Identifier Trinity, enabling full lifecycle traceability from raw input to verified disclosure.

1. The Identifier Trinity

ZAYAZ distinguishes between three distinct but interlinked identity layers:

Layer	Identifier	Registry	Purpose
Instance	USO ID	USO (runtime)	Identifies a unique occurrence of a signal
Type	CSI (Canonical Signal Identifier)	SSSR	Defines the semantic type of the signal
Artifact	CMI (Canonical Managed Identifier)	ZAR	Identifies the component (engine/module) that produced or processed it

Key Principle

Only the USO ID is created at runtime. CSI and CMI are pre-defined, immutable identifiers reused across all instances.

This separation ensures:

deterministic lineage
replayable audit trails
strict separation of data, semantics, and execution

2. Canonical Signal Identifier (CSI)

The CSI defines the semantic identity of a signal. It is stored and governed within the SSSR (Smart Searchable Signal Registry) and assigned at the column or schema level.

Purpose

Define what the signal is
Provide stable semantic meaning across the platform
Enable discoverability, mapping, and regulatory alignment

NOTE: module_code is the primary routing namespace

This enables:

ZSSR routing
scaling across clusters
federation (EGFS)
permission scoping
billing segmentation

Example:

comp.* → Computation cluster
vera.* → Verification pipelines
netz.* → Climate modeling services
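
The prefix-based routing in the example above amounts to a lookup on the first identifier segment. The sketch below is only an illustration of that idea, not the ZSSR implementation: `ROUTES` and `route_for` are hypothetical names, and the targets are simply the three descriptions from the example.

```python
# Hypothetical routing table: module_code -> target, copied from the example above.
ROUTES = {
    "comp": "Computation cluster",
    "vera": "Verification pipelines",
    "netz": "Climate modeling services",
}


def route_for(identifier: str) -> str:
    """Return the routing target for an identifier based on its module_code."""
    # The module_code is everything before the first dot.
    module_code = identifier.split(".", 1)[0]
    if module_code not in ROUTES:
        raise ValueError(f"no route registered for module_code {module_code!r}")
    return ROUTES[module_code]


print(route_for("comp.PEF-ME.OUTPUT.CO2E.v1_0"))  # prints "Computation cluster"
```

Because the decision depends only on the leading segment, the same lookup could scope permissions or billing per module without parsing the rest of the identifier.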

CSI Format Specification

<module_code>.<COMPONENT_ID>.<KIND>.<NAME>.v<MAJOR>_<MINOR>

Segment Definitions

Segment	Description
module_code	Top-level ZAYAZ module (e.g. comp, vera, inpt, netz, risk)
COMPONENT_ID	Unique frontmatter ID of the module/component (e.g. PEF-ME, ZAR-FW, TG-CORE)
KIND	Role of the signal (INPUT, OUTPUT, SIGNAL, METRIC, FEATURE, SCHEMA, CONFIG, EVENT, VIEW)
NAME	Canonical semantic identifier (uppercase, underscore-separated)
VERSION	Semantic version of the signal definition (major_minor), prefixed with "v"

Examples

comp.PEF-ME.OUTPUT.CO2E.v1_0
vera.TG-CORE.OUTPUT.TRUST_SCORE.v1_0
inpt.FOGE-FORM.INPUT.WATER_USE.v1_0
netz.DECARB-MODEL.METRIC.ABATEMENT_COST.v2_0
risk.RIF-ENGINE.EVENT.RISK_ALERT.v1_1
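
The format specification and examples above can be checked mechanically. The sketch below is one possible reading of the format, not a reference implementation: the character classes in `CSI_PATTERN` (lowercase module code, hyphen-separated uppercase component ID, underscore-separated uppercase name) are inferred from the examples, and `CSI` and `parse_csi` are hypothetical names.

```python
import re
from typing import NamedTuple

# KIND values listed in the segment table above.
CSI_KINDS = {"INPUT", "OUTPUT", "SIGNAL", "METRIC", "FEATURE",
             "SCHEMA", "CONFIG", "EVENT", "VIEW"}

# Assumed character classes, inferred from the documented examples.
CSI_PATTERN = re.compile(
    r"(?P<module_code>[a-z][a-z0-9]*)"
    r"\.(?P<component_id>[A-Z0-9]+(?:-[A-Z0-9]+)*)"
    r"\.(?P<kind>[A-Z]+)"
    r"\.(?P<name>[A-Z0-9]+(?:_[A-Z0-9]+)*)"
    r"\.v(?P<major>\d+)_(?P<minor>\d+)"
)


class CSI(NamedTuple):
    module_code: str
    component_id: str
    kind: str
    name: str
    major: int
    minor: int


def parse_csi(text: str) -> CSI:
    """Split a CSI string into its five segments or raise ValueError."""
    match = CSI_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a well-formed CSI: {text!r}")
    if match["kind"] not in CSI_KINDS:
        raise ValueError(f"unknown CSI kind: {match['kind']!r}")
    return CSI(match["module_code"], match["component_id"], match["kind"],
               match["name"], int(match["major"]), int(match["minor"]))


print(parse_csi("netz.DECARB-MODEL.METRIC.ABATEMENT_COST.v2_0"))
```

Using `fullmatch` rather than `match` rejects trailing characters, which matters when identifiers are read from untrusted input.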

Design Rules

CSI is immutable once published
Semantic changes require a major version increment
Minor metadata changes increment minor version only
COMPONENT_ID must match the frontmatter ID in the ZAYAZ manual
All segments must be machine-parseable and globally unique in combination
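
Since a published CSI is immutable, a change never edits an identifier; it yields a new one with an incremented version. The sketch below shows that rule under one assumption the text does not state: a major increment resets the minor version to 0. `next_csi_version` is a hypothetical helper name.

```python
def next_csi_version(csi: str, semantic_change: bool) -> str:
    """Return the CSI for the next publication; the input CSI is left untouched."""
    # The version is the final segment, always prefixed with ".v".
    stem, version = csi.rsplit(".v", 1)
    major, minor = (int(part) for part in version.split("_"))
    if semantic_change:
        # Assumption: the minor version restarts at 0 after a major increment.
        major, minor = major + 1, 0
    else:
        minor += 1
    return f"{stem}.v{major}_{minor}"


print(next_csi_version("comp.PEF-ME.OUTPUT.CO2E.v1_0", semantic_change=True))
# prints "comp.PEF-ME.OUTPUT.CO2E.v2_0"
```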

Important Constraint

CSI does not have its own registry. It is defined and stored within the SSSR signal registry, where each signal field is bound to a specific CSI.

3. Canonical Managed Identifier (CMI)

The CMI defines the identity of a code artifact (engine, schema, model, ruleset, etc.) and is governed within the ZAR (Artifact Registry).

Purpose

Identify who produced or processed the signal
Enable reproducibility of computations
Support lineage tracking and auditability

CMI Format Specification

<module_code>.<COMPONENT_ID>.<CMI_KIND>.<ARTIFACT_NAME>.<MAJOR>_<MINOR>_<PATCH>

Segment Definitions

Segment	Description
module_code	Same as CSI
COMPONENT_ID	Same frontmatter ID as CSI (shared reference point)
CMI_KIND	Artifact category (ENGINE, SCHEMA, MODEL, RULESET, CONNECTOR, UI, JOB, LIB, TEST)
ARTIFACT_NAME	Artifact or sub-function name
VERSION	Semantic version (major_minor_patch) or date-based version (e.g. 2025_10_01)

Examples

comp.PEF-ME.ENGINE.CORE.1_1_0
vera.TG-CORE.ENGINE.VALIDATOR.1_0_0
siss.ROUTER.RULESET.INVOICE_LINES.2025_10_01
inpt.FOGE.UI.FORM_TRUST_REVIEW.1_2_0
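
A CMI can be parsed the same way as a CSI, with the extra step of telling semantic versions from date-based ones. The sketch below is an illustration under stated assumptions: it accepts exactly five dot-separated segments as in the format specification, the character classes are inferred from the examples, and the date heuristic (a four-digit first component that forms a valid calendar date) is a guess rather than a documented rule. `CMI` and `parse_cmi` are hypothetical names.

```python
import re
from datetime import date
from typing import NamedTuple, Tuple

# CMI_KIND values listed in the segment table above.
CMI_KINDS = {"ENGINE", "SCHEMA", "MODEL", "RULESET", "CONNECTOR",
             "UI", "JOB", "LIB", "TEST"}

# Assumed character classes, inferred from the documented examples.
CMI_PATTERN = re.compile(
    r"(?P<module_code>[a-z][a-z0-9]*)"
    r"\.(?P<component_id>[A-Z0-9]+(?:-[A-Z0-9]+)*)"
    r"\.(?P<kind>[A-Z]+)"
    r"\.(?P<name>[A-Z0-9]+(?:_[A-Z0-9]+)*)"
    r"\.(?P<version>\d+_\d+_\d+)"
)


class CMI(NamedTuple):
    module_code: str
    component_id: str
    kind: str
    name: str
    version: Tuple[int, int, int]
    date_based: bool


def _is_calendar_date(year: int, month: int, day: int) -> bool:
    try:
        date(year, month, day)
    except ValueError:
        return False
    return True


def parse_cmi(text: str) -> CMI:
    """Split a CMI string into its segments or raise ValueError."""
    match = CMI_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a well-formed CMI: {text!r}")
    if match["kind"] not in CMI_KINDS:
        raise ValueError(f"unknown CMI kind: {match['kind']!r}")
    raw = match["version"].split("_")
    version = (int(raw[0]), int(raw[1]), int(raw[2]))
    # Heuristic (assumption): YYYY_MM_DD if the first part has four digits
    # and the three parts form a real calendar date.
    date_based = len(raw[0]) == 4 and _is_calendar_date(*version)
    return CMI(match["module_code"], match["component_id"], match["kind"],
               match["name"], version, date_based)


print(parse_cmi("siss.ROUTER.RULESET.INVOICE_LINES.2025_10_01"))
```

A registry that needs certainty would store the versioning scheme explicitly instead of inferring it from the digits.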

ZAR Code (Short Identifier)

Each CMI is assigned a ZAR Code:

Base32 short code (4–6 characters)
Derived deterministically (e.g. from SHA-256)

Example:

TG3K7, MIE12, DSA9Q

Purpose:

Efficient lineage tracking
Compact routing (ZSSR)
Provenance chain representation
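
One way to derive a short code deterministically is to hash the full CMI and keep the first few Base32 characters. The sketch below assumes the RFC 4648 Base32 alphabet (A-Z, 2-7), a default length of 5, and truncation from the left; the text does not specify the Base32 variant or the truncation rule, and `zar_code` is a hypothetical function name.

```python
import base64
import hashlib


def zar_code(cmi: str, length: int = 5) -> str:
    """Derive a short Base32 code from the SHA-256 digest of a CMI string."""
    if not 4 <= length <= 6:
        raise ValueError("ZAR codes are 4 to 6 characters long")
    digest = hashlib.sha256(cmi.encode("utf-8")).digest()
    # b32encode uses the RFC 4648 alphabet; padding only appears at the end,
    # so the leading characters are always alphabet symbols.
    return base64.b32encode(digest).decode("ascii")[:length]


code = zar_code("comp.PEF-ME.ENGINE.CORE.1_1_0")
print(code, len(code))
```

A 5-character Base32 code carries only 25 bits, so two different CMIs can map to the same code. A registry using this approach would need a uniqueness check at registration time and a fallback such as a longer code.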

4. Universal Signal Ontology (USO) Identifier

The USO ID represents a specific runtime instance of a signal.

Purpose

Provide instance-level traceability
Enable full replay of ESG data flows
Anchor provenance chains

Characteristics

Generated at runtime (ULID / UUIDv7)
Globally unique
Immutable
Linked to:
- CSI (signal type)
- CMI (producing artifact)
- origin_chain (processing history)

Example (illustrative value)

uso_id = 01JBF0W8S9Q0R1S2T3U4V5W6X
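
For reference, a ULID is a 128-bit value (48-bit millisecond timestamp followed by 80 random bits) written as 26 Crockford Base32 characters. The sketch below generates one using only the standard library; it is a minimal illustration, `new_ulid` is a hypothetical name, and it does not implement the optional monotonic ordering for IDs created within the same millisecond.

```python
import os
import time
from typing import Optional

# Crockford Base32 alphabet: digits plus letters without I, L, O, U.
CROCKFORD32 = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def new_ulid(timestamp_ms: Optional[int] = None) -> str:
    """Return a 26-character ULID; the timestamp can be injected for testing."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    if not 0 <= timestamp_ms < 2**48:
        raise ValueError("timestamp must fit in 48 bits")
    randomness = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    value = (timestamp_ms << 80) | randomness
    # Emit 26 groups of 5 bits, least significant group first, then reverse.
    chars = []
    for _ in range(26):
        chars.append(CROCKFORD32[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


print(new_ulid())
```

Because the timestamp occupies the leading characters, IDs created in different milliseconds sort in creation order as plain strings, which is convenient for lineage queries ordered by time.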

Core Fields

Field	Description
uso_id	Unique instance identifier
csi	Signal type (from SSSR)
primary_origin_cmi	Producing artifact (from ZAR)
origin_chain	Ordered list of CMIs
origin_chain_codes	ZAR short codes
born_at	Timestamp
context.eco_numbers	Optional entity references

5. Identifier Relationships

The three identifiers work together as follows:

USO (instance)
   ↓
CSI (what it is)
   ↓
CMI (who produced it)

Illustrative Example

Layer	Identifier	Example	Description
Artifact (who)	CMI	comp.PEF-ME.ENGINE.CORE.1_1_0	Engine producing the signal
Signal Type (what)	CSI	comp.PEF-ME.OUTPUT.CO2E.v1_0	Canonical signal definition
Instance (which)	USO ID	01JBF0W8S9Q0R1S2T3U4V5W6X	Unique lineage instance
Routing	ZAR Code	MIE12	Compact artifact reference
Provenance	origin_chain_codes	[MIE12, TG3K7]	Processing trail

6. Canonical Signal Creation Flow

When a signal is generated, the ZAYAZ platform performs the following sequence:

A module completes a computation or ingestion process
A new USO ID is created
The producing component’s CMI is assigned (from ZAR)
The corresponding CSI is assigned (from SSSR)
The origin_chain is initialized with the producing CMI

Result

Every signal instance is fully described by:

USO ID → instance identity
CSI → semantic meaning
CMI → execution provenance
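
The creation sequence above can be sketched as a single function that looks up the pre-defined identifiers and mints only the instance ID. This is an illustration under assumptions: the SSSR and ZAR are stood in for by an in-memory set and dict, `SignalRecord` and `create_signal` are hypothetical names, and the default ID factory uses UUIDv4 purely as a placeholder where the platform would use a ULID or UUIDv7.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Mapping, Set, Tuple


@dataclass(frozen=True)  # frozen: a record is not altered after creation
class SignalRecord:
    uso_id: str
    csi: str
    primary_origin_cmi: str
    origin_chain: Tuple[str, ...]
    origin_chain_codes: Tuple[str, ...]
    born_at: datetime


def _stand_in_id() -> str:
    # Placeholder only: the text calls for ULID / UUIDv7, not UUIDv4.
    return uuid.uuid4().hex


def create_signal(csi: str, producing_cmi: str,
                  sssr_csis: Set[str], zar_codes: Mapping[str, str],
                  new_id: Callable[[], str] = _stand_in_id) -> SignalRecord:
    """Create a signal instance from a registered CSI and CMI."""
    # CSI and CMI are never created here; they must already be registered.
    if producing_cmi not in zar_codes:
        raise LookupError(f"CMI not registered in ZAR: {producing_cmi}")
    if csi not in sssr_csis:
        raise LookupError(f"CSI not registered in SSSR: {csi}")
    return SignalRecord(
        uso_id=new_id(),                            # the only runtime-created ID
        csi=csi,
        primary_origin_cmi=producing_cmi,
        origin_chain=(producing_cmi,),              # initialized with the producer
        origin_chain_codes=(zar_codes[producing_cmi],),
        born_at=datetime.now(timezone.utc),
    )


record = create_signal(
    "comp.PEF-ME.OUTPUT.CO2E.v1_0",
    "comp.PEF-ME.ENGINE.CORE.1_1_0",
    sssr_csis={"comp.PEF-ME.OUTPUT.CO2E.v1_0"},
    zar_codes={"comp.PEF-ME.ENGINE.CORE.1_1_0": "MIE12"},
)
print(record.origin_chain_codes)  # prints ('MIE12',)
```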

7. Design Principles

Immutability of Identifiers

CSI, CMI, and USO IDs are never altered once issued

Separation of Concerns

CSI → semantics (SSSR)
CMI → execution (ZAR)
USO → runtime lineage

Deterministic Traceability

Every data point can be traced through its full processing chain

Human + Machine Readability

Identifiers are structured for both audit interpretation and system parsing

Documentation-Linked Identity

COMPONENT_ID aligns with frontmatter IDs, enabling direct traceability to system specifications

Compliance by Design

Supports:
- CSRD audit trails
- ESRS data quality requirements
- ISO 14064 traceability

8. Strategic Outcome - CIA

The Canonical Identifier Architecture transforms ZAYAZ into a system where:

Every ESG data point is uniquely identifiable, semantically defined, and fully traceable across its entire lifecycle.

This enables:

deterministic ESG reporting
verifiable supply chain transparency
AI-assisted explainability
regulatory-grade auditability

9. ZAR Data Model & Provenance Architecture

9.1. Overview

The ZAYAZ Artifact Registry (ZAR) Data Model defines how artifacts, signal instances, and their relationships are stored, linked, and governed across the platform.

It provides the structural foundation for:

artifact registration (CMI)
signal typing (CSI via SSSR)
runtime lineage (USO)
routing validation and policy enforcement (ZSSR)

The model is designed to ensure:

full traceability
deterministic replay
separation of design-time and runtime concerns
audit-grade data integrity

9.2. Architectural Separation

The ZAR data model is built on three distinct layers:

Layer Responsibilities

Layer	Responsibility
Design-Time	Defines signals (CSI) and artifacts (CMI)
Runtime	Tracks signal instances and lineage
Policy	Governs allowed processing paths

9.3. Design-Time: Artifact Registry (ZAR)

The ZAR registry maintains a catalog of all executable and structural artifacts in the system.

9.3.1. zar_cmi_registry

Stores all registered artifacts.

Field	Type	Description
cmi_id	PK	Internal identifier
cmi_name	text	Full CMI (`<module_code>.<COMPONENT_ID>.<CMI_KIND>.<ARTIFACT_NAME>.<MAJOR>_<MINOR>_<PATCH>`)
module_code	text	e.g. comp, vera, inpt
component_id	text	Frontmatter ID (e.g. PEF-ME)
kind	text	ENGINE, SCHEMA, MODEL, RULESET, etc.
name	text	Artifact name
version	text	1_0_0 or date-based
zar_code	text	Short Base32 identifier
description	text	Human-readable description
owner_team	text	Responsible team
runtime_class	text	Execution reference (container/class)
status	text	active / deprecated

9.3.2. zar_cmi_alias

Optional alias mapping for human or legacy references.

Field	Description
alias	Short alternative name
cmi_name	FK to registry
scope	human / internal / legacy

9.3.3. zar_cmi_capabilities

Defines capabilities of each artifact.

Field	Description
cmi_name	FK to registry
capability	e.g. ocr, validation, merkle_verify
value	JSON configuration

9.4. Runtime Layer: Signal Provenance

Runtime data is modeled at the signal instance level, ensuring that each occurrence of a signal can follow a unique processing path.

9.4.1. signal_instances

Represents each individual signal occurrence (USO layer).

Field	Type	Description
signal_instance_id	PK	UUID / ULID
csi	text	Canonical Signal Identifier
sssr_id	text	Optional SSSR reference
uso_name	text	Semantic class
born_at	timestamptz	Creation timestamp
source_ref	jsonb	External reference (file, tx_id, etc.)

9.4.2. signal_lineage_events

Tracks all processing steps applied to a signal instance.

Field	Type	Description
event_id	PK	Unique event ID
signal_instance_id	FK	Linked signal instance
cmi_name	FK	Processing artifact
occurred_at	timestamptz	Timestamp
input_refs	jsonb	Upstream references
output_refs	jsonb	Derived outputs
params	jsonb	Execution parameters
metrics	jsonb	Runtime metrics
trust_context_id	text	Validation context

9.4.3. trust_context_snapshots

Stores validation and scoring context.

Field	Type
trust_context_id	PK
effective_weights	JSON
thresholds	JSON
modifiers	JSON
version_tag	text
captured_at	timestamp

9.5. Policy Layer: Routing & Governance

The policy layer defines which artifacts are allowed to process specific signal types.

9.5.1. zar_signal_policy

Defines allowed or restricted processing paths.

Field	Description
uso_name	Semantic class
csi_prefix	`<module_code>.<COMPONENT_ID>.<KIND>`
allowed_cmi	Array of permitted artifacts
deny_cmi	Array of restricted artifacts
notes	Policy explanation
version	Policy version
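
A policy row can be evaluated with two small checks: whether the policy applies to a CSI, and whether a given artifact is permitted. The sketch below assumes semantics the table does not spell out: a deny entry overrides the allow-list, and a missing allow-list means no allow-list restriction. `SignalPolicy`, `applies_to`, `is_permitted`, and the `ghg_emissions` class name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SignalPolicy:
    uso_name: str
    csi_prefix: str                                  # <module_code>.<COMPONENT_ID>.<KIND>
    allowed_cmi: Optional[Tuple[str, ...]] = None    # None: no allow-list defined
    deny_cmi: Tuple[str, ...] = ()


def applies_to(policy: SignalPolicy, csi: str) -> bool:
    """True if the CSI falls under the policy's prefix (matched on whole segments)."""
    return csi.startswith(policy.csi_prefix + ".")


def is_permitted(policy: SignalPolicy, cmi: str) -> bool:
    """True if the artifact may process signals covered by the policy."""
    if cmi in policy.deny_cmi:        # assumption: deny takes precedence
        return False
    if policy.allowed_cmi is None:    # assumption: no allow-list, no restriction
        return True
    return cmi in policy.allowed_cmi


policy = SignalPolicy(
    uso_name="ghg_emissions",  # hypothetical semantic class
    csi_prefix="comp.PEF-ME.OUTPUT",
    allowed_cmi=("comp.PEF-ME.ENGINE.CORE.1_1_0",),
)
print(applies_to(policy, "comp.PEF-ME.OUTPUT.CO2E.v1_0"))        # prints True
print(is_permitted(policy, "vera.TG-CORE.ENGINE.VALIDATOR.1_0_0"))  # prints False
```

Appending the dot before the prefix comparison keeps `comp.PEF-ME.OUTPUT` from accidentally matching a different kind or component that merely starts with the same characters.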

9.6. End-to-End Data Flow

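The complete data model supports a deterministic lifecycle: artifacts and signal types are registered at design time (ZAR and SSSR), each signal occurrence is recorded at runtime (signal_instances), every processing step is appended as a lineage event (signal_lineage_events), and the recorded path can be checked against policy (zar_signal_policy).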

9.7. Key Properties of the Model

Separation of Concerns

SSSR → semantics (CSI)
ZAR → artifacts (CMI)
Runtime → lineage (USO)

Instance-Level Lineage

Each signal instance has its own processing history.

Many-to-Many Relationships

Signals can pass through multiple artifacts
Artifacts can process multiple signals

Replayability

Full reconstruction of any signal path is possible.
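
Reconstruction can be as simple as ordering one instance's lineage events by time. The sketch below works on in-memory rows shaped like `signal_lineage_events`; `LineageEvent` and `replay_path` are hypothetical names, and breaking timestamp ties by `event_id` is an assumption made here so the result is deterministic.

```python
from datetime import datetime, timezone
from typing import Iterable, List, NamedTuple


class LineageEvent(NamedTuple):
    event_id: str
    signal_instance_id: str
    cmi_name: str
    occurred_at: datetime


def replay_path(events: Iterable[LineageEvent], signal_instance_id: str) -> List[str]:
    """Return the ordered list of CMIs that processed one signal instance."""
    own = [e for e in events if e.signal_instance_id == signal_instance_id]
    # Assumption: ties on occurred_at are broken by event_id.
    own.sort(key=lambda e: (e.occurred_at, e.event_id))
    return [e.cmi_name for e in own]


t0 = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
t1 = datetime(2025, 1, 1, 12, 5, tzinfo=timezone.utc)
events = [
    LineageEvent("e2", "sig-1", "vera.TG-CORE.ENGINE.VALIDATOR.1_0_0", t1),
    LineageEvent("e1", "sig-1", "comp.PEF-ME.ENGINE.CORE.1_1_0", t0),
    LineageEvent("e3", "sig-2", "comp.PEF-ME.ENGINE.CORE.1_1_0", t0),
]
print(replay_path(events, "sig-1"))
# prints ['comp.PEF-ME.ENGINE.CORE.1_1_0', 'vera.TG-CORE.ENGINE.VALIDATOR.1_0_0']
```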

Policy vs History Separation

Policies define allowed behavior
Runtime stores actual behavior

9.8. Example Query Patterns

Lineage Trace

lineage-trace.sql
-- All processing steps recorded for the signal instance(s) that originate
-- from one source transaction, oldest first.
SELECT e.occurred_at, e.cmi_name, e.params
FROM signal_instances i
JOIN signal_lineage_events e
  ON e.signal_instance_id = i.signal_instance_id
WHERE i.source_ref->>'tx_id' = :tx_id
ORDER BY e.occurred_at;

Signal Discovery

signal-discovery.sql
-- All signal instances of one semantic class.
SELECT signal_instance_id, csi, born_at
FROM signal_instances
WHERE uso_name = :uso_name;

Policy Violation Detection

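policy-violation-detection.sql
-- Artifacts that processed signals of one semantic class in violation of
-- policy: missing from a defined allow-list, or present on the deny-list.
SELECT DISTINCT e.cmi_name
FROM signal_instances i
JOIN signal_lineage_events e
  ON e.signal_instance_id = i.signal_instance_id
JOIN zar_signal_policy p
  ON p.uso_name = i.uso_name
WHERE i.uso_name = :uso_name
  AND (
        (p.allowed_cmi IS NOT NULL
         AND NOT (e.cmi_name = ANY(p.allowed_cmi)))
     OR e.cmi_name = ANY(p.deny_cmi)
  );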

9.9. Strategic Outcome - Data Model

This data model enables ZAYAZ to operate as a fully traceable ESG computation infrastructure, where:

every signal instance is uniquely identifiable
every transformation is recorded
every artifact is governed
every decision is auditable

ZAR + SSSR + USO together form a deterministic ESG lineage system capable of supporting regulatory-grade assurance and AI-driven explainability at scale.

10. Graph Model & Knowledge Layer (Derived from ZAR)

While the relational ZAR data model provides the authoritative source of truth for artifacts, signal instances, lineage events, and policies, many of the most valuable ZAYAZ use cases require graph-native traversal rather than table-by-table querying.

Examples include:

tracing all upstream dependencies of a disclosure
finding which artifacts influenced a reported KPI
identifying all signals touched by a deprecated component
enabling ZARA to explain how a value was derived
federating lineage across multiple E-C-O entities and assurance domains

For these purposes, ZAYAZ should maintain a derived graph model on top of the relational core.

The relational model remains the canonical persistence layer.
The graph model acts as the traversal, reasoning, and explainability layer.

10.1. Why a Graph Layer Is Needed

Relational tables are ideal for:

integrity
transactional writes
schema governance
audit logging
deterministic persistence

Graphs are ideal for:

multi-hop lineage traversal
dependency analysis
impact analysis
explanation generation
policy path validation
cross-entity federation

A graph layer therefore gives ZAYAZ the ability to move from simple recordkeeping to navigable ESG intelligence.

10.2. Graph Design Principle

The graph model must follow one strict rule:

No graph node or edge may exist without a corresponding canonical source in the relational model.

This means:

CSI nodes derive from SSSR
CMI nodes derive from ZAR
signal instance nodes derive from signal_instances
lineage edges derive from signal_lineage_events
policy edges derive from zar_signal_policy
entity context derives from E-C-O references and runtime context

The graph is therefore:

derived
rebuildable
verifiable
non-authoritative for writes

10.3. Core Graph Entity Types

The graph model should contain the following primary node types.

A) SignalType

Represents a canonical semantic signal definition.

Derived from: SSSR
Primary key: csi

Examples:

comp.PEF-ME.OUTPUT.CO2E.v1_0
vera.TG-CORE.OUTPUT.TRUST_SCORE.v1_0

B) Artifact

Represents a registered executable or structural artifact.

Derived from: zar_cmi_registry
Primary key: cmi_name

Examples:

comp.PEF-ME.ENGINE.CORE.1_0_0
vera.TG-CORE.ENGINE.VALIDATOR.1_0_0

C) SignalInstance

Represents one runtime occurrence of a signal.

Derived from: signal_instances
Primary key: signal_instance_id / uso_id

D) LineageEvent

Represents a processing event in which an artifact touched or transformed a signal instance.

Derived from: signal_lineage_events
Primary key: event_id

E) Policy

Represents an allowed or denied routing/processing rule.

Derived from: zar_signal_policy

F) Entity

Represents a company, supplier, verifier, authority, or other context-bearing actor.

Derived from: E-C-O registry / runtime context
Primary key: eco_number

G) DocumentedComponent

Represents the manual-defined component referenced via frontmatter ID.

Derived from: component documentation / MDX frontmatter
Primary key: component_id

This node type is especially important because it connects the runtime system to the written specification and enables ZARA to explain components directly from the manual.

10.4. Core Relationship Types

The graph should model the following relationships.

Relationship	From	To	Meaning
INSTANCE_OF	SignalInstance	SignalType	This runtime instance is of this signal type
PRODUCED_BY	SignalInstance	Artifact	This signal instance was first produced by this artifact
TOUCHED_BY	SignalInstance	Artifact	This artifact processed or evaluated this instance
RECORDED_IN	SignalInstance	LineageEvent	Processing of this instance is recorded in this event
USED_ARTIFACT	LineageEvent	Artifact	This event executed this artifact
INPUT_TO	SignalInstance	LineageEvent	This signal instance was an input to this event
OUTPUT_FROM	SignalInstance	LineageEvent	This signal instance was produced from this event
ALLOWED_FOR	Policy	Artifact	This policy permits the artifact
DENIED_FOR	Policy	Artifact	This policy denies the artifact
APPLIES_TO	Policy	SignalType	This policy applies to this signal type or prefix
CONTEXT_OF	SignalInstance	Entity	This signal belongs to this E-C-O context
DOCUMENTED_AS	Artifact	DocumentedComponent	This artifact corresponds to this documented component
DEFINES	DocumentedComponent	SignalType	This documented component defines or emits this signal type
DEPENDS_ON	Artifact	Artifact	This artifact depends on another artifact

10.5. Conceptual Graph Structure

10.6. Graph Projection from the Relational Model

The graph should be materialized through a controlled projection pipeline.

Source tables

Relational Source	Graph Output
sssr_signals / signal registry	SignalType nodes
zar_cmi_registry	Artifact nodes
signal_instances	SignalInstance nodes
signal_lineage_events	LineageEvent nodes and traversal edges
zar_signal_policy	Policy nodes and policy edges
component frontmatter / docs index	DocumentedComponent nodes
E-C-O registry / runtime context	Entity nodes

Projection rules

Every registered CSI becomes one SignalType node.
Every registered CMI becomes one Artifact node.
Every runtime signal instance becomes one SignalInstance node.
Every lineage event becomes one LineageEvent node.
Every SignalInstance is connected to exactly one SignalType.
Every producing artifact is linked using PRODUCED_BY.
Every processing event creates USED_ARTIFACT, INPUT_TO, and OUTPUT_FROM edges where applicable.
Every documented component becomes a DocumentedComponent node keyed by frontmatter ID.
Every artifact with a matching component_id is connected to the documented component via DOCUMENTED_AS.
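
A subset of the projection rules above can be sketched as a pure function from relational rows to nodes and edges. This is an illustration, not the projection pipeline: rows are plain dicts using the column names from the tables in section 9, `project` is a hypothetical name, and only the INSTANCE_OF, RECORDED_IN, and USED_ARTIFACT edges are built. An instance or event that points at an unregistered CSI or CMI is rejected, in keeping with the rule that nothing may exist in the graph without a canonical source.

```python
from typing import Dict, Iterable, Set, Tuple

Node = Tuple[str, str]          # (node type, primary key)
Edge = Tuple[Node, str, Node]   # (from, relationship, to)


def project(csis: Iterable[str], cmi_names: Iterable[str],
            instances: Iterable[Dict[str, str]],
            events: Iterable[Dict[str, str]]) -> Tuple[Set[Node], Set[Edge]]:
    """Build graph nodes and edges from canonical relational rows."""
    nodes: Set[Node] = set()
    edges: Set[Edge] = set()
    nodes.update(("SignalType", csi) for csi in csis)
    nodes.update(("Artifact", cmi) for cmi in cmi_names)

    for row in instances:
        instance = ("SignalInstance", row["signal_instance_id"])
        signal_type = ("SignalType", row["csi"])
        if signal_type not in nodes:
            raise ValueError(f"instance references unregistered CSI {row['csi']}")
        nodes.add(instance)
        edges.add((instance, "INSTANCE_OF", signal_type))

    for row in events:
        event = ("LineageEvent", row["event_id"])
        instance = ("SignalInstance", row["signal_instance_id"])
        artifact = ("Artifact", row["cmi_name"])
        if instance not in nodes or artifact not in nodes:
            raise ValueError(f"event {row['event_id']} has no canonical source")
        nodes.add(event)
        edges.add((instance, "RECORDED_IN", event))
        edges.add((event, "USED_ARTIFACT", artifact))

    return nodes, edges


nodes, edges = project(
    csis=["comp.PEF-ME.OUTPUT.CO2E.v1_0"],
    cmi_names=["comp.PEF-ME.ENGINE.CORE.1_1_0"],
    instances=[{"signal_instance_id": "sig-1", "csi": "comp.PEF-ME.OUTPUT.CO2E.v1_0"}],
    events=[{"event_id": "e1", "signal_instance_id": "sig-1",
             "cmi_name": "comp.PEF-ME.ENGINE.CORE.1_1_0"}],
)
print(len(nodes), len(edges))  # prints "4 3"
```

Since the function has no state of its own, running it again over the same canonical rows yields the same graph, which is what makes the projection rebuildable.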

10.7. Why the Documentation Node Matters

The DocumentedComponent node is a strategic advantage for ZAYAZ.

It allows the platform to connect:

runtime identity
artifact identity
semantic signal identity
architecture documentation
AI explanation context

This means ZARA can answer questions such as:

“What does PEF-ME do?”
“Why did this artifact generate CO2E?”
“Which documented module is responsible for this trust score?”
“What policies or assumptions are defined for this component?”

Without this link, documentation remains passive. With this link, the manual becomes an active reasoning surface.

10.8. Example Graph Traversal Use Cases

A) Explain a reported disclosure

Start from a reported KPI and traverse:

SignalType ← SignalInstance ← LineageEvent ← Artifact ← DocumentedComponent

This supports board-ready and verifier-ready explanation flows.

B) Impact analysis for artifact deprecation

Start from a deprecated Artifact node and traverse to:

all SignalType nodes it emits or transforms
all SignalInstance nodes affected historically
all Policy nodes that reference it
all disclosures downstream

This supports safe migration and governance.
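
The deprecation traversal can be sketched over a plain edge list using the relationship names from section 10.4. This is a minimal illustration rather than a graph-database query: edges are `(from, relationship, to)` tuples, `deprecation_impact` is a hypothetical name, and downstream disclosures are left out because the model above does not define a node type for them.

```python
from typing import Collection, Dict, Set, Tuple

Node = Tuple[str, str]          # (node type, primary key)
Edge = Tuple[Node, str, Node]   # (from, relationship, to)


def deprecation_impact(edges: Collection[Edge], artifact: Node) -> Dict[str, Set[Node]]:
    """Collect events, instances, signal types, and policies tied to an artifact."""
    # Events that executed the artifact.
    events = {src for src, rel, dst in edges
              if rel == "USED_ARTIFACT" and dst == artifact}
    # Signal instances whose processing was recorded in those events.
    instances = {src for src, rel, dst in edges
                 if rel == "RECORDED_IN" and dst in events}
    # Signal types of the affected instances.
    signal_types = {dst for src, rel, dst in edges
                    if rel == "INSTANCE_OF" and src in instances}
    # Policies that allow or deny the artifact.
    policies = {src for src, rel, dst in edges
                if rel in ("ALLOWED_FOR", "DENIED_FOR") and dst == artifact}
    return {"events": events, "instances": instances,
            "signal_types": signal_types, "policies": policies}


core = ("Artifact", "comp.PEF-ME.ENGINE.CORE.1_1_0")
sample_edges = [
    (("LineageEvent", "e1"), "USED_ARTIFACT", core),
    (("SignalInstance", "sig-1"), "RECORDED_IN", ("LineageEvent", "e1")),
    (("SignalInstance", "sig-1"), "INSTANCE_OF", ("SignalType", "comp.PEF-ME.OUTPUT.CO2E.v1_0")),
    (("Policy", "p1"), "ALLOWED_FOR", core),
]
impact = deprecation_impact(sample_edges, core)
print(sorted(impact["signal_types"]))
# prints [('SignalType', 'comp.PEF-ME.OUTPUT.CO2E.v1_0')]
```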

C) Federated assurance traversal

Start from an Entity node and traverse across:

all signal instances in context
all artifacts that processed them
all trust and assurance events
all linked verifier outputs

This becomes a foundation for cross-ECO assurance and federation.

10.9. Suggested Graph Properties

SignalType

csi
module_code
component_id
kind
name
version
status
value_type
unit
framework_tags

Artifact

cmi_name
module_code
component_id
kind
name
version
zar_code
status
owner_team
runtime_class

SignalInstance

signal_instance_id
uso_id
csi
born_at
source_ref
trust_context_id

LineageEvent

event_id
occurred_at
params
metrics
trust_context_id

Policy

policy_key
uso_name
csi_prefix
version
notes

Entity

eco_number
entity_type
jurisdiction
status

DocumentedComponent

component_id
module_code
title
doc_path
status
owner_team

10.10. Recommended Graph Storage Strategy

ZAYAZ should treat the graph as a derived operational knowledge layer, not as a replacement for the relational core.

Recommended pattern:

Write canonical records to relational stores first.
Publish projection events when CSI, CMI, signal instance, lineage event, or policy records change.
Update the graph projection asynchronously.
Rebuild the graph from canonical stores when required.

This provides:

integrity from the relational layer
speed and flexibility from the graph layer
safe rebuildability
deterministic explainability

10.11. Governance Rules for the Graph Layer

The graph layer must obey the following rules:

Derived Only: No manual graph-only nodes or edges.
Source Traceability: Every node and edge must store a source reference to its relational origin.
Rebuildability: The graph must be reconstructible from canonical tables.
Version Awareness: CSI and CMI versions must be preserved as explicit properties.
Tenant / Entity Isolation: Cross-entity traversals must obey RBAC, verifier permissions, and federation policies.
AI Explainability Compatibility: Nodes and edges should remain interpretable by ZARA and ZAAM.

10.12. Strategic Outcome

By adding a graph layer on top of the ZAR relational model, ZAYAZ gains a new capability:

Not only storing ESG lineage, but navigating, explaining, and reasoning across it.

This turns ZAR from a registry into a live assurance intelligence fabric.

The relational model guarantees integrity. The graph model unlocks traversal. Together, they form the foundation for:

explainable ESG intelligence
cross-module dependency mapping
advanced traceability
federated assurance networks
documentation-aware AI guidance
