Statistical Inference Engine
1. Identity
<Identity meid="MEID_STAT01" />
Depends on module:
Background
Hypothesis testing can become a core trust, validation, and intelligence layer inside ZAYAZ if implemented correctly.
But it needs to be deeply integrated into the architecture (MICE, DICE, VTE, ZARA, SIS) — not treated as a generic statistics feature.
🧠 Where Hypothesis Testing Fits in ZAYAZ (Strategic View)
ZAYAZ is already:
- Signal-driven (SSSR)
- Validation-driven (DICE, DaVE, VTE)
- AI-governed and audit-logged (ALTD, AIGS) 
Hypothesis testing adds a formal statistical decision layer on top of this:
👉 From:
“This looks wrong / anomalous”
👉 To:
“We reject H₀ with 99% confidence: this supplier’s emissions trend is statistically inconsistent with historical baseline”
That shift is massive for CSRD-grade credibility.
🔧 Core Use Cases (High-Impact)
1. 📊 Data Validation & Anomaly Detection (DICE + VTE Enhancement)
Problem today:
- Outliers flagged heuristically
- Trust scores based on rules + AI
Add hypothesis testing:
- Null hypothesis (H₀): Data follows expected distribution
- Alternative (H₁): Data deviates significantly
Example:
- Electricity consumption spike
- Test: Z-score / t-test / Bayesian posterior deviation
Result:
- Instead of “flagged anomaly”
- You get:
- p-value
- confidence interval
- statistical justification
👉 This directly strengthens auditability + verifier trust
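As a minimal stdlib-only sketch of that path (function name and baseline values are hypothetical), a two-sided z-test against a validated historical baseline could return both the z-score and the p-value that would be logged alongside the anomaly flag:

```python
import math

def z_test_two_sided(observed, baseline_mean, baseline_std):
    """Two-sided z-test of a single observation against a known baseline.

    Assumes baseline_mean / baseline_std come from a validated historical
    distribution (e.g. prior reporting periods). Returns (z, p_value).
    """
    z = (observed - baseline_mean) / baseline_std
    # Two-sided p-value via the standard-normal survival function
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical electricity-consumption spike vs. historical baseline
z, p = z_test_two_sided(observed=152.0, baseline_mean=100.0, baseline_std=15.0)
# z ≈ 3.47, p well below 0.01 → reject H₀, emit p-value + flag
```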
2. 🔍 Scope 3 Estimation Validation (SEM + Bayesian Engines)
You already:
- Use SEM extrapolation in ZARA 
- Run probabilistic modeling (Computation Hub)
Hypothesis testing can:
- Validate extrapolated values vs known distributions
- Compare supplier-reported vs model-estimated values
Example:
- H₀: Supplier emissions = expected industry mean
- H₁: Supplier deviates significantly
👉 Output:
- “Supplier emissions 27% higher than sector baseline (p < 0.01)”
This becomes:
- Benchmarking
- Risk scoring
- Materiality signal
3. 🏭 Supplier / Value Chain Benchmarking Engine
Integrate into:
- NORM + AGGR + RMAP micro engines 
Hypothesis-driven benchmarking:
- Compare:
- Company vs sector (NACE)
- Supplier vs peers
- Region vs global baseline
Example:
- H₀: Company is within normal emission range for NACE code
- Reject → triggers:
- Risk flag
- Governance escalation
- ZARA explanation
👉 This is next-gen ESG intelligence, not just reporting
4. 📈 Impact Measurement & ESG Strategy Validation
ZAYAZ tracks:
- Goals, KPIs, timelines 
Hypothesis testing enables:
Example:
- “Did our sustainability initiative reduce emissions?”
- H₀: No effect
- H₁: Reduction occurred
Use:
- A/B testing sustainability actions
- Policy effectiveness validation
- CAPEX justification
👉 This is financial + ESG convergence
5. 🧪 AI Governance & Model Validation (CRITICAL)
You already require:
- Model validation
- Drift detection
- Audit logs
Hypothesis testing should be embedded into:
AI Validation SOP:
- H₀: Model performance unchanged
- H₁: Model drift detected
Used for:
- Retraining triggers
- Bias detection
- Model degradation alerts
👉 This aligns perfectly with:
- EU AI Act
- CSRD AI traceability
6. 🎯 Materiality & Stakeholder Intelligence (SEEL)
Materiality today:
- Qualitative + scoring
Add:
- Statistical significance testing on stakeholder inputs
Example:
- H₀: Issue not materially significant
- H₁: Stakeholder concern statistically significant
👉 Outcome:
- Defensible double materiality decisions
🧱 Architectural Implementation (ZAYAZ-Native)
New Micro Engine Category
Add:
MEID_STATXX_v1 → Statistical Inference Engine
Core capabilities:
- Hypothesis testing (t-test, chi-square, Bayesian)
- Confidence intervals
- Distribution fitting
- Significance scoring
Integrates with:
- CALC (computation)
- VALI (validation)
- CFIL (confidence filter)
Integration Map
| Layer | Role |
|---|---|
| SSSR | Define expected distributions per signal |
| MICE (STAT Engine) | Run tests |
| DICE | Combine with rule validation |
| VTE | Convert to trust score impact |
| ZARA | Explain results in plain language |
| ALTD | Log hypothesis + result for audit |
Example Data Structure
{
"signal_id": "ghg_scope2_energy",
"test_type": "z_test",
"null_hypothesis": "within_expected_range",
"p_value": 0.003,
"confidence": 0.997,
"result": "reject_null",
"impact": "high_risk_flag",
"logged_at": "timestamp"
}
🚀 Strategic Advantage (This is the Big One)
If you implement this properly, ZAYAZ becomes:
❌ Not:
- ESG reporting tool
- ESG data platform
✅ But:
A statistically defensible ESG decision engine
That means:
- Verifiers trust it
- Regulators respect it
- CFOs use it
- Boards rely on it
⚠️ Important Caveats (Design Principles)
- “Precision Before Automation”
You already state this clearly 
👉 Hypothesis testing must:
- Use validated distributions (IPCC, Ecoinvent, etc.)
- Be explainable
- Never be “black box stats”
- Avoid Misuse
Not all ESG data fits classical stats:
- Small sample sizes
- Missing data
- Non-normal distributions
👉 Solution:
- Bayesian methods > classical in many cases
- UX is Critical (ZARA Layer)
Never show:
“p = 0.03”
Instead:
“This value is statistically unlikely compared to expected patterns (97% confidence).”
🔮 Next-Level Extensions (Where This Gets Powerful)
- Real-time anomaly detection across supply chains
- Predictive compliance breach detection
- Dynamic ESG risk pricing (insurance / finance)
- Automated auditor assistance (“statistical red flags”)
- Carbon market validation (credit integrity scoring)
🧭 Final Verdict
👉 Hypothesis testing is not just useful — it is a foundational upgrade to:
- Trust engine (VTE)
- Validation layer (DICE)
- AI governance (AIGS)
- Decision intelligence (ZARA)  
1. Engine Identity
- Engine ID: MEID_STAT01_v1
- Readable Name: Statistical Inference Engine
- Category Code: STAT
- Domain: Statistical validation, inference, significance testing, uncertainty quantification
- Primary Hub: Computation Hub
- Secondary Consumers: Input Hub, Reports & Insights Hub, Shared Intelligence Stack
- Lifecycle Status: Proposed
- Risk Class: Medium by default, High when outputs influence compliance decisions, verifier workflows, or AI-triggered escalations. This fits ZAYAZ’s AI governance model, where higher-impact AI/statistical modules require stronger oversight and logging. 
2. Strategic Purpose
MEID_STAT01_v1 provides a formal statistical decision layer for ZAYAZ.
It converts weak statements like:
- “this looks abnormal”
- “this trend may be suspicious”
- “this estimate seems high”
into structured, defensible outputs like:
- “baseline consistency rejected at significance threshold 0.01”
- “reported value falls outside expected peer distribution”
- “post-intervention change is statistically meaningful”
- “model drift likely based on distributional shift”
This engine is not meant to replace rule validation, AI reasoning, or verifier judgment. It is meant to strengthen them with reproducible inference.
3. Placement in ZAYAZ Architecture
3.1 Role in MICE
ZAYAZ already defines Micro Engines as modular computation units invoked through structured routing and rule logic, with categories such as CALC, VALI, NORM, AGGR, RCAS, and CFIL. MEID_STAT01_v1 should be added as a first-class engine category in that same family. 
3.2 Architectural Dependencies
Consumes from:
- SSSR signal metadata
- NACE / sector / geography context
- historical observations
- peer benchmarks
- input-source trust metadata
- DICE validation outputs
- telemetry event streams
- AI model validation data
Feeds into:
- DICE
- VTE / trust logic
- ZARA explanations
- ZAAM inline guidance
- ALTD audit trails
- Reports Hub visualizations
- AI governance validation logs
3.3 Positioning
- DICE answers: “Is the data structurally valid?”
- STAT answers: “Is the data statistically credible?”
- VTE answers: “How should that affect trust?”
- ZARA answers: “What does it mean in human language?”
- ALTD answers: “Can we prove what happened later?”
That separation matches the modular and governed approach already defined across ZAYAZ.   
4. Core Objectives
MEID_STAT01_v1 shall support six primary objectives:
- Outlier and anomaly significance testing: for values, trends, ratios, and distributions.
- Benchmark comparison testing: company vs peer group, site vs portfolio, supplier vs sector baseline, region vs region.
- Pre/post intervention testing: to assess whether a policy, capex action, training, or operational change had a measurable effect.
- Estimation plausibility testing: especially for SEM/extrapolated values, inferred Scope 3 values, and partially imputed datasets.
- Distribution shift / drift detection: for AI model governance, telemetry quality, or changing operational patterns.
- Uncertainty packaging: confidence intervals, credible intervals, posterior probabilities, support strength, and test quality metadata.
5. Supported Use Cases
5.1 ESG Data Quality
- unusual energy use
- improbable waste intensity
- water-use spikes
- implausible year-over-year jumps
- inconsistent supplier reporting
5.2 Scope 1, 2, 3 Emissions
- compare reported emissions to expected ranges by activity / NACE / geography
- test whether supplier-reported factors are materially inconsistent with known baselines
- test whether modeled estimates are statistically plausible
5.3 Materiality and Stakeholder Signals
- identify whether stakeholder issue concentration is statistically meaningful across segments
- detect whether issue salience differs by geography or stakeholder class
5.4 AI Governance
ZAYAZ’s governance material already requires structured validation, drift checks, supervised retraining, logging, and human oversight for more critical modules. STAT should become one of the standard inference backbones behind those checks. 
5.5 Verification Support
- produce machine-readable “statistical red flags”
- support verifiers with ranked review candidates
- distinguish “structurally valid but statistically suspicious” from “invalid”
6. Operating Principles
6.1 Precision Before Automation
The ZAYAZ manual explicitly prioritizes trust, explainability, and traceability over blind automation. STAT must inherit that principle directly. 
6.2 No Silent Statistical Decisions
Every material test must log:
- test type
- hypothesis definition
- sample context
- thresholds used
- assumptions
- result
- confidence / support level
- downstream effect
6.3 Test Appropriateness First
The engine must not force classical null-hypothesis testing where assumptions are weak. In many ESG contexts:
- samples are small
- data is skewed
- observations are missing
- sources are mixed quality
- peer groups are uneven
Therefore the engine must support both:
- classical tests
- Bayesian / resampling / robust alternatives
6.4 Explainability Layer Required
No raw p-values should be surfaced to ordinary users without contextual interpretation. ZARA/ZAAM should translate them into decision-grade language. This aligns with ZAYAZ’s agent architecture and explainability goals.  
7. Supported Test Families
7.1 Baseline Statistical Families
Difference tests
- one-sample t-test
- two-sample t-test
- Welch t-test
- paired t-test
- Mann–Whitney U
- Wilcoxon signed-rank
Proportion / categorical tests
- chi-square goodness-of-fit
- chi-square independence
- Fisher exact test
- z-test for proportions
Distribution / consistency tests
- Kolmogorov–Smirnov
- Anderson–Darling
- Shapiro-style normality checks only for internal suitability checks
- population stability / drift indices
Variance / dispersion tests
- Levene / Brown-Forsythe
- F-test only when justified
Time-series / change tests
- changepoint detection
- CUSUM-type detection
- drift detection on residuals
- rolling z-score / robust z-score
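The robust z-score listed above can be sketched with a median/MAD estimator; this is an illustrative stdlib-only sketch, not the production implementation:

```python
import statistics

def robust_zscore(values, x):
    """Median/MAD-based z-score: resistant to the very outliers it is
    meant to detect, unlike a mean/stdev z-score.

    The constant 1.4826 scales MAD to the standard deviation under
    an assumed normal distribution.
    """
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    return (x - med) / (1.4826 * mad)

# A single extreme value barely shifts the median/MAD baseline,
# so the outlier still scores very high.
history = [10, 11, 9, 10, 12, 10, 11]
score = robust_zscore(history, 25)
```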
7.2 Bayesian / Robust Families
Preferred for many ESG applications:
- posterior probability of exceedance
- Bayesian mean comparison
- credible intervals
- prior-updated peer expectation models
- bootstrap confidence intervals
- permutation testing
- robust median-based deviation testing
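As an illustrative sketch of the bootstrap-interval family above (parameters and seed handling are hypothetical), a percentile bootstrap CI for the mean needs no normality assumption, which suits the small, skewed samples common in ESG data:

```python
import random
import statistics

def bootstrap_ci(sample, n_boot=2000, level=0.95, seed=7):
    """Percentile bootstrap confidence interval for the mean.

    Resamples with replacement, then reads the interval off the sorted
    distribution of resampled means. No distributional assumptions.
    """
    rng = random.Random(seed)  # fixed seed for reproducible audit runs
    means = sorted(
        statistics.fmean(rng.choices(sample, k=len(sample)))
        for _ in range(n_boot)
    )
    lo = means[int((1 - level) / 2 * n_boot)]
    hi = means[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi
```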
7.3 Special ZAYAZ Modes
- Plausibility mode: for checking if a value is plausible under known distributions.
- Comparative mode: for comparing entities, suppliers, sites, or years.
- Impact mode: for testing if a change initiative had a measurable effect.
- Drift mode: for model governance and telemetry monitoring.
- Assurance mode: for verifier-facing support packets.
8. Input Contract
8.1 Required Input Envelope
{
"engine_id": "MEID_STAT01_v1",
"mode": "plausibility",
"signal_id": "ghg_scope2_market_based",
"entity_id": "eco196123456789",
"reporting_period": "2025",
"comparison_scope": {
"peer_group_id": "nace_c25_eu_midcap",
"geography": "EU",
"sector_code": "C25"
},
"dataset_ref": "zar://dataset/....",
"hypothesis_template_id": "HT_STAT_PLAUS_003",
"significance_profile_id": "SIGPROF_SCOPE2_STANDARD",
"context": {
"source_mix": ["erp", "invoice", "manual"],
"sample_size": 38,
"input_trust_score": 0.84,
"estimation_flag": false
}
}
8.2 Input Sources
- structured tables
- registry-linked observations
- time-series
- peer benchmark extracts
- imputation outputs
- AI model validation logs
- verifier-reviewed samples
8.3 Required Metadata from SSSR
Each signal eligible for STAT must support additional metadata in SSSR:
- stat_test_eligible
- recommended_test_families
- expected_distribution_type
- minimum_sample_policy
- peer_group_strategy
- significance_profile_default
- escalation_policy_id
- explainability_template_id
SSSR already functions as the smart metadata backbone for signals, so this is the correct place to store statistical eligibility and routing metadata. 
9. Hypothesis Template Registry
A dedicated registry should be created, for example:
stat_hypothesis_templates
Suggested fields:
- template_id
- template_name
- signal_type
- test_family
- null_hypothesis_text
- alternative_hypothesis_text
- assumptions
- fallback_test_family
- default_alpha
- bayesian_supported
- effect_size_required
- human_explanation_template
- verifier_explanation_template
- status
- version
Example
{
"template_id": "HT_STAT_PLAUS_003",
"signal_type": "energy_intensity",
"test_family": "robust_zscore_plus_bootstrap",
"null_hypothesis_text": "The observed value is consistent with expected peer-range behavior for the selected comparison scope.",
"alternative_hypothesis_text": "The observed value is not consistent with expected peer-range behavior.",
"default_alpha": 0.01,
"bayesian_supported": true,
"effect_size_required": true
}
10. Significance Profiles
Statistical thresholds should not be hardcoded globally. They should be controlled by policy profiles.
stat_significance_profiles
Fields:
- profile_id
- name
- alpha_default
- effect_size_floor
- minimum_sample_size
- multiple_testing_policy
- bayesian_probability_threshold
- high_risk_override
- verifier_review_required
- human_approval_required
- status
Example profiles
- SIGPROF_LOW_IMPACT_MONITORING
- SIGPROF_SCOPE3_ESTIMATION
- SIGPROF_AUDIT_ESCALATION
- SIGPROF_AI_DRIFT_HIGH_RISK
This is consistent with ZAYAZ’s governance pattern of explicit thresholds, risk registers, and structured review gates. 
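Where a multiple_testing_policy covers batches of related tests (e.g. a portfolio scan over many signals), one candidate policy is Benjamini–Hochberg FDR control; this is a sketch of one option, not a mandated choice:

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini–Hochberg false-discovery-rate control.

    Finds the largest rank k such that p_(k) <= (k/m) * alpha, then
    rejects the k smallest p-values. Returns the set of rejected
    test indices (positions in the input list).
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            k = rank
    return {order[j] for j in range(k)}

# Two of three batch tests survive FDR control at alpha = 0.05
rejected = benjamini_hochberg([0.01, 0.02, 0.20], alpha=0.05)
```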
11. Output Contract
11.1 Standard Output
{
"engine_id": "MEID_STAT01_v1",
"run_id": "statrun-2026-04-04-000184",
"signal_id": "ghg_scope2_market_based",
"mode": "plausibility",
"test_family_used": "welch_t_plus_bootstrap",
"hypothesis": {
"null": "Observed value is consistent with peer baseline.",
"alternative": "Observed value differs materially from peer baseline."
},
"sample": {
"n_observed": 38,
"n_peer": 412,
"quality_flag": "acceptable"
},
"results": {
"decision": "reject_null",
"p_value": 0.004,
"effect_size": 0.71,
"confidence_interval": [0.19, 0.54],
"posterior_exceedance_probability": 0.973
},
"quality": {
"assumption_fit": "moderate",
"fallback_used": true,
"multiple_testing_adjusted": false
},
"impact": {
"trust_delta": -0.12,
"risk_flag": "high",
"escalation_triggered": true,
"recommended_action": "verifier_review"
},
"explainability": {
"user_message": "This value is statistically unusual compared with similar entities in the selected benchmark group.",
"verifier_message": "Observed value materially exceeds expected peer baseline under selected comparison profile."
},
"audit": {
"logged_to_altd": true,
"model_or_engine_version": "MEID_STAT01_v1",
"timestamp_utc": "2026-04-04T08:12:14Z"
}
}
11.2 Output Classes
- supports_null
- rejects_null
- inconclusive
- insufficient_sample
- assumption_failure
- fallback_applied
The engine must explicitly distinguish “no evidence of difference” from “not enough evidence.”
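One way to sketch that distinction, using the output classes above (the parameter names and ordering of checks are hypothetical):

```python
def classify_result(n, min_n, assumptions_ok, fallback_used, p_value, alpha):
    """Map a run to an output class.

    'insufficient_sample' (not enough evidence to test) is kept distinct
    from 'inconclusive' (test ran but gave no usable decision), and a
    clean non-rejection is reported explicitly as 'supports_null' rather
    than silently passing.
    """
    if n < min_n:
        return "insufficient_sample"   # never pretend a test was possible
    if not assumptions_ok and not fallback_used:
        return "assumption_failure"    # preferred test invalid, no fallback ran
    if p_value is None:
        return "inconclusive"          # test executed, no usable decision
    return "rejects_null" if p_value < alpha else "supports_null"
```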
12. Routing and Invocation Logic
12.1 Trigger Sources
- DICE validation suspicion
- VTE trust reassessment
- ZARA prompted analysis
- ZAAM inline agent help
- telemetry anomaly event
- verifier workflow request
- scheduled periodic scans
- AI validation schedule
12.2 ZADIF / Rule Engine Invocation
The ZAYAZ architecture already uses routing logic, agent dispatching, and rule-driven activation. STAT should be routable through the same dispatch pattern.  
Example:
{
"dispatch_condition": {
"signal_type": "ghg_emission",
"trust_score_below": 0.88,
"reporting_context": "csrd",
"sample_size_min": 12
},
"target_engine": "MEID_STAT01_v1",
"mode": "plausibility"
}
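A hypothetical matcher for dispatch conditions of this shape; the `_below` / `_min` suffix convention is an assumption for illustration, not confirmed ZADIF semantics:

```python
def matches_dispatch(condition: dict, event: dict) -> bool:
    """Evaluate a dispatch_condition against a signal event.

    Keys ending in '_below' are upper-bound checks, keys ending in
    '_min' are lower-bound checks; everything else is an exact match.
    """
    for key, expected in condition.items():
        if key.endswith("_below"):
            if event.get(key.removesuffix("_below"), float("inf")) >= expected:
                return False
        elif key.endswith("_min"):
            if event.get(key.removesuffix("_min"), 0) < expected:
                return False
        elif event.get(key) != expected:
            return False
    return True

condition = {"signal_type": "ghg_emission", "trust_score_below": 0.88,
             "reporting_context": "csrd", "sample_size_min": 12}
event = {"signal_type": "ghg_emission", "trust_score": 0.81,
         "reporting_context": "csrd", "sample_size": 38}
# event matches → route to MEID_STAT01_v1 in plausibility mode
```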
13. Interaction with DICE and VTE
13.1 DICE
DICE remains first-pass structural validator. STAT should only run when:
- data passes minimum structural validation, or
- DICE explicitly requests a statistical diagnostic path
13.2 VTE
STAT should not directly set trust scores. It should output structured evidence that VTE uses.
Recommended contribution model:
- abnormality severity
- effect size
- assumption quality
- sample adequacy
- source-trust interaction
- repeat anomaly history
That keeps inference and trust logic separate, which is cleaner and more auditable.
14. Interaction with ZARA and ZAAM
14.1 ZARA
ZARA should use STAT outputs for narrative explanations such as:
- why a signal is suspicious
- why a benchmark comparison matters
- why additional evidence is required
ZARA is already positioned as the prompt-driven orchestration and reporting intelligence layer, so it should consume STAT outputs as explainable evidence objects. 
14.2 ZAAM
ZAAM agents can trigger or explain STAT in context:
- Form Assistant: “This value looks unusually high for your peer group.”
- Compliance Scout: “This trend could require additional disclosure attention.”
- Trust Guardian: “The statistical evidence reduced trust because the result diverges from prior validated patterns.”
That aligns with ZAAM’s scoped, trust-aware, role-aware agent model. 
15. Audit and Governance Requirements
All material STAT runs must write to ALTD-compatible audit records. ZAYAZ already emphasizes tamper detection, audit readiness, and governed AI lifecycle records.  
Required audit fields
- run_id
- engine_id
- engine_version
- signal_id
- entity_id
- dataset_ref
- test_family_used
- hypothesis_template_id
- significance_profile_id
- thresholds_applied
- result_decision
- human_review_required
- downstream_effects
- initiated_by
- timestamp_utc
Governance flags
- used_for_compliance_decision
- used_for_ai_validation
- verifier_visible
- human_approved
- high_risk_path
16. Failure Modes and Fallbacks
16.1 Failure Classes
- insufficient sample
- missing benchmark group
- incompatible distribution assumptions
- poor data completeness
- excessive missingness
- conflicting source populations
- no approved significance profile
16.2 Fallback Policy
Fallback order:
- preferred test family
- robust non-parametric equivalent
- bootstrap/permutation
- descriptive-only mode
- escalate as “statistical inconclusive”
The engine must never fabricate certainty.
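The fallback order might be driven by a simple chain like this sketch, where `AssumptionError` and the runner interface are hypothetical:

```python
class AssumptionError(Exception):
    """Raised when a test's preconditions (sample size, distribution,
    assumption fit) are not met."""

def run_with_fallback(runners, data):
    """Walk the configured fallback order.

    Each runner is a (name, callable) pair that returns a result dict
    or raises AssumptionError. If every runner fails, escalate as
    statistically inconclusive rather than fabricating certainty.
    """
    for name, runner in runners:
        try:
            result = runner(data)
            result["test_family_used"] = name
            result["fallback_used"] = name != runners[0][0]
            return result
        except AssumptionError:
            continue  # try the next family in the fallback order
    return {"decision": "statistical_inconclusive", "escalate": True}
```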
17. Security and Compliance
17.1 Privacy
STAT outputs must avoid exposing unnecessary peer-level details if peer groups are confidential.
17.2 Human Oversight
If STAT materially affects:
- compliance filings
- verifier routing
- AI retraining authorization
- public-facing trust displays
then review gates should apply according to ZAYAZ’s existing governance approach for higher-risk AI logic. 
17.3 EU AI Act Alignment
STAT itself is not a full AI model in all cases, but once combined with automated decision routing, trust scoring, or self-healing actions, it enters the governed AI/decision-support zone and should inherit AIGS/AI governance controls.
18. Data Model Additions
Recommended tables:
stat_engine_registry
- engine_id
- version
- status
- supported_modes
- supported_test_families
- default_config
- mode_docs_url
stat_hypothesis_templates
As above.
stat_significance_profiles
As above.
stat_run_log
Execution records.
stat_benchmark_profiles
Defines peer group construction logic.
stat_signal_policy_map
Maps signal IDs to allowed tests, thresholds, peer strategies, and escalation policies.
19. API Draft
POST /api/mice/stat/run
Executes a statistical inference run.
POST /api/mice/stat/explain
Returns user / verifier / board phrasing for a completed run.
GET /api/mice/stat/run/{run_id}
Returns audit-grade run details.
POST /api/mice/stat/batch
Batch mode for portfolio scans, annual pre-checks, or verifier preparation.
POST /api/mice/stat/validate-profile
Validates significance profiles and test mappings before activation.
20. MVP Scope
Phase 1 MVP
Start with:
- plausibility mode
- comparative mode
- drift mode
- robust z-score
- Welch t-test
- Mann–Whitney U
- chi-square
- bootstrap CI
- descriptive fallback
- ALTD logging
- VTE evidence output
- ZARA explanation strings
Phase 2
Add:
- Bayesian comparison
- changepoint detection
- intervention impact mode
- multiple testing correction profiles
- verifier evidence packets
- dashboard components
Phase 3
Add:
- adaptive priors by NACE/geography
- federated peer baselines
- cross-entity anomaly propagation
- automated materiality significance models
APPENDIX A - ESRS Metrics Most Likely to Benefit from Hypothesis Testing
Highest-value metric families
A. Climate and energy
Best candidates because they are numeric, recurring, comparable, and often benchmarkable.
- Scope 1 emissions
- Scope 2 emissions
- Scope 3 category values
- electricity consumption
- fuel consumption
- energy intensity ratios
- renewable energy share
Why: high recurrence, strong peer comparability, good anomaly-detection value.
B. Water
- total withdrawal
- discharge volumes
- recycled/reused water share
- water intensity per unit output
Why: site-level trend testing and peer comparison are often valuable.
C. Waste and circularity
- hazardous waste
- non-hazardous waste
- waste diverted from disposal
- recycling rates
- recovery rates
- material efficiency ratios
Why: distributions are often skewed, so robust methods are useful.
D. Workforce / S metrics
- injury frequency
- absenteeism
- training completion
- diversity ratios
- turnover
Why: less suited to raw outlier testing than climate data, but useful for proportions and trend shifts.
E. Governance / process metrics
Often lower statistical value unless repeated over many entities or periods:
- policy coverage
- training completion
- incident counts
- whistleblower case patterns
Why: many are binary or low-frequency, so use categorical or proportion-based methods only.
Best fit categories for Phase 1
- energy and emissions
- water
- waste
- selected workforce ratios
Lower-value categories for initial rollout
- narrative disclosures
- one-time governance statements
- policy existence flags
- low-frequency event fields
These are better handled by rule logic, traceability, and document validation than by formal hypothesis tests.
APPENDIX B - Draft “Statistical Trust Score” Layer for VTE
Purpose
Convert statistical evidence into a bounded contribution to trust, without letting statistics dominate source quality, verification state, or audit provenance.
Principle
STAT informs trust; it does not own trust.
Proposed subscore
Create a VTE-compatible subscore:
statistical_trust_component = 0.00 to 1.00
Inputs
- test decision strength
- effect size magnitude
- confidence / posterior support
- sample adequacy
- assumption quality
- data completeness
- peer-group relevance
- repeat anomaly history
- whether value is estimated or directly observed
Example weighted model
STC =
0.20 * decision_strength
+ 0.15 * effect_size_quality
+ 0.15 * sample_adequacy
+ 0.10 * assumption_quality
+ 0.10 * completeness_quality
+ 0.10 * peer_group_fit
+ 0.10 * source_integrity_interaction
+ 0.10 * anomaly_history_modifier
Interpretation
- 0.85–1.00: statistically well-supported
- 0.65–0.84: acceptable / monitor
- 0.40–0.64: weak statistical confidence
- 0.00–0.39: statistically problematic / escalate
Decision strength mapping
Example:
- supports_null strongly: 0.95
- inconclusive: 0.55
- rejects_null moderately: 0.35
- rejects_null strongly with large effect: 0.15
This mapping may look inverted, but the point is that trust falls when the statistical evidence suggests abnormality relative to expectation.
Guardrails
- never reduce trust purely from one weak test
- require stronger effect for high-volatility metrics
- reduce penalty when source is verified and benchmark fit is weak
- increase penalty for repeated anomalies across periods
- cap maximum trust delta from STAT alone, for example ±0.15 per run
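A sketch of the bounded contribution using the example weighted model above, assuming every input is already normalized to 0.00–1.00; the 0.65 neutral baseline is a hypothetical parameter, and the ±0.15 cap follows the guardrail above:

```python
def statistical_trust_component(inputs: dict, weights: dict) -> float:
    """Weighted sum of normalized evidence inputs, clamped to [0, 1].

    `weights` is the example model above (e.g. decision_strength: 0.20,
    effect_size_quality: 0.15, ...); `inputs` supplies the matching
    normalized values.
    """
    stc = sum(weights[k] * inputs[k] for k in weights)
    return max(0.0, min(1.0, stc))

def capped_trust_delta(stc, baseline=0.65, cap=0.15):
    """Convert STC into a trust delta relative to a neutral baseline,
    capped at ±0.15 per run so STAT informs trust but never owns it."""
    return max(-cap, min(cap, stc - baseline))
```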
Recommended VTE integration
Final VTE score might combine:
- source provenance
- structural validation
- verifier status
- historical consistency
- statistical trust component
- AI-origin penalty/adjustment
That aligns well with ZAYAZ’s trust-centric architecture and explainable validation model.  
APPENDIX C - ZAYAZ Statistical Inference Layer
Technical Implementation Pack v0.1
C.1. Scope of this package
This package defines five implementation layers:
- SQL table schemas
- JSON schemas for API input/output
- SSSR field additions
- VTE integration logic
- Example statistical test designs for 5 ESRS-relevant metric families
This is an architectural draft, not a locked final schema. The goal is to make the first implementation:
- auditable
- modular
- backward-compatible
- explainable
- safe to deploy in stages
C.2. Core architecture overview
C.2.1. Proposed runtime flow
FOGE / API / Import / Telemetry / Verifier Request
↓
DICE
↓
Rule Engine / ZADIF
↓
MEID_STAT01_v1
↓
Statistical Result Object
↓
VTE Trust Logic
↓
ZARA / ZAAM Explanation Layer
↓
ALTD / Audit + Reports Hub
C.2.2. Design principle
STAT should behave like a governed evidence engine, not a black-box scoring engine.
It should:
- evaluate statistical consistency
- package uncertainty explicitly
- return bounded evidence objects
- avoid direct final decisions where governance requires human review
That matches the broader ZAYAZ governance and trust philosophy already described in the manuals.  
C.3. SQL schema pack
Below is a practical relational design for Postgres-style deployment.
C.3.1. stat_engine_registry
Purpose: register statistical engines and supported modes.
CREATE TABLE stat_engine_registry (
engine_id VARCHAR(50) PRIMARY KEY,
readable_name VARCHAR(255) NOT NULL,
version VARCHAR(30) NOT NULL,
status VARCHAR(30) NOT NULL CHECK (status IN ('active', 'experimental', 'deprecated', 'archived')),
supported_modes JSONB NOT NULL,
supported_test_families JSONB NOT NULL,
default_config JSONB,
mode_docs_url TEXT,
created_at TIMESTAMP NOT NULL DEFAULT NOW(),
updated_at TIMESTAMP NOT NULL DEFAULT NOW()
);
Example row
{
"engine_id": "MEID_STAT01_v1",
"readable_name": "Statistical Inference Engine",
"version": "1.0.0",
"status": "experimental",
"supported_modes": ["plausibility", "comparative", "drift", "impact"],
"supported_test_families": ["welch_t", "mann_whitney_u", "chi_square", "bootstrap_ci", "robust_zscore"]
}
C.3.2. stat_hypothesis_templates
Purpose: standardized test logic by metric family / signal type.
CREATE TABLE stat_hypothesis_templates (
template_id VARCHAR(60) PRIMARY KEY,
template_name VARCHAR(255) NOT NULL,
signal_type VARCHAR(100) NOT NULL,
metric_family VARCHAR(100),
default_test_family VARCHAR(80) NOT NULL,
fallback_test_family VARCHAR(80),
null_hypothesis_text TEXT NOT NULL,
alternative_hypothesis_text TEXT NOT NULL,
assumptions JSONB,
default_alpha NUMERIC(6,5) NOT NULL DEFAULT 0.05,
default_effect_size_floor NUMERIC(8,4),
bayesian_supported BOOLEAN NOT NULL DEFAULT FALSE,
bootstrap_supported BOOLEAN NOT NULL DEFAULT TRUE,
effect_size_required BOOLEAN NOT NULL DEFAULT TRUE,
explainability_template_id VARCHAR(60),
verifier_template_id VARCHAR(60),
status VARCHAR(30) NOT NULL CHECK (status IN ('active', 'deprecated', 'draft', 'archived')),
version VARCHAR(20) NOT NULL,
created_at TIMESTAMP NOT NULL DEFAULT NOW(),
updated_at TIMESTAMP NOT NULL DEFAULT NOW()
);
C.3.3. stat_significance_profiles
Purpose: policy-based thresholds rather than hardcoded alpha values.
CREATE TABLE stat_significance_profiles (
profile_id VARCHAR(60) PRIMARY KEY,
profile_name VARCHAR(255) NOT NULL,
alpha_default NUMERIC(6,5) NOT NULL,
minimum_sample_size INTEGER,
effect_size_floor NUMERIC(8,4),
bayesian_probability_threshold NUMERIC(6,5),
multiple_testing_policy VARCHAR(50),
confidence_interval_level NUMERIC(6,5) DEFAULT 0.95,
assumption_failure_policy VARCHAR(50) NOT NULL DEFAULT 'fallback',
inconclusive_policy VARCHAR(50) NOT NULL DEFAULT 'no_penalty',
verifier_review_required BOOLEAN NOT NULL DEFAULT FALSE,
human_approval_required BOOLEAN NOT NULL DEFAULT FALSE,
high_risk_override JSONB,
status VARCHAR(30) NOT NULL CHECK (status IN ('active', 'deprecated', 'draft', 'archived')),
created_at TIMESTAMP NOT NULL DEFAULT NOW(),
updated_at TIMESTAMP NOT NULL DEFAULT NOW()
);
C.3.4. stat_benchmark_profiles
Purpose: define peer groups and benchmark construction logic.
CREATE TABLE stat_benchmark_profiles (
benchmark_profile_id VARCHAR(60) PRIMARY KEY,
benchmark_name VARCHAR(255) NOT NULL,
scope_type VARCHAR(50) NOT NULL CHECK (scope_type IN ('sector', 'geography', 'size_band', 'client_portfolio', 'custom')),
nace_codes JSONB,
geographies JSONB,
size_bands JSONB,
reporting_frameworks JSONB,
signal_filters JSONB,
inclusion_rules JSONB,
exclusion_rules JSONB,
minimum_peer_count INTEGER NOT NULL DEFAULT 20,
freshness_days INTEGER,
confidentiality_policy VARCHAR(50) NOT NULL DEFAULT 'aggregate_only',
status VARCHAR(30) NOT NULL CHECK (status IN ('active', 'draft', 'deprecated', 'archived')),
created_at TIMESTAMP NOT NULL DEFAULT NOW(),
updated_at TIMESTAMP NOT NULL DEFAULT NOW()
);
C.3.5. stat_signal_policy_map
Purpose: map individual signals to statistical policies.
CREATE TABLE stat_signal_policy_map (
signal_id VARCHAR(120) PRIMARY KEY,
stat_test_eligible BOOLEAN NOT NULL DEFAULT FALSE,
preferred_mode VARCHAR(50),
preferred_test_family VARCHAR(80),
fallback_test_family VARCHAR(80),
hypothesis_template_id VARCHAR(60),
significance_profile_id VARCHAR(60),
benchmark_profile_id VARCHAR(60),
expected_distribution_type VARCHAR(50),
minimum_sample_size INTEGER,
requires_effect_size BOOLEAN NOT NULL DEFAULT TRUE,
multiple_testing_group VARCHAR(100),
escalation_policy_id VARCHAR(60),
explainability_template_id VARCHAR(60),
verifier_packet_required BOOLEAN NOT NULL DEFAULT FALSE,
created_at TIMESTAMP NOT NULL DEFAULT NOW(),
updated_at TIMESTAMP NOT NULL DEFAULT NOW(),
CONSTRAINT fk_stat_hypothesis_template
FOREIGN KEY (hypothesis_template_id) REFERENCES stat_hypothesis_templates(template_id),
CONSTRAINT fk_stat_significance_profile
FOREIGN KEY (significance_profile_id) REFERENCES stat_significance_profiles(profile_id),
CONSTRAINT fk_stat_benchmark_profile
FOREIGN KEY (benchmark_profile_id) REFERENCES stat_benchmark_profiles(benchmark_profile_id)
);
C.3.6. stat_run_log
Purpose: immutable log of each statistical execution.
CREATE TABLE stat_run_log (
run_id VARCHAR(80) PRIMARY KEY,
engine_id VARCHAR(50) NOT NULL,
engine_version VARCHAR(30) NOT NULL,
signal_id VARCHAR(120) NOT NULL,
entity_id VARCHAR(80),
reporting_period VARCHAR(40),
mode VARCHAR(50) NOT NULL,
initiated_by VARCHAR(80) NOT NULL,
dataset_ref TEXT,
benchmark_profile_id VARCHAR(60),
hypothesis_template_id VARCHAR(60),
significance_profile_id VARCHAR(60),
test_family_requested VARCHAR(80),
test_family_used VARCHAR(80),
fallback_used BOOLEAN NOT NULL DEFAULT FALSE,
null_hypothesis_text TEXT,
alternative_hypothesis_text TEXT,
sample_metadata JSONB,
assumptions_metadata JSONB,
results_payload JSONB NOT NULL,
decision_class VARCHAR(50) NOT NULL,
trust_delta NUMERIC(8,4),
escalation_triggered BOOLEAN NOT NULL DEFAULT FALSE,
escalation_reason TEXT,
human_review_required BOOLEAN NOT NULL DEFAULT FALSE,
human_review_status VARCHAR(40),
altd_logged BOOLEAN NOT NULL DEFAULT FALSE,
created_at TIMESTAMP NOT NULL DEFAULT NOW()
);
C.3.7. stat_explainability_templates
Purpose: human-readable output templates for ZARA / ZAAM / verifiers.
CREATE TABLE stat_explainability_templates (
template_id VARCHAR(60) PRIMARY KEY,
audience_type VARCHAR(40) NOT NULL CHECK (audience_type IN ('user', 'verifier', 'board', 'internal_ops', 'agent')),
language_code VARCHAR(10) NOT NULL DEFAULT 'en',
template_text TEXT NOT NULL,
severity_mapping JSONB,
variable_schema JSONB,
status VARCHAR(30) NOT NULL CHECK (status IN ('active', 'draft', 'deprecated', 'archived')),
version VARCHAR(20) NOT NULL,
created_at TIMESTAMP NOT NULL DEFAULT NOW(),
updated_at TIMESTAMP NOT NULL DEFAULT NOW()
);
C.3.8. stat_test_catalog
Purpose: controlled list of allowed methods.
CREATE TABLE stat_test_catalog (
test_family_id VARCHAR(80) PRIMARY KEY,
readable_name VARCHAR(255) NOT NULL,
class_type VARCHAR(50) NOT NULL,
supports_small_samples BOOLEAN NOT NULL DEFAULT FALSE,
supports_non_normal BOOLEAN NOT NULL DEFAULT FALSE,
supports_missingness_robustness BOOLEAN NOT NULL DEFAULT FALSE,
supports_effect_size BOOLEAN NOT NULL DEFAULT TRUE,
supports_bootstrap BOOLEAN NOT NULL DEFAULT FALSE,
supports_bayesian BOOLEAN NOT NULL DEFAULT FALSE,
default_for_modes JSONB,
status VARCHAR(30) NOT NULL CHECK (status IN ('active', 'draft', 'deprecated', 'archived'))
);
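The catalog above is what makes routing deterministic: the engine should pick a test family by filtering on the capability flags rather than hard-coding method choices. A minimal sketch of that selection logic, with rows modeled as plain dicts mirroring stat_test_catalog columns (the helper name, the n < 20 small-sample cutoff, and the sample rows are illustrative assumptions, not part of the spec):

```python
# Catalog-driven test selection. Rows mirror stat_test_catalog fields;
# choose_test_family and the cutoff of 20 are illustrative only.
def choose_test_family(catalog_rows, n, distribution):
    """Return the first active test family compatible with the sample."""
    for row in catalog_rows:
        if row["status"] != "active":
            continue
        if n < 20 and not row["supports_small_samples"]:
            continue
        if distribution != "normal" and not row["supports_non_normal"]:
            continue
        return row["test_family_id"]
    return None  # caller should fall back to descriptive-only mode

catalog = [
    {"test_family_id": "welch_t", "status": "active",
     "supports_small_samples": False, "supports_non_normal": False},
    {"test_family_id": "bootstrap_ci", "status": "active",
     "supports_small_samples": True, "supports_non_normal": True},
]
```

With this shape, a small right-skewed sample routes to bootstrap_ci while a large normal sample routes to welch_t, matching the fallback behavior described elsewhere in this appendix.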
C.4. JSON schema pack
Below is a practical API contract design.
C.4.1. Run request schema
{
"type": "object",
"required": ["engine_id", "mode", "signal_id", "initiated_by"],
"properties": {
"engine_id": { "type": "string", "enum": ["MEID_STAT01_v1"] },
"mode": {
"type": "string",
"enum": ["plausibility", "comparative", "drift", "impact", "assurance"]
},
"signal_id": { "type": "string" },
"entity_id": { "type": "string" },
"reporting_period": { "type": "string" },
"initiated_by": { "type": "string" },
"dataset_ref": { "type": "string" },
"benchmark_profile_id": { "type": "string" },
"hypothesis_template_id": { "type": "string" },
"significance_profile_id": { "type": "string" },
"test_family_requested": { "type": "string" },
"context": {
"type": "object",
"properties": {
"source_mix": { "type": "array", "items": { "type": "string" } },
"input_trust_score": { "type": "number", "minimum": 0, "maximum": 1 },
"estimation_flag": { "type": "boolean" },
"peer_group_override": { "type": "object" },
"sample_metadata": { "type": "object" }
}
}
}
}
C.4.2. Run response schema
{
"type": "object",
"required": [
"run_id",
"engine_id",
"signal_id",
"mode",
"decision_class",
"results",
"impact",
"audit"
],
"properties": {
"run_id": { "type": "string" },
"engine_id": { "type": "string" },
"engine_version": { "type": "string" },
"signal_id": { "type": "string" },
"entity_id": { "type": "string" },
"mode": { "type": "string" },
"test_family_used": { "type": "string" },
"fallback_used": { "type": "boolean" },
"decision_class": {
"type": "string",
"enum": [
"supports_null",
"rejects_null",
"inconclusive",
"insufficient_sample",
"assumption_failure",
"fallback_applied"
]
},
"results": {
"type": "object",
"properties": {
"p_value": { "type": ["number", "null"] },
"effect_size": { "type": ["number", "null"] },
"confidence_interval": {
"type": ["array", "null"],
"items": { "type": "number" },
"minItems": 2,
"maxItems": 2
},
"posterior_exceedance_probability": { "type": ["number", "null"] },
"test_statistic": { "type": ["number", "null"] },
"assumption_fit": { "type": "string" }
}
},
"impact": {
"type": "object",
"properties": {
"trust_delta": { "type": ["number", "null"] },
"risk_flag": { "type": "string" },
"escalation_triggered": { "type": "boolean" },
"recommended_action": { "type": "string" }
}
},
"explainability": {
"type": "object",
"properties": {
"user_message": { "type": "string" },
"verifier_message": { "type": "string" },
"board_message": { "type": "string" }
}
},
"audit": {
"type": "object",
"properties": {
"logged_to_altd": { "type": "boolean" },
"timestamp_utc": { "type": "string" },
"human_review_required": { "type": "boolean" }
}
}
}
}
C.4.3. Batch request schema
{
"type": "object",
"required": ["engine_id", "mode", "initiated_by", "items"],
"properties": {
"engine_id": { "type": "string" },
"mode": { "type": "string" },
"initiated_by": { "type": "string" },
"items": {
"type": "array",
"items": {
"type": "object",
"required": ["signal_id"],
"properties": {
"signal_id": { "type": "string" },
"entity_id": { "type": "string" },
"reporting_period": { "type": "string" },
"dataset_ref": { "type": "string" }
}
}
}
}
}
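In production these contracts would be enforced by a JSON Schema validator; as a sketch, the required-field and mode-enum rules of the C.4.1 run request can be checked with a few lines of plain Python (the function name and error messages are illustrative):

```python
# Minimal structural check mirroring the C.4.1 run-request schema.
# Only the required fields and the mode enum are enforced here; a real
# implementation would validate the full schema.
REQUIRED_RUN_FIELDS = ("engine_id", "mode", "signal_id", "initiated_by")
VALID_MODES = {"plausibility", "comparative", "drift", "impact", "assurance"}

def validate_run_request(payload):
    missing = [f for f in REQUIRED_RUN_FIELDS if f not in payload]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    if payload["mode"] not in VALID_MODES:
        raise ValueError(f"unknown mode: {payload['mode']}")
    return True
```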
C.5. SSSR field additions
Because SSSR is the correct place for signal-level intelligence, routing metadata, and structured lookup behavior in ZAYAZ, the statistical layer should be attached there rather than scattered across engine configs. 
C.5.1. New SSSR fields for statistical readiness
Add these fields to the signal metadata layer:
{
"stat_test_eligible": true,
"stat_priority_level": "high",
"stat_metric_family": "climate_energy",
"preferred_stat_mode": "plausibility",
"recommended_test_families": ["welch_t", "bootstrap_ci", "robust_zscore"],
"expected_distribution_type": "right_skewed",
"minimum_sample_policy": {
"preferred_min_n": 20,
"absolute_min_n": 8
},
"benchmark_profile_id": "BENCH_NACE_C25_EU",
"hypothesis_template_id": "HT_SCOPE2_PLAUS_001",
"significance_profile_id": "SIGPROF_SCOPE2_STANDARD",
"effect_size_required": true,
"multiple_testing_group": "esrs_e1_energy",
"stat_explainability_template_id": "STAT_USER_GENERIC_001",
"verifier_packet_required": true,
"stat_retest_cooldown_days": 30,
"stat_escalation_policy_id": "ESC_STAT_HIGH_SCOPE2"
}
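To show how this metadata drives routing, here is a sketch of an eligibility gate that reads stat_test_eligible and minimum_sample_policy before any test runs. The decision labels ("run", "fallback", "skip") are illustrative, not part of the SSSR contract:

```python
# Eligibility gate driven by the SSSR statistical-readiness metadata above.
# "fallback" means route to the fallback test family (e.g. bootstrap_ci);
# "skip" corresponds to the insufficient_sample outcome.
def stat_gate(meta, n):
    """Decide whether a signal should be tested given sample size n."""
    if not meta.get("stat_test_eligible"):
        return "skip"
    policy = meta["minimum_sample_policy"]
    if n < policy["absolute_min_n"]:
        return "skip"
    if n < policy["preferred_min_n"]:
        return "fallback"
    return "run"

meta = {"stat_test_eligible": True,
        "minimum_sample_policy": {"preferred_min_n": 20, "absolute_min_n": 8}}
```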
C.5.2. Strong recommendation
Do not add a single boolean such as supports_statistics; that is too weak. Use a structured object so that:
- routing stays deterministic
- governance stays inspectable
- benchmark strategies stay versioned
C.6. VTE integration logic
C.6.1. Principle
STAT should contribute a bounded trust evidence component into VTE, not replace provenance, document validation, or verifier approval.
C.6.2. Proposed VTE composition
Final_Trust_Score =
0.30 * provenance_component
+ 0.20 * structural_validation_component
+ 0.15 * verifier_component
+ 0.15 * historical_consistency_component
+ 0.10 * statistical_trust_component
+ 0.10 * ai_origin_adjustment_component
This is only a starting balance. For ZAYAZ, keep the STAT contribution at 10% to 15% at most during early phases.
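The composition above is a plain weighted sum and can be sketched directly (weights from C.6.2; the dict keys and function name are illustrative):

```python
# Final_Trust_Score as defined in C.6.2: a weighted sum of six components,
# each expected on a 0.0-1.0 scale. Weights sum to 1.00.
VTE_WEIGHTS = {
    "provenance": 0.30,
    "structural_validation": 0.20,
    "verifier": 0.15,
    "historical_consistency": 0.15,
    "statistical_trust": 0.10,
    "ai_origin_adjustment": 0.10,
}

def final_trust_score(components):
    return round(sum(VTE_WEIGHTS[k] * components[k] for k in VTE_WEIGHTS), 4)
```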
C.6.3. Statistical trust component formula
STC =
0.20 * decision_strength
+ 0.15 * effect_size_quality
+ 0.15 * sample_adequacy
+ 0.10 * assumption_quality
+ 0.10 * completeness_quality
+ 0.10 * peer_group_fit
+ 0.10 * source_integrity_interaction
+ 0.10 * anomaly_history_modifier
Normalize to 0.00–1.00.
C.6.4. Decision strength mapping
{
"supports_null_strong": 0.95,
"supports_null_moderate": 0.82,
"inconclusive": 0.58,
"rejects_null_moderate": 0.35,
"rejects_null_strong": 0.15,
"insufficient_sample": 0.50,
"assumption_failure": 0.52
}
C.6.5. Trust delta rule
STAT should output both:
- statistical_trust_component
- suggested_trust_delta
Suggested rule:
suggested_trust_delta = (STC - 0.70) * 0.20
Then cap:
- minimum delta: -0.15
- maximum delta: +0.08
This prevents statistics from overpowering the total trust score.
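The delta rule and its caps reduce to a few lines of arithmetic, sketched here exactly as stated above (only the function name is an assumption):

```python
# C.6.5 trust delta rule: scale the distance from the 0.70 neutral point
# by 0.20, then clamp to the [-0.15, +0.08] band.
def suggested_trust_delta(stc):
    delta = (stc - 0.70) * 0.20
    return round(max(-0.15, min(0.08, delta)), 4)
```

Note that with these constants the raw delta already lies in [-0.14, +0.06], so the caps act as a defensive bound rather than a routinely binding limit.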
C.6.6. Escalation thresholds
Example:
- trust_delta <= -0.10 and effect_size >= 0.6 → verifier review
- repeated anomaly 3 periods in a row → high-risk escalation
- inconclusive + verified source → no penalty
- assumption failure + missing benchmark → route to descriptive-only mode
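The example thresholds above can be expressed as an ordered rule table. The field names (anomaly_streak, source_verified) and outcome labels are illustrative assumptions about the run record, not defined in the schema:

```python
# Ordered evaluation of the C.6.6 escalation examples. First matching rule
# wins; field names and labels are illustrative.
def escalation_outcome(run):
    if (run["trust_delta"] is not None
            and run["trust_delta"] <= -0.10
            and (run.get("effect_size") or 0) >= 0.6):
        return "verifier_review"
    if run.get("anomaly_streak", 0) >= 3:
        return "high_risk_escalation"
    if run["decision_class"] == "inconclusive" and run.get("source_verified"):
        return "no_penalty"
    if (run["decision_class"] == "assumption_failure"
            and not run.get("benchmark_profile_id")):
        return "descriptive_only"
    return "none"
```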
C.6.7. Pseudocode
def compute_statistical_trust_component(run):
    decision_strength = map_decision_strength(run.decision_class, run.results)
    effect_size_quality = map_effect_size(run.results.get("effect_size"))
    sample_adequacy = map_sample_quality(run.sample_metadata)
    assumption_quality = map_assumption_fit(run.results.get("assumption_fit"))
    completeness_quality = map_completeness(run.sample_metadata)
    peer_group_fit = map_peer_group_fit(run.sample_metadata)
    source_integrity_interaction = map_source_integrity(run.context)
    anomaly_history_modifier = map_history(run.entity_id, run.signal_id)
    stc = (
        0.20 * decision_strength +
        0.15 * effect_size_quality +
        0.15 * sample_adequacy +
        0.10 * assumption_quality +
        0.10 * completeness_quality +
        0.10 * peer_group_fit +
        0.10 * source_integrity_interaction +
        0.10 * anomaly_history_modifier
    )
    return round(min(max(stc, 0.0), 1.0), 4)
C.7. Example implementation logic for 5 ESRS-relevant metric families
These are not legal ESRS interpretations. They are implementation archetypes for statistical support inside ZAYAZ.
C.7.1. Family A: Scope 2 electricity / energy emissions
Typical signals
- electricity consumption
- location-based Scope 2
- market-based Scope 2
- energy intensity ratio
Best modes
- plausibility
- comparative
- drift
Preferred tests
- Welch t-test
- robust z-score
- bootstrap confidence interval
Hypothesis example
- H0: entity value is consistent with sector/geography peer baseline
- H1: entity value differs materially from peer baseline
Signal policy example
{
"signal_id": "ghg_scope2_market_based",
"preferred_stat_mode": "plausibility",
"recommended_test_families": ["welch_t", "bootstrap_ci", "robust_zscore"],
"expected_distribution_type": "right_skewed",
"minimum_sample_policy": {
"preferred_min_n": 20,
"absolute_min_n": 8
},
"effect_size_required": true,
"verifier_packet_required": true
}
Notes
This is one of the strongest early candidates because it is recurring, numeric, and highly comparable.
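For reference, the Welch t-statistic preferred for this family reduces to a short stdlib-only computation. This sketch returns the statistic and Welch-Satterthwaite degrees of freedom; in practice a stats library (e.g. scipy.stats.ttest_ind with equal_var=False) would also supply the p-value:

```python
import math
from statistics import mean, variance

# Welch t-statistic with Welch-Satterthwaite degrees of freedom.
# Pure stdlib; p-value computation is left to a stats library.
def welch_t(a, b):
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)   # sample variances (n-1 denominator)
    se2 = va / na + vb / nb
    t = (mean(a) - mean(b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df
```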
C.7.2. Family B: Scope 3 business travel / upstream transport
Typical signals
- flight emissions
- travel activity values
- freight emissions
- transport intensity
Best modes
- plausibility
- comparative
- impact
Preferred tests
- Mann–Whitney U
- bootstrap CI
- changepoint detection for trends
Hypothesis example
- H0: travel-related emission intensity is unchanged from prior operating profile
- H1: a meaningful shift occurred
Special caution
These metrics can be structurally volatile. Therefore:
- stronger effect-size thresholds
- more tolerant anomaly penalties
- more emphasis on trend context than one-off outliers
C.7.3. Family C: Water withdrawal / discharge
Typical signals
- total water withdrawn
- recycled water share
- water intensity per production unit
- discharge volume
Best modes
- plausibility
- comparative
- impact
Preferred tests
- Welch t-test
- paired test for pre/post interventions
- bootstrap CI
Hypothesis example
- H0: water intensity after intervention is unchanged
- H1: water intensity decreased meaningfully after intervention
High-value use
Very good for demonstrating measurable change after capex, policy, or operational changes.
C.7.4. Family D: Waste and circularity
Typical signals
- hazardous waste
- non-hazardous waste
- diverted from disposal
- recycled fraction
- circular material use ratios
Best modes
- plausibility
- comparative
- drift
Preferred tests
- Mann–Whitney U
- chi-square for disposal category proportions
- bootstrap CI
Hypothesis example
- H0: waste diversion pattern is consistent with prior validated pattern
- H1: waste diversion pattern differs materially
Notes
This family is often skewed and operationally messy. Robust and non-parametric methods should dominate.
C.7.5. Family E: Workforce safety / social ratios
Typical signals
- injury rate
- lost-time incident rate
- turnover
- diversity proportions
- training completion ratios
Best modes
- comparative
- drift
- impact
Preferred tests
- z-test for proportions
- Fisher exact test
- chi-square
- change-point or rolling drift methods
Hypothesis example
- H0: injury rate proportion is consistent with prior baseline
- H1: injury rate changed materially
Notes
For social metrics, category and rate tests matter more than continuous-value comparisons.
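As an illustration of the first preferred test for this family, a two-proportion z-test can be built entirely from the stdlib (the normal CDF comes from math.erf). The function name and example counts are illustrative:

```python
import math

# Two-sided two-proportion z-test using the pooled standard error.
def two_proportion_ztest(x1, n1, x2, n2):
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value
```

For example, 10 incidents in 100 exposures versus 20 in 100 yields z of about -1.98 and a p-value just under 0.05, i.e. a borderline-material shift under a 5% alpha.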
C.8. Example seeded records
C.8.1. Example significance profile
{
"profile_id": "SIGPROF_SCOPE2_STANDARD",
"profile_name": "Scope 2 Standard Statistical Review",
"alpha_default": 0.01,
"minimum_sample_size": 12,
"effect_size_floor": 0.40,
"bayesian_probability_threshold": 0.95,
"multiple_testing_policy": "benjamini_hochberg",
"confidence_interval_level": 0.95,
"assumption_failure_policy": "fallback",
"inconclusive_policy": "no_penalty",
"verifier_review_required": false,
"human_approval_required": false
}
C.8.2. Example hypothesis template
{
"template_id": "HT_SCOPE2_PLAUS_001",
"template_name": "Scope 2 Peer Plausibility Check",
"signal_type": "ghg_emission",
"metric_family": "climate_energy",
"default_test_family": "welch_t",
"fallback_test_family": "bootstrap_ci",
"null_hypothesis_text": "The reported Scope 2 value is consistent with the expected peer baseline for comparable entities.",
"alternative_hypothesis_text": "The reported Scope 2 value differs materially from the expected peer baseline for comparable entities.",
"default_alpha": 0.01,
"default_effect_size_floor": 0.40,
"bayesian_supported": true,
"bootstrap_supported": true,
"effect_size_required": true
}
C.9. API endpoint draft
POST /api/mice/stat/run
Example request
{
"engine_id": "MEID_STAT01_v1",
"mode": "plausibility",
"signal_id": "ghg_scope2_market_based",
"entity_id": "eco196123456789",
"reporting_period": "2025",
"initiated_by": "dice_auto_rule",
"dataset_ref": "zar://dataset/scope2/2025/entity/eco196123456789",
"benchmark_profile_id": "BENCH_NACE_C25_EU",
"hypothesis_template_id": "HT_SCOPE2_PLAUS_001",
"significance_profile_id": "SIGPROF_SCOPE2_STANDARD",
"context": {
"source_mix": ["erp", "invoice"],
"input_trust_score": 0.86,
"estimation_flag": false
}
}
Example response
{
"run_id": "statrun-000001",
"engine_id": "MEID_STAT01_v1",
"engine_version": "1.0.0",
"signal_id": "ghg_scope2_market_based",
"entity_id": "eco196123456789",
"mode": "plausibility",
"test_family_used": "welch_t",
"fallback_used": false,
"decision_class": "rejects_null",
"results": {
"p_value": 0.0041,
"effect_size": 0.68,
"confidence_interval": [0.21, 0.49],
"posterior_exceedance_probability": 0.973,
"test_statistic": 2.91,
"assumption_fit": "moderate"
},
"impact": {
"trust_delta": -0.11,
"risk_flag": "high",
"escalation_triggered": true,
"recommended_action": "verifier_review"
},
"explainability": {
"user_message": "This value appears statistically unusual compared with similar entities in the selected benchmark.",
"verifier_message": "Observed Scope 2 value materially exceeds peer baseline under the active profile.",
"board_message": "A statistically significant deviation has been detected and should be reviewed before final disclosure."
},
"audit": {
"logged_to_altd": true,
"timestamp_utc": "2026-04-04T09:12:14Z",
"human_review_required": false
}
}
C.10. Governance controls
Because ZAYAZ already has a formal AI governance charter, validation SOP, retraining log model, and risk register concept, the STAT engine should be onboarded through that same discipline rather than introduced as an informal utility. 
Required controls for launch
- register MEID_STAT01_v1 in engine registry
- assign risk level
- define validation frequency
- define fallback and failure policies
- require ALTD logging for material runs
- define human-review thresholds
- define statistical method approval list
- prohibit silent threshold changes
Recommended initial risk classification
- Medium for passive advisory/statistical evidence
- High if directly driving trust score changes for compliance-critical disclosures
- High if used in automated verifier escalation or AI self-healing actions
C.11. Recommended rollout sequence
Phase 0
Schema-only
- create tables
- seed 3 test families
- seed 2 significance profiles
- add SSSR metadata fields
- no user-facing outputs yet
Phase 1
Passive evidence mode
- run STAT after DICE for selected climate/energy signals
- write outputs to ALTD
- do not alter visible trust score yet
- expose only to internal ops and verifier sandbox
Phase 2
Bounded VTE integration
- allow limited trust delta
- enable ZARA explanations
- enable internal dashboard flags
Phase 3
Verifier-facing support
- assurance packets
- review queues
- batch scans before report export
Phase 4
Advanced modes
- Bayesian support
- intervention-effect mode
- drift support for AI governance and telemetry
C.12. Recommended first seeded signal set
Begin with 12 to 20 signals at most.
Best first set:
- Scope 2 market-based emissions
- Scope 2 location-based emissions
- electricity consumption
- fuel consumption
- energy intensity
- water withdrawal
- water intensity
- hazardous waste
- non-hazardous waste
- waste diversion rate
- LTIR or equivalent injury rate
- employee turnover ratio
This is enough to validate the architecture without creating test sprawl.
C.13. Final architecture recommendation
The cleanest long-term pattern is this:
- SSSR owns eligibility and mapping
- STAT owns inference
- VTE owns trust interpretation
- ZARA/ZAAM own explanation
- ALTD owns evidence trail
- AI governance owns approval boundaries
That keeps ZAYAZ modular, future-proof, and defensible under audit and regulatory scrutiny. It also fits the platform’s existing decomposition into registries, agents, trust layers, micro-engines, and governed workflows.