ZDRS
ZAYAZ Data Registry System
1. Purpose
The ZAYAZ Data Registry System (ZDRS) defines how structured ESG data flows from human-authored sources into validated, versioned, production-grade datasets.
It integrates four core subsystems:
- Registry Pipeline (data transformation)
- ZARA Validation Engine (data quality enforcement)
- Dataset Versioning System (immutability & traceability)
- Promotion System (environment governance)
- Excel is the editable source
- Schema is the contract
- Dataset is the machine truth
- ZARA is the enforcement layer
- Promotion is the governance layer
2. Storage Architecture (S3)
/dev/
/excel/
/datasets/
/schemas/
/exports/
/models/
/staging/
/datasets/
/schemas/
/exports/
/models/
/prod/
/datasets/
/schemas/
/exports/
/models/
Rules
- Excel is ONLY allowed in /dev
- /prod contains only normalized, validated data
- All runtime systems read from /prod
3. Registry Object Model
3.1 Canonical ID
Each registry is identified by a stable canonical registry_id, for example:
compute_method_registry
emission_factor_ipcc
altd_event
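The IDs above follow lower_snake_case. A minimal guard for that convention can be sketched as follows; the exact pattern is an assumption, since ZDRS does not define one:

```typescript
// Assumed convention: lower_snake_case, starting with a letter.
const REGISTRY_ID_PATTERN = /^[a-z][a-z0-9]*(_[a-z0-9]+)*$/;

export function isValidRegistryId(id: string): boolean {
  return REGISTRY_ID_PATTERN.test(id);
}
```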
3.2 Dataset (canonical runtime object)
{
"registry_id": "compute_method_registry",
"version": "1.0.0",
"records": [],
"lineage": {}
}
4. Registry Pipeline v1
4.1 Overview
Excel (/dev)
→ Schema validation (ZARA)
→ Normalization
→ Dataset generation
→ Versioning
→ Promotion
→ Production datasets (/prod)
4.2 Pipeline Stages
Stage 1 — Ingestion
Input:
/dev/excel/<registry_id>.xlsx
Stage 2 — ZARA Validation
ZARA validates:
- required fields
- data types
- enums
- relationships
- duplicates
Output:
validation_report.json
Stage 3 — Normalization
Transforms Excel into:
/dev/datasets/<registry_id>.json
Features:
- clean keys
- consistent types
- null removal
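These three steps can be sketched as a single row-normalization helper. This mirrors the normalizeHeader/normalizeCell helpers in Appendix B but is a simplified illustration, not the production code:

```typescript
// Stage 3 sketch: clean keys, coerce types, drop null/empty cells.
export function normalizeRow(
  raw: Record<string, unknown>,
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(raw)) {
    // clean keys: trim, collapse non-word runs to "_", lowercase
    const cleanKey = key
      .trim()
      .replace(/[^\w]+/g, "_")
      .replace(/^_+|_+$/g, "")
      .toLowerCase();
    if (!cleanKey) continue;
    // null removal: skip null/undefined and blank strings
    if (value === null || value === undefined) continue;
    if (typeof value === "string") {
      const trimmed = value.trim();
      if (trimmed === "") continue;
      // consistent types: numeric-looking strings become numbers
      const num = Number(trimmed);
      out[cleanKey] = Number.isNaN(num) ? trimmed : num;
    } else {
      out[cleanKey] = value;
    }
  }
  return out;
}
```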
Stage 4 — Metadata Injection
Adds:
- lineage
- timestamps
- environment
Stage 5 — Versioning
Creates immutable dataset:
/dev/datasets/<registry_id>.v1.json
Stage 6 — Promotion
dev → staging → prod
5. ZARA Validation Engine v1
5.1 Role
ZARA is the governance and validation layer ensuring all datasets meet ZAYAZ quality standards.
5.2 Validation Types
Structural Validation
- schema compliance
- required fields
Semantic Validation
- valid enum values
- correct units
Relational Validation
- foreign key references
- cross-registry consistency
Integrity Validation
- duplicate IDs
- null violations
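As a concrete illustration, the duplicate-ID part of integrity validation can be sketched as:

```typescript
// Integrity check sketch: collect IDs that appear more than once.
export function findDuplicateIds(
  records: Record<string, unknown>[],
  idField: string,
): string[] {
  const seen = new Set<unknown>();
  const duplicates = new Set<string>();
  for (const record of records) {
    const id = record[idField];
    // null violations are a separate integrity check, so skip them here
    if (id === null || id === undefined) continue;
    if (seen.has(id)) duplicates.add(String(id));
    else seen.add(id);
  }
  return [...duplicates];
}
```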
5.3 Validation Output
{
"registry_id": "compute_method_registry",
"status": "pass",
"errors": [],
"warnings": []
}
5.4 Enforcement Rules
| Environment | Rule |
|---|---|
| dev | warnings allowed |
| staging | no errors |
| prod | strict pass only |
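One way to read this table is as a gate function. Interpreting "strict pass only" as zero errors and zero warnings is an assumption, since ZDRS does not define it precisely:

```typescript
type Env = "dev" | "staging" | "prod";

interface ValidationSummary {
  status: "pass" | "fail";
  errors: string[];
  warnings: string[];
}

// Per-environment enforcement gate (sketch).
export function passesGate(env: Env, report: ValidationSummary): boolean {
  switch (env) {
    case "dev":
      // warnings allowed; errors still block
      return report.errors.length === 0;
    case "staging":
      // no errors (warnings tolerated)
      return report.errors.length === 0;
    case "prod":
      // strict pass only (assumed: pass status, no errors, no warnings)
      return (
        report.status === "pass" &&
        report.errors.length === 0 &&
        report.warnings.length === 0
      );
  }
}
```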
6. Dataset Versioning System v1
6.1 Principles
- immutable datasets
- versioned artifacts
- reproducibility guaranteed
6.2 File Strategy
/dev/datasets/compute_method_registry.v1.0.0.json
/dev/datasets/compute_method_registry.v2.0.0.json
6.3 Latest Pointer
/dev/datasets/compute_method_registry.json
The unversioned file always holds a copy of the latest version.
6.4 Version Metadata
{
"version": "1.0.0",
"created_at": "2026-03-22",
"source_hash": "abc123",
"schema_version": "1.0.0"
}
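The file strategy and latest pointer above can be sketched as filename helpers, following the <registry_id>.v<version>.json pattern the Appendix B scripts emit (full semver, e.g. .v1.0.0.json):

```typescript
// Immutable versioned artifact name.
export function versionedFileName(registryId: string, version: string): string {
  return `${registryId}.v${version}.json`;
}

// Unversioned "latest" pointer name.
export function latestFileName(registryId: string): string {
  return `${registryId}.json`;
}

// Recover the version embedded in a filename, or null for the latest pointer.
export function versionFromFileName(fileName: string): string | null {
  const match = /\.v(\d+(?:\.\d+)*)\.json$/.exec(fileName);
  return match ? match[1] : null;
}
```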
7. Dataset Promotion System v1
7.1 Flow
dev → staging → prod
7.2 Promotion Rules
Dev → Staging
- validation passed
- dataset generated
Staging → Prod
- ZARA strict validation passed
- approved (manual or automated)
- version locked
7.3 Promotion Example
/dev/datasets/cmr.v1.json
→ /staging/datasets/cmr.v1.json
→ /prod/datasets/cmr.v1.json
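Because all environments share the same layout, promotion amounts to rewriting the environment prefix of each artifact path (the Appendix B script copies the definition, row schema, latest dataset, and versioned dataset this way). A minimal sketch:

```typescript
type Env = "dev" | "staging" | "prod";

// Rewrite an artifact path from one environment prefix to the next.
export function promotedPath(sourcePath: string, from: Env, to: Env): string {
  const prefix = `/${from}/`;
  if (!sourcePath.startsWith(prefix)) {
    throw new Error(`Path ${sourcePath} is not under /${from}/`);
  }
  return `/${to}/` + sourcePath.slice(prefix.length);
}
```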
7.4 Production Rules
- immutable datasets
- no overwrite
- no Excel
- audit-ready
8. Registry Catalog
The registry catalog is the discovery index across all registries (see Appendix A.4):
/dev/datasets/registry_catalog.json
9. Integration
ZARA
- validation engine
Computation Hub
- consumes /prod/datasets/
Reports Hub
- builds outputs from datasets
Search
- indexes datasets + schemas
10. Architectural Summary
ZDRS v1 establishes:
- a governed data pipeline
- enforced validation via ZARA
- immutable versioned datasets
- controlled promotion across environments
- secure delivery via assets layer
11. Key Decision
ZAYAZ does not run on Excel. ZAYAZ runs on validated, versioned datasets.
This enables:
- CSRD compliance
- ESRS traceability
- audit-grade data integrity
- scalable ESG intelligence
12. Future Extensions (v2)
- event-driven ingestion
- automated promotion workflows
- dataset signing / attestation
- verifier network integration
- real-time registry updates
13. Final Statement
ZDRS is the data backbone of ZAYAZ.
It transforms ESG data from static documents into a governed, computable, and scalable system.
APPENDIX A - Canonical Objects
A.1. Registry Definition Object
The registry definition describes:
- table identity
- primary keys
- integrity rules
- write/read policy
- storage/governance metadata
Example:
{
"registry_id": "sig_residency_region_policy",
"title": "Residency Region Policy",
"description": "Residency, routing, and disaster-recovery policy per AWS primary region",
"source_type": "excel",
"source_file_name": "sig_residency_region_policy.xlsx",
"row_schema_ref": "/schemas/row_schemas/sig_residency_region_policy.row.schema.json",
"dataset_ref": "/datasets/sig_residency_region_policy.json",
"owner_module": "GRPE",
"owners": ["cto@viroway.com"],
"primary_keys": ["rrp_id"],
"schema_version": 1,
"update_frequency": "rare_manual_update",
"integrity_rules": {
"min_rows": 1,
"unique": ["primary_region"],
"not_null": [
"rrp_id",
"primary_region",
"zone",
"s3_link",
"api",
"vpc_cid",
"secondary_region",
"dr_region_sr",
"dr_encryption",
"rr_allowed"
]
},
"read_policy": {
"mode": "snapshot",
"allowed_patterns": ["by_primary_region", "by_zone", "full_scan"]
},
"write_policy": {
"mode": "governed_manual",
"change_control": "architecture_review_required"
},
"promotion_rules": {
"allow_dev_to_staging": true,
"allow_staging_to_prod": true,
"requires_validation_pass": true,
"requires_manual_approval_for_prod": true
},
"aliases": ["rr_policy", "region_residency_policy"],
"db_schema": "core_signals",
"access_scope": "platform_shared",
"availability_status": "active",
"status": "stable",
"notes": "Authoritative mapping of AWS regions to residency zones, routing endpoints, and SR/RR disaster recovery targets. Tenant policies resolve against this registry but never modify it."
}
Stored in:
/dev/schemas/registry_definitions/<registry_id>.definition.json
/staging/schemas/registry_definitions/<registry_id>.definition.json
/prod/schemas/registry_definitions/<registry_id>.definition.json
A.2. Row Validation Schema
This describes the actual row shape of the dataset.
Example:
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"title": "sig_residency_region_policy row",
"type": "object",
"required": [
"rrp_id",
"primary_region",
"zone",
"secondary_region",
"dr_region_sr",
"dr_encryption",
"rr_allowed"
],
"properties": {
"rrp_id": { "type": "string" },
"primary_region": { "type": "string" },
"zone": { "type": "string" },
"secondary_region": { "type": "string" },
"dr_region_sr": { "type": "string" },
"dr_region_rr": { "type": ["string", "null"] },
"rr_allowed": { "type": "boolean" },
"dr_encryption": { "type": "string" }
},
"additionalProperties": false
}
Stored in:
/dev/schemas/row_schemas/<registry_id>.row.schema.json
/staging/schemas/row_schemas/<registry_id>.row.schema.json
/prod/schemas/row_schemas/<registry_id>.row.schema.json
A.3. Dataset Object
Normalized machine-truth payload.
Stores:
- registry metadata
- lineage
- actual records
Example:
{
"registry_id": "sig_residency_region_policy",
"version": "1.0.0",
"environment": "dev",
"lineage": {
"source_type": "excel",
"source_file_name": "sig_residency_region_policy.xlsx",
"sheet": "Sheet1",
"row_range": "2-145",
"ingested_at": "2026-03-22T00:00:00Z"
},
"records": [
{
"rrp_id": "RRP-001",
"primary_region": "eu-north-1",
"zone": "eu",
"secondary_region": "eu-west-1",
"dr_region_sr": "eu-west-3",
"dr_region_rr": null,
"rr_allowed": false,
"dr_encryption": "sse_kms_required"
}
]
}
Stored in:
/dev/datasets/<registry_id>.json
/dev/datasets/<registry_id>.v1.0.0.json
/staging/datasets/<registry_id>.json
/prod/datasets/<registry_id>.json
/prod/datasets/<registry_id>.v1.0.0.json
A.4. Registry Catalog
This is the discovery index.
It answers:
- what registries exist
- where source/schema/dataset files live
- current lifecycle status
- latest version
- whether promotion is allowed
Example:
{
"catalog_version": "1.0.0",
"environment": "dev",
"generated_at": "2026-03-22T14:00:00Z",
"registries": [
{
"registry_id": "sig_residency_region_policy",
"title": "Residency Region Policy",
"description": "Residency, routing, and disaster-recovery policy per AWS primary region",
"status": "validated",
"definition_ref": "/schemas/registry_definitions/sig_residency_region_policy.definition.json",
"row_schema_ref": "/schemas/row_schemas/sig_residency_region_policy.row.schema.json",
"dataset_ref": "/datasets/sig_residency_region_policy.json",
"latest_version": "1.0.0",
"record_count": 17,
"promotion": {
"eligible_for_staging": true,
"eligible_for_prod": false
},
"owners": ["cto@viroway.com"],
"tags": ["aws", "routing", "residency", "dr"]
}
]
}
Stored in:
/dev/datasets/registry_catalog.json
/staging/datasets/registry_catalog.json
/prod/datasets/registry_catalog.json
APPENDIX B - Implementation Scaffold
B.1 Scripts
Below is the scaffold: a shared helper module (scripts/registry/_shared.ts) followed by the four scripts (gen-registry.ts, validate-registry.ts, promote-registry.ts, build-registry-catalog.ts).
// scripts/registry/_shared.ts
import fs from "node:fs/promises";
import path from "node:path";
import XLSX from "xlsx";
export type EnvName = "dev" | "staging" | "prod";
export interface RegistryDefinition {
registry_id: string;
title: string;
description?: string;
source_type: "excel";
source_file_name: string;
row_schema_ref: string;
dataset_ref: string;
owner_module?: string;
owners?: string[];
primary_keys: string[];
schema_version?: number;
update_frequency?: string;
integrity_rules?: {
min_rows?: number;
unique?: string[];
not_null?: string[];
};
read_policy?: {
mode?: string;
allowed_patterns?: string[];
};
write_policy?: {
mode?: string;
change_control?: string;
};
promotion_rules?: {
allow_dev_to_staging?: boolean;
allow_staging_to_prod?: boolean;
requires_validation_pass?: boolean;
requires_manual_approval_for_prod?: boolean;
};
aliases?: string[];
db_schema?: string;
access_scope?: string;
availability_status?: string;
status?: string;
notes?: string;
}
export interface DatasetObject {
registry_id: string;
version: string;
environment: EnvName;
lineage: {
source_type: string;
source_file_name: string;
sheet: string;
row_range?: string;
ingested_at: string;
};
records: Record<string, unknown>[];
}
export interface ValidationReport {
registry_id: string;
environment: EnvName;
validated_at: string;
status: "pass" | "fail";
summary: {
record_count: number;
error_count: number;
warning_count: number;
};
errors: string[];
warnings: string[];
}
export interface RegistryCatalogEntry {
registry_id: string;
title: string;
description?: string;
status?: string;
definition_ref: string;
row_schema_ref: string;
dataset_ref: string;
latest_version?: string;
record_count?: number;
promotion?: {
eligible_for_staging: boolean;
eligible_for_prod: boolean;
};
owners?: string[];
tags?: string[];
}
export interface RegistryCatalog {
catalog_version: string;
environment: EnvName;
generated_at: string;
registries: RegistryCatalogEntry[];
}
export function repoRoot(): string {
return process.cwd();
}
export function s3Root(env: EnvName): string {
return path.join(repoRoot(), env);
}
export function envRoot(env: EnvName): string {
return path.join(repoRoot(), env);
}
export function excelPath(env: EnvName, registryId: string): string {
return path.join(envRoot(env), "excel", `${registryId}.xlsx`);
}
export function definitionPath(env: EnvName, registryId: string): string {
return path.join(
envRoot(env),
"schemas",
"registry_definitions",
`${registryId}.definition.json`,
);
}
export function rowSchemaPath(env: EnvName, registryId: string): string {
return path.join(
envRoot(env),
"schemas",
"row_schemas",
`${registryId}.row.schema.json`,
);
}
export function datasetPath(env: EnvName, registryId: string): string {
return path.join(envRoot(env), "datasets", `${registryId}.json`);
}
export function versionedDatasetPath(
env: EnvName,
registryId: string,
version: string,
): string {
return path.join(envRoot(env), "datasets", `${registryId}.v${version}.json`);
}
export function validationReportPath(env: EnvName, registryId: string): string {
return path.join(
envRoot(env),
"exports",
"validation_reports",
`${registryId}.validation.json`,
);
}
export function generationReportPath(env: EnvName, registryId: string): string {
return path.join(
envRoot(env),
"exports",
"generation_reports",
`${registryId}.generation.json`,
);
}
export function promotionReportPath(env: EnvName, registryId: string): string {
return path.join(
envRoot(env),
"exports",
"promotion_reports",
`${registryId}.promotion.json`,
);
}
export function registryCatalogPath(env: EnvName): string {
return path.join(envRoot(env), "datasets", "registry_catalog.json");
}
export async function ensureDir(filePath: string): Promise<void> {
await fs.mkdir(path.dirname(filePath), { recursive: true });
}
export async function readJsonFile<T>(filePath: string): Promise<T> {
const raw = await fs.readFile(filePath, "utf8");
return JSON.parse(raw) as T;
}
export async function writeJsonFile(
filePath: string,
value: unknown,
): Promise<void> {
await ensureDir(filePath);
await fs.writeFile(filePath, JSON.stringify(value, null, 2), "utf8");
}
export async function fileExists(filePath: string): Promise<boolean> {
try {
await fs.access(filePath);
return true;
} catch {
return false;
}
}
export function nowIso(): string {
return new Date().toISOString();
}
export function parseArgs(argv: string[]): Record<string, string | boolean> {
const out: Record<string, string | boolean> = {};
for (let i = 0; i < argv.length; i += 1) {
const arg = argv[i];
if (!arg.startsWith("--")) continue;
const key = arg.slice(2);
const next = argv[i + 1];
if (!next || next.startsWith("--")) {
out[key] = true;
continue;
}
out[key] = next;
i += 1;
}
return out;
}
export async function listRegistryIds(env: EnvName): Promise<string[]> {
const dir = path.join(envRoot(env), "schemas", "registry_definitions");
const exists = await fileExists(dir);
if (!exists) return [];
const entries = await fs.readdir(dir);
return entries
.filter((f) => f.endsWith(".definition.json"))
.map((f) => f.replace(/\.definition\.json$/, ""))
.sort();
}
export function normalizeHeader(value: unknown): string {
return String(value ?? "")
.trim()
.replace(/\s+/g, "_")
.replace(/[^\w]/g, "_")
.replace(/_+/g, "_")
.replace(/^_+|_+$/g, "")
.toLowerCase();
}
export function normalizeCell(value: unknown): unknown {
if (value === undefined || value === null) return null;
if (typeof value === "string") {
const trimmed = value.trim();
if (trimmed === "") return null;
if (trimmed.toLowerCase() === "true") return true;
if (trimmed.toLowerCase() === "false") return false;
if (!Number.isNaN(Number(trimmed)) && trimmed !== "") return Number(trimmed);
return trimmed;
}
return value;
}
export function readExcelRecords(
filePath: string,
): { sheetName: string; records: Record<string, unknown>[] } {
const wb = XLSX.readFile(filePath, { cellDates: false });
const firstSheetName = wb.SheetNames[0];
if (!firstSheetName) {
throw new Error(`Workbook has no sheets: ${filePath}`);
}
const sheet = wb.Sheets[firstSheetName];
if (!sheet) {
throw new Error(`Missing first sheet: ${filePath}`);
}
const rows = XLSX.utils.sheet_to_json<unknown[]>(sheet, {
header: 1,
raw: false,
});
if (rows.length === 0) {
return { sheetName: firstSheetName, records: [] };
}
const headers = (rows[0] ?? []).map(normalizeHeader);
const records: Record<string, unknown>[] = [];
for (let i = 1; i < rows.length; i += 1) {
const row = rows[i] ?? [];
const obj: Record<string, unknown> = {};
let hasValue = false;
for (let c = 0; c < headers.length; c += 1) {
const key = headers[c];
if (!key) continue;
const value = normalizeCell(row[c]);
if (value !== null) hasValue = true;
obj[key] = value;
}
if (hasValue) records.push(obj);
}
return { sheetName: firstSheetName, records };
}
export function assertEnv(value: string): EnvName {
if (value === "dev" || value === "staging" || value === "prod") return value;
throw new Error(`Invalid environment: ${value}`);
}
// scripts/registry/gen-registry.ts
import {
DatasetObject,
EnvName,
RegistryDefinition,
datasetPath,
definitionPath,
excelPath,
generationReportPath,
listRegistryIds,
nowIso,
parseArgs,
readExcelRecords,
readJsonFile,
versionedDatasetPath,
writeJsonFile,
} from "./_shared.js";
async function generateOne(registryId: string, env: EnvName): Promise<void> {
const definition = await readJsonFile<RegistryDefinition>(
definitionPath(env, registryId),
);
if (definition.source_type !== "excel") {
throw new Error(
`${registryId}: only source_type=excel is supported in v1`,
);
}
const sourcePath = excelPath(env, registryId);
const { sheetName, records } = readExcelRecords(sourcePath);
const version = "1.0.0";
const dataset: DatasetObject = {
registry_id: definition.registry_id,
version,
environment: env,
lineage: {
source_type: definition.source_type,
source_file_name: definition.source_file_name,
sheet: sheetName,
ingested_at: nowIso(),
},
records,
};
await writeJsonFile(datasetPath(env, registryId), dataset);
await writeJsonFile(versionedDatasetPath(env, registryId, version), dataset);
await writeJsonFile(generationReportPath(env, registryId), {
registry_id: registryId,
generated_at: nowIso(),
environment: env,
version,
record_count: records.length,
source_file_name: definition.source_file_name,
sheet: sheetName,
status: "success",
});
console.log(
`[gen:gen-registry] ${registryId}: wrote ${records.length} records to ${env}/datasets`,
);
}
async function main(): Promise<void> {
const args = parseArgs(process.argv.slice(2));
const env = (args.env as string | undefined) ?? "dev";
let registryIds: string[] = [];
if (args.all) {
registryIds = await listRegistryIds(env as EnvName);
} else if (typeof args.registry === "string") {
registryIds = [args.registry];
} else {
throw new Error("Provide --registry <id> or --all");
}
for (const registryId of registryIds) {
await generateOne(registryId, env as EnvName);
}
}
main().catch((error) => {
console.error("[gen:gen-registry] failed:", error);
process.exit(1);
});
// scripts/registry/validate-registry.ts
import { type ErrorObject } from "ajv";
import {
DatasetObject,
EnvName,
RegistryDefinition,
ValidationReport,
datasetPath,
definitionPath,
parseArgs,
readJsonFile,
rowSchemaPath,
validationReportPath,
writeJsonFile,
listRegistryIds,
nowIso,
} from "./_shared.js";
function buildCompositeKey(
record: Record<string, unknown>,
fields: string[],
): string {
return fields.map((f) => JSON.stringify(record[f] ?? null)).join("|");
}
async function validateOne(registryId: string, env: EnvName): Promise<void> {
const definition = await readJsonFile<RegistryDefinition>(
definitionPath(env, registryId),
);
const dataset = await readJsonFile<DatasetObject>(datasetPath(env, registryId));
const rowSchema = await readJsonFile<object>(rowSchemaPath(env, registryId));
const errors: string[] = [];
const warnings: string[] = [];
if (!dataset.registry_id) errors.push("Missing dataset.registry_id");
if (!dataset.version) errors.push("Missing dataset.version");
if (!dataset.environment) errors.push("Missing dataset.environment");
if (!Array.isArray(dataset.records)) errors.push("dataset.records must be an array");
if (!dataset.lineage) errors.push("Missing dataset.lineage");
type AjvCtor = new (options?: {
allErrors?: boolean;
strict?: boolean;
}) => {
compile: (schema: object) => {
(data: unknown): boolean;
errors?: ErrorObject[] | null;
};
};
// Ajv is imported dynamically so the script works whether ajv resolves as CJS or ESM.
const ajvModule = await import("ajv");
const Ajv = ajvModule.default as unknown as AjvCtor;
const ajv = new Ajv({ allErrors: true, strict: false });
const validateRow = ajv.compile(rowSchema);
for (let i = 0; i < dataset.records.length; i += 1) {
const row = dataset.records[i];
const valid = validateRow(row);
if (!valid) {
const rowErrors =
validateRow.errors?.map(
(e: ErrorObject) => `row ${i + 1}: ${e.instancePath} ${e.message ?? ""}`
) ?? [];
errors.push(...rowErrors);
}
}
const integrity = definition.integrity_rules ?? {};
if (
typeof integrity.min_rows === "number" &&
dataset.records.length < integrity.min_rows
) {
errors.push(
`Expected at least ${integrity.min_rows} rows, found ${dataset.records.length}`,
);
}
for (const field of integrity.not_null ?? []) {
dataset.records.forEach((row, idx) => {
if (row[field] === null || row[field] === undefined || row[field] === "") {
errors.push(`row ${idx + 1}: field '${field}' must not be null`);
}
});
}
for (const field of integrity.unique ?? []) {
const seen = new Map<unknown, number>();
dataset.records.forEach((row, idx) => {
const value = row[field];
if (value === null || value === undefined) return;
if (seen.has(value)) {
errors.push(
`Duplicate unique field '${field}' at row ${idx + 1}; first seen at row ${seen.get(value)}`,
);
} else {
seen.set(value, idx + 1);
}
});
}
if (definition.primary_keys.length > 0) {
const seen = new Map<string, number>();
dataset.records.forEach((row, idx) => {
const key = buildCompositeKey(row, definition.primary_keys);
if (seen.has(key)) {
errors.push(
`Duplicate primary key at row ${idx + 1}; first seen at row ${seen.get(key)}`,
);
} else {
seen.set(key, idx + 1);
}
});
}
const report: ValidationReport = {
registry_id: registryId,
environment: env,
validated_at: nowIso(),
status: errors.length === 0 ? "pass" : "fail",
summary: {
record_count: dataset.records.length,
error_count: errors.length,
warning_count: warnings.length,
},
errors,
warnings,
};
await writeJsonFile(validationReportPath(env, registryId), report);
if (report.status === "fail") {
console.error(
`[validate:registry] ${registryId}: FAIL (${errors.length} errors)`,
);
return;
}
console.log(
`[validate:registry] ${registryId}: PASS (${dataset.records.length} records)`,
);
}
async function main(): Promise<void> {
const args = parseArgs(process.argv.slice(2));
const env = (args.env as string | undefined) ?? "dev";
let registryIds: string[] = [];
if (args.all) {
registryIds = await listRegistryIds(env as EnvName);
} else if (typeof args.registry === "string") {
registryIds = [args.registry];
} else {
throw new Error("Provide --registry <id> or --all");
}
let hasFailure = false;
for (const registryId of registryIds) {
await validateOne(registryId, env as EnvName);
const report = await readJsonFile<ValidationReport>(
validationReportPath(env as EnvName, registryId),
);
if (report.status !== "pass") hasFailure = true;
}
if (hasFailure) process.exit(1);
}
main().catch((error) => {
console.error("[validate:registry] failed:", error);
process.exit(1);
});
// scripts/registry/promote-registry.ts
import fs from "node:fs/promises";
import {
EnvName,
RegistryDefinition,
ValidationReport,
assertEnv,
datasetPath,
definitionPath,
fileExists,
parseArgs,
promotionReportPath,
readJsonFile,
rowSchemaPath,
validationReportPath,
versionedDatasetPath,
writeJsonFile,
} from "./_shared.js";
import path from "node:path";

async function copyFileSafe(from: string, to: string): Promise<void> {
  // path.dirname is separator-aware, unlike a regex on "/"
  await fs.mkdir(path.dirname(to), { recursive: true });
  await fs.copyFile(from, to);
}
function assertAllowedTransition(from: EnvName, to: EnvName): void {
const allowed =
(from === "dev" && to === "staging") ||
(from === "staging" && to === "prod");
if (!allowed) {
throw new Error(`Invalid promotion path: ${from} -> ${to}`);
}
}
async function promoteOne(
registryId: string,
from: EnvName,
to: EnvName,
approve: boolean,
): Promise<void> {
assertAllowedTransition(from, to);
const definition = await readJsonFile<RegistryDefinition>(
definitionPath(from, registryId),
);
const validation = await readJsonFile<ValidationReport>(
validationReportPath(from, registryId),
);
if (definition.promotion_rules?.requires_validation_pass && validation.status !== "pass") {
throw new Error(`${registryId}: validation must pass before promotion`);
}
if (from === "dev" && !definition.promotion_rules?.allow_dev_to_staging) {
throw new Error(`${registryId}: dev->staging not allowed`);
}
if (from === "staging" && !definition.promotion_rules?.allow_staging_to_prod) {
throw new Error(`${registryId}: staging->prod not allowed`);
}
if (
to === "prod" &&
definition.promotion_rules?.requires_manual_approval_for_prod &&
!approve
) {
throw new Error(`${registryId}: prod promotion requires --approve`);
}
const dataset = await readJsonFile<{ version: string }>(datasetPath(from, registryId));
const version = dataset.version;
const sourceLatest = datasetPath(from, registryId);
const sourceVersioned = versionedDatasetPath(from, registryId, version);
const sourceDefinition = definitionPath(from, registryId);
const sourceRowSchema = rowSchemaPath(from, registryId);
const targetLatest = datasetPath(to, registryId);
const targetVersioned = versionedDatasetPath(to, registryId, version);
const targetDefinition = definitionPath(to, registryId);
const targetRowSchema = rowSchemaPath(to, registryId);
if (!(await fileExists(sourceLatest))) throw new Error(`Missing dataset: ${sourceLatest}`);
if (!(await fileExists(sourceVersioned))) throw new Error(`Missing versioned dataset: ${sourceVersioned}`);
await copyFileSafe(sourceLatest, targetLatest);
await copyFileSafe(sourceVersioned, targetVersioned);
await copyFileSafe(sourceDefinition, targetDefinition);
await copyFileSafe(sourceRowSchema, targetRowSchema);
await writeJsonFile(promotionReportPath(to, registryId), {
registry_id: registryId,
from,
to,
promoted_at: new Date().toISOString(),
version,
status: "success",
});
console.log(`[promote:registry] ${registryId}: ${from} -> ${to} (v${version})`);
}
async function main(): Promise<void> {
const args = parseArgs(process.argv.slice(2));
if (typeof args.registry !== "string") {
throw new Error("Provide --registry <id>");
}
if (typeof args.from !== "string" || typeof args.to !== "string") {
throw new Error("Provide --from <env> and --to <env>");
}
const from = assertEnv(args.from);
const to = assertEnv(args.to);
const approve = Boolean(args.approve);
await promoteOne(args.registry, from, to, approve);
}
main().catch((error) => {
console.error("[promote:registry] failed:", error);
process.exit(1);
});
// scripts/registry/build-registry-catalog.ts
import {
DatasetObject,
EnvName,
RegistryCatalog,
RegistryCatalogEntry,
RegistryDefinition,
datasetPath,
definitionPath,
fileExists,
listRegistryIds,
nowIso,
parseArgs,
readJsonFile,
registryCatalogPath,
validationReportPath,
writeJsonFile,
} from "./_shared.js";
async function buildEntry(
registryId: string,
env: EnvName,
): Promise<RegistryCatalogEntry | null> {
const defPath = definitionPath(env, registryId);
const dataPath = datasetPath(env, registryId);
if (!(await fileExists(defPath)) || !(await fileExists(dataPath))) {
return null;
}
const definition = await readJsonFile<RegistryDefinition>(defPath);
const dataset = await readJsonFile<DatasetObject>(dataPath);
let status = definition.status;
if (await fileExists(validationReportPath(env, registryId))) {
const validation = await readJsonFile<{ status: string }>(
validationReportPath(env, registryId),
);
status = validation.status === "pass" ? "validated" : "invalid";
}
return {
registry_id: definition.registry_id,
title: definition.title,
description: definition.description,
status,
definition_ref: `/schemas/registry_definitions/${registryId}.definition.json`,
row_schema_ref: definition.row_schema_ref,
dataset_ref: `/datasets/${registryId}.json`,
latest_version: dataset.version,
record_count: dataset.records.length,
promotion: {
eligible_for_staging:
Boolean(definition.promotion_rules?.allow_dev_to_staging) &&
status === "validated",
eligible_for_prod:
Boolean(definition.promotion_rules?.allow_staging_to_prod) &&
status === "validated",
},
owners: definition.owners,
tags: definition.aliases ?? [],
};
}
async function main(): Promise<void> {
const args = parseArgs(process.argv.slice(2));
const env = ((args.env as string | undefined) ?? "dev") as EnvName;
const registryIds = await listRegistryIds(env);
const entries: RegistryCatalogEntry[] = [];
for (const registryId of registryIds) {
const entry = await buildEntry(registryId, env);
if (entry) entries.push(entry);
}
const catalog: RegistryCatalog = {
catalog_version: "1.0.0",
environment: env,
generated_at: nowIso(),
registries: entries,
};
await writeJsonFile(registryCatalogPath(env), catalog);
console.log(
`[catalog:registry] ${env}: wrote ${entries.length} entries to registry_catalog.json`,
);
}
main().catch((error) => {
console.error("[catalog:registry] failed:", error);
process.exit(1);
});
B.2. package.json Scripts
Note: the scripts use ESM-style specifiers (imports ending in ".js"), so depending on the project's module configuration, ts-node may need its ESM loader (ts-node --esm) or a runner such as tsx.
{
"scripts": {
"gen:gen-registry": "ts-node scripts/registry/gen-registry.ts",
"validate:registry": "ts-node scripts/registry/validate-registry.ts",
"promote:registry": "ts-node scripts/registry/promote-registry.ts",
"catalog:registry": "ts-node scripts/registry/build-registry-catalog.ts"
}
}
B.3. Needed Packages
npm install ajv xlsx
npm install -D ts-node typescript @types/node