
ZAMUS

ZAYAZ Asset Naming & URL Strategy

1. Purpose

This document defines the canonical strategy for:

  • asset naming
  • asset URL structure
  • storage boundaries
  • environment separation
  • public vs restricted access
  • future resolver compatibility

It applies to:

  • Excel workbooks
  • downloadable documentation artifacts
  • images
  • schemas
  • generated exports
  • future registry-backed files
  • future SSSR-addressable file assets

This document is normative.


2. Design Principles

2.1. Asset identity must be stable

A file may move between:

  • GitHub
  • AWS
  • Cloudflare-backed delivery
  • future asset resolvers

Its public identity should remain stable even if storage location changes.


2.2. URLs must be human-readable

Asset URLs should be predictable and understandable.

Bad:

/files/8fa21ab93c1f2b

Good:

/excel/compute_method_registry.xlsx


2.3. Storage location and public URL are different concerns

The physical storage path in AWS is not the same thing as the public URL.

  • storage path = infrastructure concern
  • public URL = consumer contract

2.4. Naming must support future scale

The naming model must work for:

  • a few files today
  • thousands of assets later
  • multiple environments
  • future tenancy
  • future SSSR-backed resolvers

MDX and docs metadata should point to a stable URL domain, not to raw storage internals.


3. Asset Classes

ZAYAZ assets should be grouped into a small number of top-level classes.

  • excel
  • schemas
  • exports
  • images
  • docs-assets
  • registry
  • downloads
  • generated

These are URL and governance classes, not necessarily storage buckets.


3.2. Meaning of each class

Class        Purpose
excel        Source workbook libraries and downloadable spreadsheets
schemas      JSON schemas, table manifests, structural definitions
exports      Generated exports for download
images       Documentation-facing static images
docs-assets  Other page-linked assets used directly in docs
registry     Future structured registry-backed downloadable artifacts
downloads    General downloadable files not fitting another class
generated    Machine-generated artifacts intended for publication

4. Canonical Public Asset Domains

Use a dedicated asset domain:

assets.zayaz.io

This should become the primary public asset root for documentation-linked external artifacts.


4.2. Optional secondary domains

Only introduce additional domains if there is a clear operational reason.

Examples:

  • files.zayaz.io
  • downloads.zayaz.io
  • assets.zayaz.dev

Default recommendation:

  • use one primary asset domain first
  • avoid domain sprawl

5. Canonical URL Structure

5.1. Primary public format

https://assets.zayaz.io/<asset_class>/<asset_name>.<ext>

Examples:

https://assets.zayaz.io/schemas/compute-method-registry.schema.json
https://assets.zayaz.io/images/system-landscape-overview.png

5.2. Primary development format

https://assets.zayaz.dev/<asset_class>/<asset_name>.<ext>

Examples:

https://assets.zayaz.dev/excel/compute_method_registry.xlsx
https://assets.zayaz.dev/excel/altd_event.xlsx

These are the preferred contracts.


5.3. Optional grouped format

If needed for scaling, use one additional path segment:

https://assets.zayaz.io/<asset_class>/<group>/<asset_name>.<ext>

https://assets.zayaz.dev/<asset_class>/<group>/<asset_name>.<ext>

Examples:

https://assets.zayaz.dev/excel/core/compute_method_registry.xlsx
https://assets.zayaz.dev/excel/audit/altd_event.xlsx
https://assets.zayaz.io/schemas/registry/compute-method-registry.schema.json

Use grouping only when it improves clarity.
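
The flat and grouped formats above can be produced by one small helper. The following is a minimal sketch, not an existing ZAYAZ utility: the names AssetRef, ASSET_ROOTS and buildAssetUrl are hypothetical, while the asset classes and the two domains are taken from sections 3 and 5.

```typescript
// Asset classes listed in section 3.
type AssetClass =
  | "excel" | "schemas" | "exports" | "images"
  | "docs-assets" | "registry" | "downloads" | "generated";

type Environment = "prod" | "dev";

// One public root per environment (sections 5.1 and 5.2).
const ASSET_ROOTS: Record<Environment, string> = {
  prod: "https://assets.zayaz.io",
  dev: "https://assets.zayaz.dev",
};

// Hypothetical description of one asset; not an existing ZAYAZ type.
interface AssetRef {
  env: Environment;
  assetClass: AssetClass;
  name: string;   // e.g. "compute_method_registry"
  ext: string;    // e.g. "xlsx", without the leading dot
  group?: string; // optional single grouping segment, e.g. "core"
}

// Builds <root>/<asset_class>[/<group>]/<asset_name>.<ext>.
function buildAssetUrl(ref: AssetRef): string {
  const segments: string[] = [ref.assetClass];
  if (ref.group !== undefined) segments.push(ref.group);
  segments.push(`${ref.name}.${ref.ext}`);
  return `${ASSET_ROOTS[ref.env]}/${segments.join("/")}`;
}
```

Because the group is optional and limited to one segment, the helper cannot produce deeper paths than the grouped format allows.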


5.4. What not to expose publicly

Do not expose public URLs like:

https://bucket-name.s3.amazonaws.com/folder/file.xlsx
https://assets.zayaz.io/prod-eu-west-1-bucket/internal/abc123.xlsx

Avoid leaking:

  • bucket names
  • environment internals
  • region internals
  • tenant internals
  • random storage IDs
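
A docs build could lint metadata URLs for the leaks listed above. The sketch below is heuristic only: the function name findUrlLeaks, the labels, and every pattern are assumptions chosen to catch the examples in this section, not an agreed ZAYAZ rule set.

```typescript
// Each entry pairs a human-readable label with a heuristic pattern (assumed, not normative).
const LEAK_PATTERNS: Array<{ label: string; pattern: RegExp }> = [
  // Raw S3 endpoints such as bucket.s3.amazonaws.com or bucket.s3.eu-west-1.amazonaws.com
  { label: "raw S3 host", pattern: /\bs3[.-](?:[a-z0-9-]+\.)?amazonaws\.com/i },
  // The literal word "bucket" anywhere in the URL
  { label: "bucket name", pattern: /bucket/i },
  // Region-like tokens such as eu-west-1
  { label: "AWS region", pattern: /\b[a-z]{2}-[a-z]+-\d\b/ },
  // Path segments that start with an environment or "internal" marker
  { label: "environment segment", pattern: /\/(?:prod|dev|staging|internal)(?:[-\/]|$)/ },
  // A final path segment that is a long hex string, with or without an extension
  { label: "opaque storage ID", pattern: /\/[0-9a-f]{12,}(?:\.[a-z0-9]+)?$/ },
];

// Returns the labels of all heuristics the URL trips; an empty array means no leak was detected.
function findUrlLeaks(url: string): string[] {
  return LEAK_PATTERNS
    .filter((entry) => entry.pattern.test(url))
    .map((entry) => entry.label);
}
```

An empty result does not prove a URL is clean; it only means none of these particular patterns matched.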

6. Canonical Asset Naming Rules

6.1. Base naming convention

Use lowercase kebab-case or snake_case consistently.

Recommended default:

snake_case

Examples:

compute_method_registry.xlsx
altd_event.xlsx
residency_region_policy.json
docs_table_manifest.json

Reason:

  • matches many registry/table naming habits
  • works well for generated paths
  • stable across systems

6.2. Allowed characters

Asset names should use only:

  • a-z
  • 0-9
  • _

Avoid:

  • spaces
  • mixed casing
  • punctuation beyond - and _
  • locale-specific characters
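
The character rules above can be checked mechanically. This is a minimal sketch for the recommended snake_case default; isCanonicalAssetName and toCanonicalAssetName are hypothetical helper names, and the conversion rules (strip accents, collapse everything else to one underscore) are an assumption about how a human label would be canonicalized.

```typescript
// Name stem without extension: lowercase letters and digits, single underscores between words.
const SNAKE_CASE_STEM = /^[a-z0-9]+(?:_[a-z0-9]+)*$/;

function isCanonicalAssetName(stem: string): boolean {
  return SNAKE_CASE_STEM.test(stem);
}

// Converts a human label into a candidate canonical stem.
function toCanonicalAssetName(label: string): string {
  return label
    .normalize("NFKD")               // split accented letters into base letter + combining mark
    .replace(/[\u0300-\u036f]/g, "") // drop the combining marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_")     // collapse spaces and punctuation into one underscore
    .replace(/^_+|_+$/g, "");        // trim leading and trailing underscores
}
```

The check applies to the name stem only; extensions and version suffixes are covered by the rules that follow.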

6.3. Extension rules

Always keep the real extension.

Examples:

  • .xlsx
  • .json
  • .csv
  • .png
  • .svg
  • .pdf

Do not hide real format behind extensionless URLs unless a resolver later explicitly supports that.


6.4. Version suffixes

Only add version suffixes when multiple public versions must coexist.

Pattern:

<asset_name>.v<major>.<minor>.<ext>

Examples:

compute_method_registry.v1.0.xlsx
residency_region_policy.v1.2.json

Default rule:

  • if only the latest version is public, keep the clean name
  • versioning belongs in metadata first, URL only when needed
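
Tooling that lists or resolves assets needs to tell clean names from versioned ones. A minimal sketch of parsing the suffix pattern above follows; ParsedAssetFile and parseAssetFileName are hypothetical names introduced for illustration.

```typescript
interface ParsedAssetFile {
  name: string;
  ext: string;
  version?: { major: number; minor: number }; // absent for clean, unversioned names
}

// <asset_name>.v<major>.<minor>.<ext>
const VERSIONED_FILE = /^(.+)\.v(\d+)\.(\d+)\.([a-z0-9]+)$/;
// <asset_name>.<ext>
const PLAIN_FILE = /^(.+)\.([a-z0-9]+)$/;

// Returns null when the file name has no extension at all.
function parseAssetFileName(fileName: string): ParsedAssetFile | null {
  const versioned = VERSIONED_FILE.exec(fileName);
  if (versioned !== null) {
    return {
      name: versioned[1],
      ext: versioned[4],
      version: { major: Number(versioned[2]), minor: Number(versioned[3]) },
    };
  }
  const plain = PLAIN_FILE.exec(fileName);
  return plain !== null ? { name: plain[1], ext: plain[2] } : null;
}
```

The versioned pattern is tried first, so an unversioned name such as altd_event.xlsx falls through to the plain pattern and carries no version field.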

7. Storage Path Strategy in AWS

Public URLs should map to storage paths, but not expose them directly.

s3://<bucket>/<environment>/<asset_class>/<asset_name>.<ext>

Example:

s3://zayaz-assets/dev/excel/compute_method_registry.xlsx
s3://zayaz-assets/prod/schemas/residency_region_policy.json

Optional grouped example:

s3://zayaz-assets/dev/excel/core/compute_method_registry.xlsx


7.2. Environment-aware storage

Use environment in storage path, not in the public production URL.

Examples:

s3://zayaz-assets/dev/excel/compute_method_registry.xlsx
s3://zayaz-assets/prod/excel/compute_method_registry.xlsx

The public URL carries no environment segment in its path:

https://assets.zayaz.dev/excel/compute_method_registry.xlsx


7.3. Public URL mapping layer

Cloudflare or another delivery layer should map:

https://assets.zayaz.dev/excel/compute_method_registry.xlsx

to the correct backend object for the active environment.
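
The mapping can be sketched as a pure function from public URL to storage path. This is an illustration only: the function name resolveStoragePath and the host-to-environment table are assumptions, the bucket name comes from the examples in section 7, and a real deployment would implement the equivalent logic in the delivery layer itself.

```typescript
// Public host -> storage environment (assumed mapping based on sections 5 and 8).
const HOST_ENVIRONMENTS = new Map<string, string>([
  ["assets.zayaz.io", "prod"],
  ["assets.zayaz.dev", "dev"],
]);

// Bucket name as used in the section 7 examples.
const BUCKET = "zayaz-assets";

// Maps https://<host>/<path> to s3://<bucket>/<environment>/<path>.
// Returns null for non-https URLs and for hosts that are not asset roots.
function resolveStoragePath(publicUrl: string): string | null {
  const match = /^https:\/\/([^\/]+)\/([^?#]+)/.exec(publicUrl);
  if (match === null) return null;
  const environment = HOST_ENVIRONMENTS.get(match[1]);
  if (environment === undefined) return null;
  return `s3://${BUCKET}/${environment}/${match[2]}`;
}
```

Because the environment is derived from the host, the bucket layout can change without touching any public URL.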


8. Environment URL Strategy

8.1. Production

https://assets.zayaz.io/<asset_class>/<asset_name>.<ext>


8.2. Development / testing

Use a non-production root such as:

https://assets.zayaz.dev/<asset_class>/<asset_name>.<ext>

Preferred rule:

  • production and non-production asset domains must be clearly separated

8.3. Rule

Do not mix production and non-production artifacts under the same public asset root.


9. Excel Asset Strategy

9.1. Public documentation pattern

Docs should reference Excel files like this:

excel-referencing.json
{
  "id": "compute_method_registry",
  "description": "ZAYAZ compute methods registry (schemas, implementations, dependencies).",
  "url": "https://assets.zayaz.dev/excel/compute_method_registry.xlsx"
}

This replaces GitHub tree links and static /excel/... assumptions.


9.2. Canonical Excel naming rule

Each Excel workbook should use a stable canonical name based on the table or library identity.

Examples:

altd_event.xlsx
compute_method_registry.xlsx
residency_region_policy.xlsx
material_topic_registry.xlsx

9.3. Excel URL rule

https://assets.zayaz.dev/excel/<canonical_name>.xlsx

No query strings required for normal public access.


10. Metadata Strategy

Documentation-facing metadata should reference canonical URLs, not raw storage paths.

10.1. Preferred metadata shape

metadata-shape.json
{
  "id": "compute_method_registry",
  "description": "ZAYAZ compute methods registry (schemas, implementations, dependencies).",
  "asset_class": "excel",
  "file_name": "compute_method_registry.xlsx",
  "url": "https://assets.zayaz.dev/excel/compute_method_registry.xlsx"
}

10.2. Future-compatible metadata shape

If future resolver support is introduced:

future-resolver-support.json
{
  "id": "compute_method_registry",
  "description": "ZAYAZ compute methods registry (schemas, implementations, dependencies).",
  "asset_class": "excel",
  "file_name": "compute_method_registry.xlsx",
  "url": "https://assets.zayaz.dev/excel/compute_method_registry.xlsx",
  "sssr_ref": "sssr:asset.docs-assets.compute_method_registry.xlsx"
}

This is compatible with future registry alignment.
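
Both metadata shapes can be described by one type with an optional resolver field. The sketch below also checks that the fields agree with each other; AssetMetadata and checkAssetMetadata are hypothetical names, and the rule that file_name starts with the id is an assumption drawn from the examples rather than a stated requirement.

```typescript
interface AssetMetadata {
  id: string;
  description: string;
  asset_class: string;
  file_name: string;
  url: string;
  sssr_ref?: string; // only present once resolver support exists
}

// Returns a list of human-readable problems; an empty list means the record is self-consistent.
function checkAssetMetadata(meta: AssetMetadata): string[] {
  const problems: string[] = [];
  if (meta.url.indexOf("https://") !== 0) {
    problems.push("url must be an https URL");
  }
  // The class segment must appear in the path; a group segment may follow it.
  if (meta.url.indexOf(`/${meta.asset_class}/`) === -1) {
    problems.push(`url should contain /${meta.asset_class}/`);
  }
  const tail = `/${meta.file_name}`;
  if (meta.url.slice(-tail.length) !== tail) {
    problems.push(`url should end with ${tail}`);
  }
  // Assumption: the file name is the id plus an extension (and optional version suffix).
  if (meta.file_name.indexOf(`${meta.id}.`) !== 0) {
    problems.push("file_name should start with the id");
  }
  return problems;
}
```

Running such a check in the docs build would catch metadata whose url drifted away from its asset_class or file_name.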


11. GitHub vs AWS Asset Rules

11.1. Keep in GitHub if:

  • asset is small
  • page-coupled
  • directly authored with docs
  • useful for local editing
  • not a bulk source library

Examples:

  • small example JSON
  • page-specific diagrams
  • snippet-associated examples

11.2. Move to AWS if:

  • asset is large
  • part of a workbook library
  • expected to scale
  • more archival than editorial
  • not needed inside Docusaurus runtime

Examples:

  • Excel libraries
  • large downloadable registries
  • large generated exports

12. Search / Ingestion Compatibility

The current Excel ingestion logic is compatible with AWS-hosted Excel libraries as long as the search/index layer can access:

  • the AWS storage location directly
  • or a synced local cache
  • or a controlled export mirror

This means moving Excel files to AWS does not invalidate the indexing architecture.


13. Anti-Patterns

Avoid:

  • raw S3 public URLs in docs metadata
  • embedding environment internals in public file paths
  • storing large workbook libraries in docusaurus/static/excel
  • naming files with spaces or unstable human labels
  • leaking bucket structure into user-facing links
  • mixing production and dev assets under the same URL root

Phase 1

  • define canonical asset domain
  • define canonical naming rules
  • keep current docs metadata structure
  • begin replacing GitHub/static Excel links with canonical asset URLs

Phase 2

  • move Excel files to AWS storage
  • wire Cloudflare asset delivery
  • preserve stable public URLs

Phase 3

  • introduce richer metadata
  • optionally add resolver/registry integration
  • support versioned assets where needed

15. Excel → Registry Ingestion Pipeline (ZAYAZ Standard)

15.1. Purpose

This section defines the canonical pipeline for transforming Excel-based source data into:

  • structured registries
  • machine-readable datasets
  • runtime-ready ZAYAZ data assets

This pipeline is foundational for:

  • CSRD / ESRS compliance
  • auditability
  • traceability
  • validator engines (ZARA)
  • Computation Hub inputs
  • sustainability intelligence systems

15.2. Architectural Role

The Excel ingestion pipeline sits between:

Authoring Layer (Excel)
↓
Ingestion Pipeline
↓
Structured Registry Layer
↓
ZAYAZ Runtime (APIs, Computation Hub, Reporting)

15.3. Key Principle

Excel is an authoring format, not a runtime format.

All production systems must consume:

  • JSON
  • registry objects
  • database records

—not raw Excel files.


15.4. Pipeline Stages

Stage 1 — Source (Excel)

Location:

https://assets.zayaz.dev/excel/<table>.xlsx

Characteristics:

  • human-authored
  • versioned
  • potentially incomplete
  • not validated

Stage 2 — Ingestion

Component:

code/infrastructure/zayaz-search-indexer/src/ingest-excel.ts

Responsibilities:

  • discover Excel files
  • read sheets
  • extract:
    • table name
    • columns
    • sample rows
  • generate raw document objects
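
The extraction step can be illustrated on an already-parsed sheet. The sketch below is not the code in ingest-excel.ts: it assumes the workbook has been read into a grid of cells by some spreadsheet library, and the names Cell, RawTableDocument and toRawDocument are hypothetical.

```typescript
type Cell = string | number | boolean | null;

// Hypothetical shape of a raw document; the real indexer's shape may differ.
interface RawTableDocument {
  table: string;                           // derived from the workbook file name
  sheet: string;
  columns: string[];                       // header row, trimmed
  sampleRows: Array<Record<string, Cell>>; // first few data rows keyed by column
}

// grid[0] is treated as the header row; the remaining rows are data.
function toRawDocument(
  fileName: string,
  sheet: string,
  grid: Cell[][],
  sampleSize: number = 3,
): RawTableDocument {
  const table = fileName.replace(/\.xlsx$/i, "");
  const header: Cell[] = grid.length > 0 ? grid[0] : [];
  const columns = header.map((cell) => String(cell ?? "").trim());
  const sampleRows = grid.slice(1, 1 + sampleSize).map((row) => {
    const record: Record<string, Cell> = {};
    columns.forEach((column, i) => {
      record[column] = row[i] ?? null; // short rows are padded with null
    });
    return record;
  });
  return { table, sheet, columns, sampleRows };
}
```

The output is deliberately unvalidated, matching the Stage 1 characteristics: checks belong to the validation layer, not to extraction.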

Stage 3 — Validation Layer (NEW — REQUIRED)

This layer must be introduced.

Responsibilities:

  • schema validation (per table)
  • required column enforcement
  • data type validation
  • controlled vocabularies (e.g. ESRS dimensions)
  • referential integrity (future)

Example:

compute_method_registry.xlsx
→ validate against compute-method-registry.schema.json

Failure behavior:

  • block pipeline OR
  • flag violations (ZARA integration)
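
The responsibilities and the two failure behaviors can be sketched without any dependency. This is a hand-rolled illustration, not a JSON Schema validator and not the ZARA interface: ColumnRule, TableSchema, Violation, validateRows and applyFailurePolicy are all hypothetical names, and a real implementation would more likely validate against the table's schema file.

```typescript
type FieldType = "string" | "number" | "boolean";

interface ColumnRule {
  type: FieldType;
  required?: boolean;
  allowed?: string[]; // controlled vocabulary, if any
}

type TableSchema = Record<string, ColumnRule>;

interface Violation {
  row: number; // 1-based data row number
  column: string;
  message: string;
}

function validateRows(rows: Array<Record<string, unknown>>, schema: TableSchema): Violation[] {
  const violations: Violation[] = [];
  rows.forEach((row, index) => {
    for (const column of Object.keys(schema)) {
      const rule = schema[column];
      const value = row[column];
      const missing = value === undefined || value === null || value === "";
      if (missing) {
        if (rule.required) {
          violations.push({ row: index + 1, column, message: "required value is missing" });
        }
        continue;
      }
      if (typeof value !== rule.type) {
        violations.push({ row: index + 1, column, message: `expected ${rule.type}, got ${typeof value}` });
      } else if (rule.allowed !== undefined && rule.allowed.indexOf(String(value)) === -1) {
        violations.push({ row: index + 1, column, message: "value is outside the controlled vocabulary" });
      }
    }
  });
  return violations;
}

// "block" stops the pipeline on any violation; "flag" passes the violations on for reporting.
function applyFailurePolicy(violations: Violation[], policy: "block" | "flag"): Violation[] {
  if (policy === "block" && violations.length > 0) {
    throw new Error(`validation failed with ${violations.length} violation(s)`);
  }
  return violations;
}
```

Keeping the policy separate from the checks lets the same validator run in blocking mode for production builds and in flagging mode while a table is still being authored.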

Stage 4 — Normalization

Transform Excel rows into canonical structures:

transform-excel-rows.json
{
  "id": "compute_method_001",
  "name": "Scope 1 combustion",
  "unit": "kgCO2e",
  "methodology": "IPCC",
  "dependencies": ["fuel_type", "quantity"]
}

Normalization includes:

  • trimming
  • typing
  • mapping column names → canonical fields
  • resolving enums
  • generating IDs if missing
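
The normalization steps can be sketched for the compute method example. Everything specific here is an assumption made for illustration: the spreadsheet header names in COLUMN_MAP, the unit alias in UNIT_ENUM, the comma-separated dependency cell, and the row-number-based ID scheme.

```typescript
interface ComputeMethod {
  id: string;
  name: string;
  unit: string;
  methodology: string;
  dependencies: string[];
}

type TextField = "id" | "name" | "unit" | "methodology" | "dependencies";

// Hypothetical spreadsheet headers -> canonical field names.
const COLUMN_MAP = new Map<string, TextField>([
  ["Method ID", "id"],
  ["Method Name", "name"],
  ["Unit", "unit"],
  ["Methodology", "methodology"],
  ["Dependencies", "dependencies"],
]);

// Hypothetical enum resolution: lowercase spelling -> canonical spelling.
const UNIT_ENUM = new Map<string, string>([["kgco2e", "kgCO2e"]]);

function normalizeRow(raw: Record<string, unknown>, rowNumber: number): ComputeMethod {
  // Map headers to canonical fields and trim every cell to text.
  const text: Partial<Record<TextField, string>> = {};
  for (const header of Object.keys(raw)) {
    const field = COLUMN_MAP.get(header.trim());
    if (field !== undefined) text[field] = String(raw[header] ?? "").trim();
  }
  const unit = text.unit ?? "";
  const digits = String(rowNumber);
  const serial = digits.length < 3 ? ("00" + digits).slice(-3) : digits;
  return {
    id: text.id || `compute_method_${serial}`, // generate an ID when the cell is empty
    name: text.name ?? "",
    unit: UNIT_ENUM.get(unit.toLowerCase()) ?? unit,
    methodology: text.methodology ?? "",
    // Typing: a comma-separated cell becomes a string array.
    dependencies: (text.dependencies ?? "")
      .split(",")
      .map((item) => item.trim())
      .filter((item) => item.length > 0),
  };
}
```

Columns that are not in the mapping are dropped silently in this sketch; a production transform would probably report them.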

Stage 5 — Registry Generation

Output structured registry artifacts:

registry/<table>.json

Example:

registry/compute_method_registry.json
registry/altd_event.json

Location (prod):

https://assets.zayaz.io/registry/<table>.json


Stage 6 — Indexing (Optional / Current)

The existing search pipeline:

zayaz-search-indexer

can consume:

  • Excel directly (current)
  • OR preferably normalized registry JSON (future)

👉 Recommended evolution:

Index registry JSON, not Excel


Stage 7 — Runtime Consumption

Used by:

  • ZARA validator engine
  • Computation Hub
  • Reporting Hub
  • ESG analytics
  • APIs

15.5. Data Model Contracts

Each Excel table must map to:

Layer     Contract
Excel     Human-readable table
Schema    JSON schema definition
Registry  Normalized JSON
Runtime   API / DB model

15.6. Schema Governance

Each table should have a schema:

schemas/<table>.schema.json

Example:

schemas/compute_method_registry.schema.json

Schema defines:

  • required fields
  • types
  • enums
  • relationships

15.7. Versioning Strategy

Excel (authoring)

  • version optional
  • managed by editor workflow

Registry (production)

Two strategies:

A — Latest only

registry/compute_method_registry.json

B — Versioned

registry/compute_method_registry.v1.0.json

Recommended:

  • start with latest
  • introduce versioning when needed for audit

15.8. Audit & Traceability (CRITICAL for ESG)

Each registry entry should support:

registry-entry.json
{
  "source": {
    "file": "compute_method_registry.xlsx",
    "sheet": "Sheet1",
    "row": 42
  },
  "ingested_at": "2026-03-21T12:00:00Z",
  "version": "v1"
}

This enables:

  • audit trails (CSRD requirement)
  • verification workflows
  • data lineage tracking
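
Lineage fields can be attached at the moment an entry is generated. The following is a minimal sketch assuming the field names from the example above; SourceRef, Provenance and withProvenance are hypothetical names, and the clock is passed in so that runs are reproducible.

```typescript
interface SourceRef {
  file: string;
  sheet: string;
  row: number;
}

interface Provenance {
  source: SourceRef;
  ingested_at: string; // UTC timestamp, second precision
  version: string;
}

// Returns a copy of the entry with provenance fields added; the input is not mutated.
function withProvenance<T extends object>(
  entry: T,
  source: SourceRef,
  version: string,
  now: Date = new Date(),
): T & Provenance {
  // toISOString() includes milliseconds; strip them to match the example format.
  const ingestedAt = now.toISOString().replace(/\.\d{3}Z$/, "Z");
  return { ...entry, source, ingested_at: ingestedAt, version };
}
```

Recording file, sheet and row per entry is what allows a reviewer to walk from a registry value back to the exact spreadsheet cell range it came from.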

15.9. Integration with ZARA

ZARA should validate:

  • schema compliance
  • missing required fields
  • inconsistent units
  • invalid references

Pipeline integration:

Excel → Validation → ZARA violations → Registry

Violations stored in:

docusaurus/static/system/violations.json


15.10. Integration with Computation Hub

Registries become:

  • input datasets
  • dependency graphs
  • compute method definitions

Example:

compute_method_registry → Computation Hub execution models


15.11. Deployment Model

DEV

assets.zayaz.dev/excel/*.xlsx

PROD

assets.zayaz.io/registry/*.json


15.12. Anti-Patterns

Avoid:

  • using Excel directly in runtime
  • skipping validation
  • storing large Excel files in GitHub long-term
  • mixing dev and prod asset domains
  • bypassing registry layer

15.13. Future Enhancements

Planned evolution:

  • SSSR-backed registry identifiers
  • Graph-based table relationships
  • real-time ingestion pipelines
  • validator networks
  • signed registry snapshots (audit-grade)
  • carbon accounting traceability layers

16. Final Rule

Public asset URLs must be stable.

Storage paths may change.

Documentation must link to the stable contract, not the storage implementation.

Excel is input

Registry is truth

APIs are delivery

This pipeline is the backbone of ZAYAZ’s decision-grade ESG infrastructure.



