ZAMUS
ZAYAZ Asset Naming & URL Strategy
1. Purpose
This document defines the canonical strategy for:
- asset naming
- asset URL structure
- storage boundaries
- environment separation
- public vs restricted access
- future resolver compatibility
It applies to:
- Excel workbooks
- downloadable documentation artifacts
- images
- schemas
- generated exports
- future registry-backed files
- future SSSR-addressable file assets
This document is normative.
2. Design Principles
2.1. Asset identity must be stable
A file may move between:
- GitHub
- AWS
- Cloudflare-backed delivery
- future asset resolvers
Its public identity should remain stable even if storage location changes.
2.2. URLs must be human-readable
Asset URLs should be predictable and understandable.
Bad:
/files/8fa21ab93c1f2b
Good:
/excel/compute_method_registry.xlsx
2.3. Storage location and public URL are different concerns
The physical storage path in AWS is not the same thing as the public URL.
- storage path = infrastructure concern
- public URL = consumer contract
2.4. Naming must support future scale
The naming model must work for:
- a few files today
- thousands of assets later
- multiple environments
- future tenancy
- future SSSR-backed resolvers
2.5. Documentation should link to stable asset URLs
MDX and docs metadata should point to a stable URL domain, not to raw storage internals.
3. Asset Classes
ZAYAZ assets should be grouped into a small number of top-level classes.
3.1. Recommended top-level classes
- excel
- schemas
- exports
- images
- docs-assets
- registry
- downloads
- generated
These are URL and governance classes, not necessarily storage buckets.
3.2. Meaning of each class
| Class | Purpose |
|---|---|
| excel | Source workbook libraries and downloadable spreadsheets |
| schemas | JSON schemas, table manifests, structural definitions |
| exports | Generated exports for download |
| images | Documentation-facing static images |
| docs-assets | Other page-linked assets used directly in docs |
| registry | Future structured registry-backed downloadable artifacts |
| downloads | General downloadable files not fitting another class |
| generated | Machine-generated artifacts intended for publication |
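The class list above could be encoded as a closed set so that tooling rejects unknown classes early. The following is a minimal sketch; the names `ASSET_CLASSES`, `AssetClass`, and `isAssetClass` are illustrative assumptions, not existing ZAYAZ code.

```typescript
// The eight top-level classes from section 3.1, kept as a closed set.
const ASSET_CLASSES = [
  "excel",
  "schemas",
  "exports",
  "images",
  "docs-assets",
  "registry",
  "downloads",
  "generated",
] as const;

// Union type derived from the list, so the two cannot drift apart.
type AssetClass = (typeof ASSET_CLASSES)[number];

// Hypothetical type guard: narrows an arbitrary string to AssetClass.
// The comparison is case-sensitive, so "Excel" is rejected.
function isAssetClass(value: string): value is AssetClass {
  return (ASSET_CLASSES as readonly string[]).indexOf(value) !== -1;
}
```

Deriving the type from the array keeps a single source of truth if a class is added later.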
4. Canonical Public Asset Domains
4.1. Recommended primary public asset domain
Use a dedicated asset domain:
assets.zayaz.io
This should become the primary public asset root for documentation-linked external artifacts.
4.2. Optional secondary domains
Only introduce additional domains if there is a clear operational reason.
Examples:
- files.zayaz.io
- downloads.zayaz.io
- assets.zayaz.dev
Default recommendation:
- use one primary asset domain first
- avoid domain sprawl
5. Canonical URL Structure
5.1. Primary public format
https://assets.zayaz.io/<asset_class>/<asset_name>.<ext>
Examples:
https://assets.zayaz.io/schemas/compute_method_registry.schema.json
https://assets.zayaz.io/images/system-landscape-overview.png
5.2. Primary development format
https://assets.zayaz.dev/<asset_class>/<asset_name>.<ext>
Examples:
https://assets.zayaz.dev/excel/compute_method_registry.xlsx
https://assets.zayaz.dev/excel/altd_event.xlsx
These are the preferred contracts.
5.3. Optional grouped format
If needed for scaling, use one additional path segment:
https://assets.zayaz.io/<asset_class>/<group>/<asset_name>.<ext>
https://assets.zayaz.dev/<asset_class>/<group>/<asset_name>.<ext>
Examples:
https://assets.zayaz.dev/excel/core/compute_method_registry.xlsx
https://assets.zayaz.dev/excel/audit/altd_event.xlsx
https://assets.zayaz.io/schemas/registry/compute_method_registry.schema.json
Use grouping only when it improves clarity.
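The primary and grouped formats above can be sketched as a single URL builder. This is an illustration under stated assumptions: `buildAssetUrl`, `AssetRef`, and `UrlEnvironment` are hypothetical names, and the two roots are the ones given in sections 5.1 and 5.2.

```typescript
type UrlEnvironment = "prod" | "dev";

// Asset roots taken from sections 5.1 (production) and 5.2 (development).
const ASSET_ROOTS: Record<UrlEnvironment, string> = {
  prod: "https://assets.zayaz.io",
  dev: "https://assets.zayaz.dev",
};

interface AssetRef {
  assetClass: string; // e.g. "excel", "schemas"
  name: string; // e.g. "compute_method_registry"
  ext: string; // real extension without the leading dot, e.g. "xlsx"
  group?: string; // optional single grouping segment (section 5.3)
}

// Hypothetical helper: builds <root>/<asset_class>[/<group>]/<asset_name>.<ext>.
function buildAssetUrl(env: UrlEnvironment, ref: AssetRef): string {
  const segments = [ref.assetClass];
  if (ref.group !== undefined) {
    segments.push(ref.group);
  }
  segments.push(ref.name + "." + ref.ext);
  return ASSET_ROOTS[env] + "/" + segments.join("/");
}
```

Because the group is an optional field rather than part of the name, an asset can gain or lose a group without its canonical name changing.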
5.4. What not to expose publicly
Do not expose public URLs like:
https://bucket-name.s3.amazonaws.com/folder/file.xlsx
https://assets.zayaz.io/prod-eu-west-1-bucket/internal/abc123.xlsx
Avoid leaking:
- bucket names
- environment internals
- region internals
- tenant internals
- random storage IDs
6. Canonical Asset Naming Rules
6.1. Base naming convention
Use lowercase kebab-case or snake_case consistently.
Recommended default:
snake_case
Examples:
compute_method_registry.xlsx
altd_event.xlsx
residency_region_policy.json
docs_table_manifest.json
Reason:
- matches many registry/table naming habits
- works well for generated paths
- stable across systems
6.2. Allowed characters
Asset names should use only:
- a-z
- 0-9
- _ (underscore)
- - (hyphen)
Avoid:
- spaces
- mixed casing
- punctuation beyond - and _
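The character rules above translate directly into a small check. The sketch below is an assumption about how such a check might look, not an existing validator; `checkAssetName` is a hypothetical name, and it is meant to be applied to the base name without its extension.

```typescript
// Only a-z, 0-9, "_" and "-" are allowed in asset names (section 6.2).
const ASSET_NAME_PATTERN = /^[a-z0-9_-]+$/;

// Hypothetical helper: returns a list of problems found in a base name.
// An empty list means the name follows the rules.
function checkAssetName(name: string): string[] {
  const problems: string[] = [];
  if (name.length === 0) {
    problems.push("name is empty");
    return problems;
  }
  if (/\s/.test(name)) {
    problems.push("contains whitespace");
  }
  if (name !== name.toLowerCase()) {
    problems.push("contains uppercase characters");
  }
  if (!ASSET_NAME_PATTERN.test(name)) {
    problems.push("contains characters outside a-z, 0-9, _ and -");
  }
  return problems;
}
```

Returning every problem instead of a single boolean makes the result more useful in review tooling, where authors want to know what to change.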
- locale-specific characters
6.3. Extension rules
Always keep the real extension.
Examples:
.xlsx
.json
.csv
.png
.svg
.pdf
Do not hide the real format behind extensionless URLs unless a resolver explicitly supports that later.
6.4. Version suffixes
Only add version suffixes when multiple public versions must coexist.
Pattern:
<asset_name>.v<major>.<minor>.<ext>
Examples:
compute_method_registry.v1.0.xlsx
residency_region_policy.v1.2.json
Default rule:
- if only the latest version is public, keep the clean name
- versioning belongs in metadata first, URL only when needed
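The version suffix pattern can be parsed and re-emitted mechanically. The sketch below assumes simple single extensions; compound extensions such as `.schema.json` are out of scope here, and `parseFileName` and `formatFileName` are hypothetical helper names.

```typescript
interface ParsedFileName {
  name: string;
  ext: string;
  version?: { major: number; minor: number };
}

// Matches <asset_name>.v<major>.<minor>.<ext>
const VERSIONED_FILE = /^([a-z0-9_-]+)\.v(\d+)\.(\d+)\.([a-z0-9]+)$/;
// Matches the clean form <asset_name>.<ext>
const PLAIN_FILE = /^([a-z0-9_-]+)\.([a-z0-9]+)$/;

// Returns null when the file name fits neither form.
function parseFileName(fileName: string): ParsedFileName | null {
  const versioned = VERSIONED_FILE.exec(fileName);
  if (versioned !== null) {
    return {
      name: versioned[1],
      ext: versioned[4],
      version: {
        major: parseInt(versioned[2], 10),
        minor: parseInt(versioned[3], 10),
      },
    };
  }
  const plain = PLAIN_FILE.exec(fileName);
  if (plain !== null) {
    return { name: plain[1], ext: plain[2] };
  }
  return null;
}

// Inverse of parseFileName: omits the suffix when no version is set.
function formatFileName(parsed: ParsedFileName): string {
  const suffix =
    parsed.version === undefined
      ? ""
      : ".v" + parsed.version.major + "." + parsed.version.minor;
  return parsed.name + suffix + "." + parsed.ext;
}
```

Keeping the version as an optional field mirrors the default rule: the clean name is the normal case and the suffix appears only when a version is explicitly carried.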
7. Storage Path Strategy in AWS
Public URLs should map to storage paths, but not expose them directly.
7.1. Recommended storage pattern
s3://<bucket>/<environment>/<asset_class>/<asset_name>.<ext>
Example:
s3://zayaz-assets/dev/excel/compute_method_registry.xlsx
s3://zayaz-assets/prod/schemas/residency_region_policy.json
Optional grouped example:
s3://zayaz-assets/dev/excel/core/compute_method_registry.xlsx
7.2. Environment-aware storage
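Put the environment in the storage path, not in the public URL path.
Examples:
s3://zayaz-assets/dev/excel/compute_method_registry.xlsx
s3://zayaz-assets/prod/excel/compute_method_registry.xlsx
The public URL path stays the same in both environments; only the asset domain differs (see section 8):
https://assets.zayaz.dev/excel/compute_method_registry.xlsx
https://assets.zayaz.io/excel/compute_method_registry.xlsx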
7.3. Public URL mapping layer
Cloudflare or another delivery layer should map:
https://assets.zayaz.dev/excel/compute_method_registry.xlsx
to the correct backend object for the active environment.
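The mapping rule itself is small enough to sketch. In practice it would live in the delivery layer's own configuration; the function below only illustrates the rule, assuming the `zayaz-assets` bucket from the examples in section 7.1 and a hypothetical helper name `toStorageUri`.

```typescript
type StorageEnvironment = "dev" | "prod";

// Host-to-environment table based on the domains in sections 5.1 and 5.2.
const HOST_ENVIRONMENTS: Record<string, StorageEnvironment> = {
  "assets.zayaz.io": "prod",
  "assets.zayaz.dev": "dev",
};

// Bucket name taken from the examples in section 7.1.
const ASSET_BUCKET = "zayaz-assets";

// Hypothetical helper: maps a public URL to
// s3://<bucket>/<environment>/<path from the public URL>.
// Returns null for hosts that are not known asset domains.
function toStorageUri(publicUrl: string): string | null {
  const match = /^https:\/\/([^\/]+)\/(.+)$/.exec(publicUrl);
  if (match === null) {
    return null;
  }
  const host = match[1];
  const path = match[2];
  if (!Object.prototype.hasOwnProperty.call(HOST_ENVIRONMENTS, host)) {
    return null;
  }
  return "s3://" + ASSET_BUCKET + "/" + HOST_ENVIRONMENTS[host] + "/" + path;
}
```

The environment is derived from the host only, so nothing in the public path needs to mention `dev` or `prod`.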
8. Environment URL Strategy
8.1. Production
https://assets.zayaz.io/<asset_class>/<asset_name>.<ext>
8.2. Development / testing
Use a non-production root such as:
https://assets.zayaz.dev/<asset_class>/<asset_name>.<ext>
Preferred rule:
- production and non-production asset domains must be clearly separated
8.3. Rule
Do not mix production and non-production artifacts under the same public asset root.
9. Excel Asset Strategy
9.1. Public documentation pattern
Docs should reference Excel files like this:
{
"id": "compute_method_registry",
"description": "ZAYAZ compute methods registry (schemas, implementations, dependencies).",
"url": "https://assets.zayaz.dev/excel/compute_method_registry.xlsx"
}
This replaces GitHub tree links and static /excel/... assumptions.
9.2. Canonical Excel naming rule
Each Excel workbook should use a stable canonical name based on the table or library identity.
Examples:
altd_event.xlsx
compute_method_registry.xlsx
residency_region_policy.xlsx
material_topic_registry.xlsx
9.3. Excel URL rule
https://assets.zayaz.dev/excel/<canonical_name>.xlsx
No query strings required for normal public access.
10. Metadata Strategy
Documentation-facing metadata should reference canonical URLs, not raw storage paths.
10.1. Preferred metadata shape
{
"id": "compute_method_registry",
"description": "ZAYAZ compute methods registry (schemas, implementations, dependencies).",
"asset_class": "excel",
"file_name": "compute_method_registry.xlsx",
"url": "https://assets.zayaz.dev/excel/compute_method_registry.xlsx"
}
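The metadata shape can be expressed as a type with a small consistency check. This is a sketch under assumptions: `AssetMetadata` mirrors the fields shown here, while `checkMetadata` and its three rules are illustrative and not part of any existing tooling.

```typescript
interface AssetMetadata {
  id: string;
  description: string;
  asset_class: string;
  file_name: string;
  url: string;
  sssr_ref?: string; // only relevant once resolver support exists (section 10.2)
}

// Hypothetical consistency check between the fields of one metadata entry.
// Returns a list of problems; an empty list means the entry is coherent.
function checkMetadata(meta: AssetMetadata): string[] {
  const problems: string[] = [];
  // The URL should contain the class as a path segment.
  if (meta.url.indexOf("/" + meta.asset_class + "/") === -1) {
    problems.push("url does not contain the asset_class segment");
  }
  // The URL should end with the file name.
  const expectedTail = "/" + meta.file_name;
  if (meta.url.slice(-expectedTail.length) !== expectedTail) {
    problems.push("url does not end with file_name");
  }
  // The file name should be the id plus an extension.
  if (meta.file_name.indexOf(meta.id + ".") !== 0) {
    problems.push("file_name does not start with id");
  }
  return problems;
}
```

A check like this could run in the docs build so that a renamed workbook cannot leave a stale `url` behind.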
10.2. Future-compatible metadata shape
If future resolver support is introduced:
{
"id": "compute_method_registry",
"description": "ZAYAZ compute methods registry (schemas, implementations, dependencies).",
"asset_class": "excel",
"file_name": "compute_method_registry.xlsx",
"url": "https://assets.zayaz.dev/excel/compute_method_registry.xlsx",
"sssr_ref": "sssr:asset.docs-assets.compute_method_registry.xlsx"
}
This is compatible with future registry alignment.
11. GitHub vs AWS Asset Rules
11.1. Keep in GitHub if:
- asset is small
- page-coupled
- directly authored with docs
- useful for local editing
- not a bulk source library
Examples:
- small example JSON
- page-specific diagrams
- snippet-associated examples
11.2. Move to AWS if:
- asset is large
- part of a workbook library
- expected to scale
- more archival than editorial
- not needed inside Docusaurus runtime
Examples:
- Excel libraries
- large downloadable registries
- large generated exports
12. Search / Ingestion Compatibility
The current Excel ingestion logic is compatible with AWS-hosted Excel libraries as long as the search/index layer can access:
- the AWS storage location directly
- or a synced local cache
- or a controlled export mirror
This means moving Excel files to AWS does not invalidate the indexing architecture.
13. Anti-Patterns
Avoid:
- raw S3 public URLs in docs metadata
- embedding environment internals in public file paths
- storing large workbook libraries in docusaurus/static/excel
- naming files with spaces or unstable human labels
- leaking bucket structure into user-facing links
- mixing production and dev assets under the same URL root
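Several of these anti-patterns are detectable from the URL alone. The sketch below shows one possible lint pass over docs metadata URLs; the rule list and the name `lintAssetUrl` are assumptions for illustration and cover only the URL-level items.

```typescript
interface UrlFinding {
  url: string;
  reason: string;
}

// Hypothetical rule set; each pattern flags one kind of storage leak.
const LEAK_RULES: { pattern: RegExp; reason: string }[] = [
  // Hosts such as bucket-name.s3.amazonaws.com or bucket.s3.eu-west-1.amazonaws.com
  { pattern: /\.s3[.-][a-z0-9.-]*amazonaws\.com/i, reason: "raw S3 host" },
  // Storage URIs such as s3://zayaz-assets/...
  { pattern: /^s3:\/\//i, reason: "storage URI instead of public URL" },
  // Path segments such as /dev/ or /prod-eu-west-1-bucket/
  { pattern: /\/(dev|prod)(-[a-z0-9-]+)?\//i, reason: "environment segment in path" },
];

// Returns one finding per matching rule; an empty list means no leak was detected.
function lintAssetUrl(url: string): UrlFinding[] {
  const findings: UrlFinding[] = [];
  for (const rule of LEAK_RULES) {
    if (rule.pattern.test(url)) {
      findings.push({ url: url, reason: rule.reason });
    }
  }
  return findings;
}
```

The patterns are deliberately conservative: the `.dev` asset domain does not trigger the environment rule because the rule only matches whole path segments.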
14. Recommended Initial Rollout
Phase 1
- define canonical asset domain
- define canonical naming rules
- keep current docs metadata structure
- begin replacing GitHub/static Excel links with canonical asset URLs
Phase 2
- move Excel files to AWS storage
- wire Cloudflare asset delivery
- preserve stable public URLs
Phase 3
- introduce richer metadata
- optionally add resolver/registry integration
- support versioned assets where needed
15. Excel → Registry Ingestion Pipeline (ZAYAZ Standard)
15.1. Purpose
This section defines the canonical pipeline for transforming Excel-based source data into:
- structured registries
- machine-readable datasets
- runtime-ready ZAYAZ data assets
This pipeline is foundational for:
- CSRD / ESRS compliance
- auditability
- traceability
- validator engines (ZARA)
- Computation Hub inputs
- sustainability intelligence systems
15.2. Architectural Role
The Excel ingestion pipeline sits between:
Authoring Layer (Excel)
↓
Ingestion Pipeline
↓
Structured Registry Layer
↓
ZAYAZ Runtime (APIs, Computation Hub, Reporting)
15.3. Key Principle
Excel is an authoring format, not a runtime format.
All production systems must consume:
- JSON
- registry objects
- database records
Production systems must not consume raw Excel files directly.
15.4. Pipeline Stages
Stage 1 — Source (Excel)
Location:
https://assets.zayaz.dev/excel/<table>.xlsx
Characteristics:
- human-authored
- versioned
- potentially incomplete
- not validated
Stage 2 — Ingestion
Component:
code/infrastructure/zayaz-search-indexer/src/ingest-excel.ts
Responsibilities:
- discover Excel files
- read sheets
- extract:
- table name
- columns
- sample rows
- generate raw document objects
Stage 3 — Validation Layer (NEW — REQUIRED)
This layer must be introduced.
Responsibilities:
- schema validation (per table)
- required column enforcement
- data type validation
- controlled vocabularies (e.g. ESRS dimensions)
- referential integrity (future)
Example:
compute_method_registry.xlsx
→ validate against compute_method_registry.schema.json
Failure behavior:
- block pipeline OR
- flag violations (ZARA integration)
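The first four responsibilities can be sketched with a hand-rolled rule table. A real implementation would more likely validate against the per-table JSON schema; the types `ColumnRule`, `TableRules`, and `Violation` and the function `validateRows` below are simplifying assumptions, and row numbers are 1-based data rows.

```typescript
type CellValue = string | number | boolean | null;
type SheetRow = Record<string, CellValue>;

interface ColumnRule {
  type: "string" | "number" | "boolean";
  required: boolean;
  allowed?: string[]; // controlled vocabulary, if any
}

type TableRules = Record<string, ColumnRule>;

interface Violation {
  row: number;
  column: string;
  message: string;
}

// Checks every row against every column rule and collects all violations.
function validateRows(rows: SheetRow[], rules: TableRules): Violation[] {
  const violations: Violation[] = [];
  rows.forEach(function (row, index) {
    for (const column of Object.keys(rules)) {
      const rule = rules[column];
      const value = row[column];
      if (value === undefined || value === null || value === "") {
        if (rule.required) {
          violations.push({ row: index + 1, column: column, message: "required value is missing" });
        }
        continue;
      }
      if (typeof value !== rule.type) {
        violations.push({ row: index + 1, column: column, message: "expected " + rule.type });
        continue;
      }
      if (rule.allowed !== undefined && typeof value === "string" && rule.allowed.indexOf(value) === -1) {
        violations.push({ row: index + 1, column: column, message: "value not in controlled vocabulary" });
      }
    }
  });
  return violations;
}
```

Either failure behavior can be built on the returned list: block the pipeline when it is non-empty, or forward the entries as flagged violations.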
Stage 4 — Normalization
Transform Excel rows into canonical structures:
{
"id": "compute_method_001",
"name": "Scope 1 combustion",
"unit": "kgCO2e",
"methodology": "IPCC",
"dependencies": ["fuel_type", "quantity"]
}
Normalization includes:
- trimming
- typing
- mapping column names → canonical fields
- resolving enums
- generating IDs if missing
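The normalization steps can be illustrated for the compute method example. The header names in `HEADER_TO_FIELD`, the comma-separated dependency cell, and the zero-padded ID format are assumptions made for this sketch; real workbooks may differ, and enum resolution is left out.

```typescript
type RawRow = Record<string, string | number | null>;

interface ComputeMethod {
  id: string;
  name: string;
  unit: string;
  methodology: string;
  dependencies: string[];
}

// Hypothetical mapping from workbook headers to canonical fields.
const HEADER_TO_FIELD: Record<string, keyof ComputeMethod> = {
  ID: "id",
  Name: "name",
  Unit: "unit",
  Methodology: "methodology",
  Dependencies: "dependencies",
};

// Left-pads a row number to at least three digits.
function padRowNumber(n: number): string {
  let text = String(n);
  while (text.length < 3) {
    text = "0" + text;
  }
  return text;
}

function normalizeRow(raw: RawRow, rowNumber: number): ComputeMethod {
  // Trim headers and values, keeping only mapped columns.
  const cells: Record<string, string> = {};
  for (const header of Object.keys(raw)) {
    const key = header.trim();
    if (!Object.prototype.hasOwnProperty.call(HEADER_TO_FIELD, key)) {
      continue;
    }
    const value = raw[header];
    cells[HEADER_TO_FIELD[key]] = value === null ? "" : String(value).trim();
  }
  // Type the comma-separated dependency cell as a string array.
  const dependencies = (cells["dependencies"] || "")
    .split(",")
    .map(function (item) { return item.trim(); })
    .filter(function (item) { return item.length > 0; });
  // Generate an ID when the sheet does not provide one.
  return {
    id: cells["id"] || "compute_method_" + padRowNumber(rowNumber),
    name: cells["name"] || "",
    unit: cells["unit"] || "",
    methodology: cells["methodology"] || "",
    dependencies: dependencies,
  };
}
```

Unmapped columns are dropped silently here; a stricter pipeline might report them as violations instead.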
Stage 5 — Registry Generation
Output structured registry artifacts:
registry/<table>.json
Example:
registry/compute_method_registry.json
registry/altd_event.json
Location (prod):
https://assets.zayaz.io/registry/<table>.json
Stage 6 — Indexing (Optional / Current)
The existing search pipeline:
zayaz-search-indexer
Can consume:
- Excel directly (current)
- OR preferably normalized registry JSON (future)
👉 Recommended evolution:
Index registry JSON, not Excel
Stage 7 — Runtime Consumption
Used by:
- ZARA validator engine
- Computation Hub
- Reporting Hub
- ESG analytics
- APIs
15.5. Data Model Contracts
Each Excel table must map to:
| Layer | Contract |
|---|---|
| Excel | Human-readable table |
| Schema | JSON schema definition |
| Registry | Normalized JSON |
| Runtime | API / DB model |
15.6. Schema Governance
Each table should have a schema:
schemas/<table>.schema.json
Example:
schemas/compute_method_registry.schema.json
Schema defines:
- required fields
- types
- enums
- relationships
15.7. Versioning Strategy
Excel (authoring)
- version optional
- managed by editor workflow
Registry (production)
Two strategies:
A — Latest only
registry/compute_method_registry.json
B — Versioned
registry/compute_method_registry.v1.0.json
Recommended:
- start with latest
- introduce versioning when needed for audit
15.8. Audit & Traceability (CRITICAL for ESG)
Each registry entry should support:
{
"source": {
"file": "compute_method_registry.xlsx",
"sheet": "Sheet1",
"row": 42
},
"ingested_at": "2026-03-21T12:00:00Z",
"version": "v1"
}
This enables:
- audit trails (CSRD requirement)
- verification workflows
- data lineage tracking
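Attaching this provenance block during registry generation could look like the sketch below. The field names follow the JSON shape shown above; the helper `withProvenance` and its signature are assumptions, and the timestamp is passed in so that output is reproducible.

```typescript
interface SourceRef {
  file: string;
  sheet: string;
  row: number;
}

interface Provenance {
  source: SourceRef;
  ingested_at: string; // ISO 8601 UTC timestamp without milliseconds
  version: string;
}

// Hypothetical helper: returns a copy of the entry with provenance fields added.
// The input entry is not modified.
function withProvenance<T extends object>(
  entry: T,
  source: SourceRef,
  version: string,
  ingestedAt: Date
): T & Provenance {
  // toISOString() includes milliseconds; strip them to match the shape above.
  const timestamp = ingestedAt.toISOString().replace(/\.\d{3}Z$/, "Z");
  return { ...entry, source: source, ingested_at: timestamp, version: version };
}
```

Recording file, sheet, and row per entry is what lets a reviewer trace any registry value back to the exact authored cell range.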
15.9. Integration with ZARA
ZARA should validate:
- schema compliance
- missing required fields
- inconsistent units
- invalid references
Pipeline integration:
Excel → Validation → ZARA violations → Registry
Violations are stored in:
docusaurus/static/system/violations.json
15.10. Integration with Computation Hub
Registries become:
- input datasets
- dependency graphs
- compute method definitions
Example:
compute_method_registry → Computation Hub execution models
15.11. Deployment Model
DEV
assets.zayaz.dev/excel/*.xlsx
PROD
assets.zayaz.io/registry/*.json
15.12. Anti-Patterns
Avoid:
- using Excel directly in runtime
- skipping validation
- storing large Excel files in GitHub long-term
- mixing dev and prod asset domains
- bypassing registry layer
15.13. Future Enhancements
Planned evolution:
- SSSR-backed registry identifiers
- Graph-based table relationships
- real-time ingestion pipelines
- validator networks
- signed registry snapshots (audit-grade)
- carbon accounting traceability layers
16. Final Rule
Public asset URLs must be stable.
Storage paths may change.
Documentation must link to the stable contract, not the storage implementation.
Excel is input
Registry is truth
APIs are delivery
This pipeline is the backbone of ZAYAZ’s decision-grade ESG infrastructure.