Standard Naming Policy
This is a data architecture naming guide tailored to ZAYAZ. Consistent naming determines:
- How understandable the system is to new engineers
- How scalable it becomes
- How natural it is to map signals → engines → modules → data products
- How easy it is for AI Agents to reason about the data graph
⸻
Industry-Standard Families of Table Prefixes
1. dim_ — Dimension Tables
Static or slow-changing entities.
Examples:
- dim_countries
- dim_units
- dim_sectors
- dim_products
⸻
2. fact_ — Fact Tables
Event or transaction tables with measures.
Examples:
- fact_emissions
- fact_energy_usage
- fact_risk_events
⸻
3. ref_ — Reference Tables
Small enumerations, taxonomies, lookup lists.
Examples:
- ref_signal_types
- ref_nace_codes
- ref_emission_factors
- ref_units
Ref tables are similar to dims but usually have no SCD (slowly changing dimension) logic and are often manually curated.
⸻
4. stg_ — Staging Tables
Raw ingested data, minimally cleaned.
Examples:
- stg_efdb_raw
- stg_country_codes_raw
Well suited to raw imports from Excel or APIs before any transformation is applied.
⸻
5. int_ — Intermediate Tables
Transformation layers between staging and fact/dim.
Examples:
- int_emissions_enriched
- int_products_normalized
Good fit for ZAYAZ engines that merge multiple sources.
⸻
6. agg_ — Aggregated Tables
Pre-aggregated or summary data for performance.
Examples:
- agg_emissions_by_country
- agg_risk_by_nace
Perfect for FIRM (Financial Impact & Risk Modeling).
⸻
7. mrt_ — Mart Tables (Domain-specific)
Subject-area optimized tables used by downstream apps.
Examples:
- mrt_ghg_reporting
- mrt_sustainability_scores
- mrt_circularity_kpis
⸻
8. tmp_ or _tmp — Temporary Tables
Used by pipelines, removed after processing.
⸻
9. rl_ — Relation / Bridge Tables
Used to connect many-to-many relationships.
Examples:
- rl_product_nace
- rl_signal_source
Useful for signal routing across engines.
⸻
10. gold_, silver_, bronze_ (Delta/Lakehouse convention)
A popular alternative naming system:
- bronze → raw ingestion (like stg_)
- silver → cleaned, normalized (like int_)
- gold → analytics-ready (like fact_, dim_, agg_)
This layering is widely adopted in lakehouse architectures and is an optional alternative to the stage-based prefixes above.
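The layer-to-prefix correspondence listed above can be written down as a small lookup. This is a minimal sketch: the names `LAYER_TO_PREFIXES` and `layer_for` are hypothetical and not part of any ZAYAZ codebase, and prefixes the text does not map to a layer (such as mrt_) are deliberately left unmapped.

```python
from typing import Optional

# Mapping taken from the bronze/silver/gold list in this section.
# Prefixes the guide does not assign to a layer are intentionally absent.
LAYER_TO_PREFIXES = {
    "bronze": ("stg_",),
    "silver": ("int_",),
    "gold": ("fact_", "dim_", "agg_"),
}


def layer_for(table_name: str) -> Optional[str]:
    """Return the lakehouse layer a prefixed table name corresponds to, if any."""
    for layer, prefixes in LAYER_TO_PREFIXES.items():
        # str.startswith accepts a tuple of candidate prefixes.
        if table_name.startswith(prefixes):
            return layer
    return None


if __name__ == "__main__":
    for name in ("stg_efdb_raw", "int_emissions_enriched", "agg_risk_by_nace"):
        print(name, "->", layer_for(name))
```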
⸻
11. eng_ — Engine Output Tables
Used when an engine emits structured tabular outputs.
Examples:
- eng_pef_io
- eng_zhif_hazards
⸻
12. mod_ — Module Output Tables
Used when a module (FIRM, COSE, RIF, etc.) produces transformed datasets.
Examples:
- mod_firm_scenario_risk
- mod_cose_violations
⸻
13. sig_ — Signal Definitions
Signals registry dataset.
Examples:
- sig_registry
- sig_emissions_v1
⸻
Summary Table of All Recommended Prefixes
| Prefix | Meaning | ZAYAZ Usage |
|---|---|---|
| dim_ | Dimensions | Countries, Units, Sectors |
| fact_ | Facts (events) | Emissions, indicators |
| ref_ | Reference data | EFDB, NACE |
| stg_ | Staging | Raw Excel/API imports |
| int_ | Intermediate | Engine merge outputs |
| agg_ | Aggregates | KPI rollups |
| mrt_ | Data marts | Domain-tailored outputs |
| tmp_ | Temporary | Pipeline intermediates |
| rl_ | Relations | Many-to-many links |
| eng_ | Engine outputs | PEF, ZHIF results |
| mod_ | Module outputs | FIRM, COSE |
| sig_ | Signals registry | Signal definitions |
| bronze_ / silver_ / gold_ | Lakehouse layers | Optional modern style |
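
A prefix policy like this is easiest to keep consistent when it is checked automatically. The sketch below is one possible check, not an existing ZAYAZ tool: the function name `classify_table` is hypothetical, and the lowercase snake_case rule is an assumption inferred from the examples in this guide. It handles only the leading-prefix form, so the `_tmp` suffix variant is not recognised.

```python
import re
from typing import Optional

# Prefixes and meanings copied from the summary table above.
TABLE_PREFIXES = {
    "dim": "Dimensions",
    "fact": "Facts (events)",
    "ref": "Reference data",
    "stg": "Staging",
    "int": "Intermediate",
    "agg": "Aggregates",
    "mrt": "Data marts",
    "tmp": "Temporary",
    "rl": "Relations",
    "eng": "Engine outputs",
    "mod": "Module outputs",
    "sig": "Signals registry",
    "bronze": "Lakehouse layers",
    "silver": "Lakehouse layers",
    "gold": "Lakehouse layers",
}

# Assumption: names are lowercase snake_case, i.e. <prefix>_<part>[_<part>...].
_NAME_RE = re.compile(r"^([a-z]+)_([a-z0-9]+(?:_[a-z0-9]+)*)$")


def classify_table(name: str) -> Optional[str]:
    """Return the meaning of the table's prefix, or None if the name is non-conforming."""
    match = _NAME_RE.match(name)
    if match is None:
        return None
    return TABLE_PREFIXES.get(match.group(1))


if __name__ == "__main__":
    for name in ("dim_countries", "sig_emissions_v1", "emissions"):
        print(name, "->", classify_table(name))
```

A check of this kind could run in CI against the list of table names in a schema, flagging any name for which it returns `None`.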
⸻
Naming structure for signals ("family")
- "family": "audit" → audit_id, timestamp, created_by
- "family": "id" → iso_cc_id, nace_id, product_id
- "family": "code" → iso_cc, nace_code, unit_code
- "family": "metric" → co2e_kg, energy_kwh, risk_score
- "family": "classification" → risk_level, material_class
- "family": "geo" → country, region, lat, lon
- "family": "time" → period_start, period_end, year
- "family": "flag" → is_primary, is_estimated
- "family": "text" → description, comment
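
To illustrate how the "family" attribute might be used, the sketch below tags a few of the example names above with their family and groups them. The entry shape (a dict with `name` and `family` keys) and the function `group_by_family` are assumptions made for illustration; only the family values and the example names come from this guide.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# The nine families listed above.
SIGNAL_FAMILIES = {
    "audit", "id", "code", "metric", "classification",
    "geo", "time", "flag", "text",
}

# Hypothetical registry entries; names are taken from the examples above.
signals = [
    {"name": "co2e_kg", "family": "metric"},
    {"name": "energy_kwh", "family": "metric"},
    {"name": "nace_code", "family": "code"},
    {"name": "is_estimated", "family": "flag"},
    {"name": "period_start", "family": "time"},
]


def group_by_family(entries: Iterable[Dict[str, str]]) -> Dict[str, List[str]]:
    """Group signal names by family, rejecting families outside the vocabulary."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for entry in entries:
        family = entry["family"]
        if family not in SIGNAL_FAMILIES:
            raise ValueError(f"unknown family: {family!r}")
        grouped[family].append(entry["name"])
    return dict(grouped)


if __name__ == "__main__":
    print(group_by_family(signals))
```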
The status vocabulary below is a compact set tailored for ZAYAZ’s signal registry, engine registry, and module registry.
It is intentionally minimal and unambiguous, and is meant to support lifecycle governance.
⸻
ZAYAZ Standard Status Vocabulary
1. draft
The signal/module/engine is being designed. Structure may change.
Use when:
- Something is newly introduced
- Not yet ready for consumption
- Needs internal validation
⸻
2. experimental
The definition is functional but not yet stable.
Use when:
- Used in prototypes or edge engines
- Changes are expected
- You want developers to test it, but not rely on it for production
⸻
3. stable
Recommended and actively supported.
Use when:
- Fully documented
- Schema validated
- Used across multiple engines/modules
- Backwards-compatibility guarantees apply
This will be the most common value.
⸻
4. legacy
Still present, but scheduled for phase-out.
Use when:
- Existing systems depend on it
- Can’t remove yet
- Should not be used for new development
⸻
5. deprecated
No longer recommended. Will be removed.
Use when:
- A hard phase-out is planned
- A replacement exists
- A migration path is required
⸻
6. retired
Removed from active use and will not return.
Use when:
- Fully removed from code/engines
- Kept in registry only for historical audit purposes
⸻
Default Vocabulary set for ZAYAZ (compact list)
| Status | Meaning |
|---|---|
| draft | Initial definition; still in progress. |
| experimental | Working but unstable; may change. |
| stable | Fully supported and recommended. |
| legacy | Old but still needed for compatibility. |
| deprecated | Avoid use; replacement exists; removal planned. |
| retired | Removed; only present for audit/history. |
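
The vocabulary above maps naturally onto an enumeration. The following is a minimal sketch, assuming registry entries store the status as one of the six lowercase strings; `Status`, `parse_status`, and `recommended_for_new_development` are hypothetical names, and the rule that only `stable` is recommended for new development is a reading of the descriptions above rather than a documented ZAYAZ policy.

```python
from enum import Enum


class Status(str, Enum):
    """The six lifecycle statuses, in the order they are listed above."""
    DRAFT = "draft"
    EXPERIMENTAL = "experimental"
    STABLE = "stable"
    LEGACY = "legacy"
    DEPRECATED = "deprecated"
    RETIRED = "retired"


def parse_status(value: str) -> Status:
    """Convert a raw registry string to a Status; raises ValueError if unknown."""
    return Status(value)


def recommended_for_new_development(status: Status) -> bool:
    """Assumed rule: only 'stable' entries are recommended for new work."""
    return status is Status.STABLE


if __name__ == "__main__":
    for raw in ("draft", "stable", "legacy"):
        status = parse_status(raw)
        print(raw, "->", recommended_for_new_development(status))
```

Rejecting unknown strings at parse time keeps misspelled or ad hoc statuses out of the registries.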
⸻