SPT
Split / Allocation Calculator Micro Engine
1. Identity
Depends on module:
-
Purpose: Splits (allocates) a scalar or time-series total into multiple buckets using either:
- explicit weights, or
- mapping keys referencing a registered allocation map.
Typical usage:
- Energy split by carrier (electricity, gas, diesel…)
- Water split by source (surface, groundwater…)
- Waste split by hazard class/type (if using weight-based allocation)
Versioning policy
- MEID is stable.
- Executable builds and versions are governed via ZAR (cmi + semantic version + execution_ref + build_hash).
2. Contract References (ZAR)
2.1. Input Schema
- ZAR Address: schema.compute.<domain>.split.inputs.v1_0_0
Required conceptual fields:
- total: scalar or time-series
- buckets: array of bucket definitions (name + key)
- allocation_mode: WEIGHTS | MAP_REF
- weights OR map_ref, depending on mode
- alignment: BY_YEAR | BY_INDEX (default BY_YEAR)
2.2. Options Schema
- ZAR Address: schema.compute.<domain>.split.options.v1_0_0
Common options:
- weights_policy: MUST_SUM_1 | NORMALIZE | ALLOW_UNDER | ALLOW_OVER
- missing_bucket_policy: ERROR | DROP | OTHER_BUCKET
- other_bucket_name: default "other"
- rounding: optional digits
- preserve_total_in_metadata: boolean (default true)
2.3. Output Schema
- ZAR Address: schema.compute.<domain>.split.output.v1_0_0
Outputs include:
- bucket_series: map/dict of bucket → values (or list of bucket entries)
- metadata: sum checks, normalization applied, missing/other bucket behavior, map version used
3. Accepted Input Shapes
A. Scalar total with explicit weights
{
"total": 1000,
"allocation_mode": "WEIGHTS",
"buckets": [
{"key": "electricity", "name": "Electricity"},
{"key": "gas", "name": "Natural gas"}
],
"weights": {
"electricity": 0.6,
"gas": 0.4
}
}
B. Time-series total with weights
{
"total": [[2025, 5000], [2026, 5200]],
"allocation_mode": "WEIGHTS",
"buckets": [{"key": "scope3_cat6"}, {"key": "scope3_cat7"}],
"weights": {"scope3_cat6": 0.3, "scope3_cat7": 0.7},
"alignment": "BY_YEAR"
}
C. Map reference mode (preferred when allocation is defined centrally)
{
"total": [[2025, 5000], [2026, 5200]],
"allocation_mode": "MAP_REF",
"buckets": [{"key": "surface"}, {"key": "groundwater"}],
"map_ref": "ALLOC.WATER.SOURCE.DEFAULT.v1",
"alignment": "BY_YEAR"
}
All supported shapes MUST be normalized internally to a canonical form: Map<period, total_value> + Map<bucket_key, weight>.
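The normalization to the canonical form can be sketched as follows. This is a minimal illustration, not the engine's implementation: the function names, the `_scalar` placeholder period for scalar totals, and the choice to reject duplicate periods are all assumptions made for the example.

```python
from typing import Dict, Hashable, Sequence, Union

Number = Union[int, float]

# Hypothetical placeholder period used when the total is a single scalar.
SCALAR_PERIOD = "_scalar"


def normalize_total(
    total: Union[Number, Sequence[Sequence[Number]]]
) -> Dict[Hashable, float]:
    """Return Map<period, total_value> for a scalar or [[period, value], ...]."""
    if total is None or isinstance(total, bool):
        raise ValueError("SPLIT_INPUT_SHAPE_INVALID: total must be a number or a series")
    if isinstance(total, (int, float)):
        return {SCALAR_PERIOD: float(total)}
    if not isinstance(total, (list, tuple)):
        raise ValueError("SPLIT_INPUT_SHAPE_INVALID: unsupported total shape")
    canonical: Dict[Hashable, float] = {}
    for entry in total:
        if not isinstance(entry, (list, tuple)) or len(entry) != 2:
            raise ValueError("SPLIT_INPUT_SHAPE_INVALID: expected [period, value] pairs")
        period, value = entry
        if period in canonical:  # assumption: duplicate periods are rejected
            raise ValueError(f"SPLIT_INPUT_SHAPE_INVALID: duplicate period {period!r}")
        canonical[period] = float(value)
    return canonical


def normalize_weights(weights: Dict[str, Number]) -> Dict[str, float]:
    """Return Map<bucket_key, weight> with float weights."""
    return {str(key): float(weight) for key, weight in weights.items()}
```

With this sketch, shape A yields a single-entry period map, while shapes B and C yield one entry per year.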
4. Compute Semantics (Normative)
For each aligned period $t$ and each bucket $b$:
4.1. Determine weights
If allocation_mode = WEIGHTS:
- use the weights entry for each bucket
If allocation_mode = MAP_REF:
- resolve map_ref to a registered allocation map (dataset/registry lookup)
- extract weights for the requested buckets
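The two ways of determining weights might look like the sketch below. The in-memory `EXAMPLE_REGISTRY`, the map name `ALLOC.EXAMPLE.v1`, and its weights are invented for illustration; a real resolver would perform a registry/dataset lookup. Raising when a requested bucket has no weight is also an assumption of this sketch.

```python
from typing import Dict, Mapping, Optional, Sequence

# Hypothetical stand-in for the allocation map registry; values are illustrative only.
EXAMPLE_REGISTRY: Dict[str, Dict[str, float]] = {
    "ALLOC.EXAMPLE.v1": {"surface": 0.75, "groundwater": 0.25},
}


def determine_weights(
    bucket_keys: Sequence[str],
    allocation_mode: str,
    weights: Optional[Mapping[str, float]] = None,
    map_ref: Optional[str] = None,
    registry: Mapping[str, Mapping[str, float]] = EXAMPLE_REGISTRY,
) -> Dict[str, float]:
    """Return the weight of each requested bucket for the given mode."""
    if allocation_mode == "WEIGHTS":
        source: Mapping[str, float] = weights or {}
    elif allocation_mode == "MAP_REF":
        if map_ref not in registry:
            raise KeyError(f"SPLIT_MAP_REF_NOT_FOUND: {map_ref}")
        source = registry[map_ref]
    else:
        raise ValueError(f"SPLIT_INPUT_SHAPE_INVALID: allocation_mode={allocation_mode!r}")

    # Extract weights only for the requested buckets.
    missing = [key for key in bucket_keys if key not in source]
    if missing:
        raise KeyError(f"SPLIT_WEIGHTS_MISSING_FOR_BUCKET: {missing}")
    return {key: float(source[key]) for key in bucket_keys}
```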
4.2. Apply weights policy
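Let $S = \sum_b w_b$ over requested buckets, where $w_b$ is the weight of bucket $b$.
- MUST_SUM_1: require $|S - 1| \le \varepsilon$ (within tolerance $\varepsilon$) → else error
- NORMALIZE: replace each $w_b$ with $w_b / S$ (if $S > 0$)
- ALLOW_UNDER: allow $S < 1$ (unallocated remainder handled by missing policy)
- ALLOW_OVER: allow $S > 1$ (total allocated exceeds total; must be flagged)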
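A minimal sketch of the four weights policies follows. The default tolerance of `1e-9` is an assumption (the specification does not fix a value here), as is the reading that ALLOW_UNDER rejects sums above 1 and ALLOW_OVER rejects sums below 1.

```python
import math
from typing import Dict, Mapping, Tuple


def apply_weights_policy(
    weights: Mapping[str, float],
    policy: str = "MUST_SUM_1",
    tolerance: float = 1e-9,  # assumed default, not taken from the specification
) -> Tuple[Dict[str, float], float, bool]:
    """Return (weights_after_policy, original_sum, normalized_flag)."""
    weights_sum = math.fsum(weights.values())

    if policy == "MUST_SUM_1":
        if abs(weights_sum - 1.0) > tolerance:
            raise ValueError(f"SPLIT_WEIGHTS_SUM_INVALID: sum={weights_sum}")
        return dict(weights), weights_sum, False

    if policy == "NORMALIZE":
        if weights_sum <= 0:
            raise ValueError(f"SPLIT_WEIGHTS_SUM_INVALID: cannot normalize sum={weights_sum}")
        scaled = {key: w / weights_sum for key, w in weights.items()}
        return scaled, weights_sum, True

    if policy == "ALLOW_UNDER":
        if weights_sum > 1.0 + tolerance:  # assumption: over-allocation is rejected
            raise ValueError(f"SPLIT_WEIGHTS_SUM_INVALID: sum={weights_sum} exceeds 1")
        return dict(weights), weights_sum, False

    if policy == "ALLOW_OVER":
        if weights_sum < 1.0 - tolerance:  # assumption: under-allocation is rejected
            raise ValueError(f"SPLIT_WEIGHTS_SUM_INVALID: sum={weights_sum} below 1")
        return dict(weights), weights_sum, False

    raise ValueError(f"unknown weights_policy: {policy!r}")
```

Returning the original sum alongside the adjusted weights lets the caller populate weights_sum and normalized in the output metadata without recomputing them.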
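4.3. Allocate
For each period $t$ and bucket $b$, with $w_b$ the weight after the weights policy is applied:
$$\text{value}_{b,t} = \text{total}_t \cdot w_b$$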
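The allocation step multiplies each period total by each bucket weight. A minimal sketch, assuming totals and weights are already in canonical form and treating the optional rounding option as a number of decimal digits:

```python
from typing import Dict, Hashable, Mapping, Optional


def allocate(
    totals: Mapping[Hashable, float],
    weights: Mapping[str, float],
    rounding: Optional[int] = None,
) -> Dict[str, Dict[Hashable, float]]:
    """Return bucket_series as {bucket_key: {period: allocated_value}}."""
    bucket_series: Dict[str, Dict[Hashable, float]] = {key: {} for key in weights}
    for period, total in totals.items():
        for key, weight in weights.items():
            value = total * weight
            if rounding is not None:
                value = round(value, rounding)
            bucket_series[key][period] = value
    return bucket_series
```

For the time-series example in shape B, this allocates 30% of each yearly total to scope3_cat6 and 70% to scope3_cat7, so the 2025 total of 5000 splits into roughly 1500 and 3500 (up to floating-point rounding).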
4.4. Remainder handling (if $S \neq 1$)
If $S < 1$ and policy allows under-allocation:
- apply missing_bucket_policy:
  - ERROR: fail
  - DROP: ignore remainder but record in metadata
  - OTHER_BUCKET: assign remainder to other_bucket_name
If $S > 1$ and policy allows over-allocation:
- record overflow ratio in metadata (and optionally error if configured)
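Remainder handling could be sketched as below. Several details are assumptions of this example rather than part of the specification: the overflow ratio is taken to be the weight sum minus 1, the metadata key names `dropped_remainder` and `overflow_ratio` are hypothetical, and a collision between other_bucket_name and an existing bucket is treated as an error.

```python
from typing import Any, Dict, Hashable, Mapping, Tuple

BucketSeries = Dict[str, Dict[Hashable, float]]


def handle_remainder(
    bucket_series: Mapping[str, Mapping[Hashable, float]],
    totals: Mapping[Hashable, float],
    weights_sum: float,
    missing_bucket_policy: str = "ERROR",
    other_bucket_name: str = "other",
    tolerance: float = 1e-9,  # assumed tolerance
) -> Tuple[BucketSeries, Dict[str, Any]]:
    """Apply the remainder rules and return (bucket_series, metadata)."""
    result: BucketSeries = {key: dict(values) for key, values in bucket_series.items()}
    metadata: Dict[str, Any] = {"weights_sum": weights_sum, "remainder_handling": "NONE"}

    if weights_sum < 1.0 - tolerance:
        # Unallocated share of each period total.
        remainder = {p: total * (1.0 - weights_sum) for p, total in totals.items()}
        if missing_bucket_policy == "ERROR":
            raise ValueError("SPLIT_REMAINDER_POLICY_ERROR: unallocated remainder")
        if missing_bucket_policy == "DROP":
            metadata["remainder_handling"] = "DROP"
            metadata["dropped_remainder"] = remainder  # hypothetical key name
        elif missing_bucket_policy == "OTHER_BUCKET":
            if other_bucket_name in result:
                raise ValueError(f"SPLIT_REMAINDER_POLICY_ERROR: {other_bucket_name!r} already exists")
            result[other_bucket_name] = remainder
            metadata["remainder_handling"] = "OTHER_BUCKET"
        else:
            raise ValueError(f"unknown missing_bucket_policy: {missing_bucket_policy!r}")
    elif weights_sum > 1.0 + tolerance:
        metadata["remainder_handling"] = "OVERFLOW_FLAGGED"
        metadata["overflow_ratio"] = weights_sum - 1.0  # assumed definition

    return result, metadata
```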
5. Alignment Rules (Normative)
Same alignment semantics as DELTA/SHARE:
alignment = BY_YEAR
- total series is keyed by year (or period)
- if allocation weights are time-constant, they apply to all periods
- if allocation map provides time-varying weights, they MUST be aligned by year
alignment = BY_INDEX
- total and weight series are matched in order, by position
- series length rules apply per schema/options
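The alignment rules above can be illustrated with a small helper that produces one weight map per period. The input shapes for time-varying weights (a dict keyed by year for BY_YEAR, a list for BY_INDEX) and the strict length check are assumptions of this sketch; the actual rules are defined by the schema and options.

```python
from typing import Dict, Hashable, Mapping, Sequence, Union

Weights = Mapping[str, float]


def align_weights(
    periods: Sequence[Hashable],
    weights: Union[Weights, Mapping[Hashable, Weights], Sequence[Weights]],
    alignment: str = "BY_YEAR",
) -> Dict[Hashable, Dict[str, float]]:
    """Return {period: {bucket_key: weight}} for every period of the total."""
    # Time-constant weights: a flat {bucket: weight} map applies to all periods.
    if isinstance(weights, Mapping) and all(
        not isinstance(value, Mapping) for value in weights.values()
    ):
        return {period: dict(weights) for period in periods}

    if alignment == "BY_YEAR":
        if not isinstance(weights, Mapping):
            raise ValueError("SPLIT_INPUT_SHAPE_INVALID: BY_YEAR expects weights keyed by period")
        missing = [period for period in periods if period not in weights]
        if missing:
            raise KeyError(f"SPLIT_WEIGHTS_MISSING_FOR_BUCKET: no weights for periods {missing}")
        return {period: dict(weights[period]) for period in periods}

    if alignment == "BY_INDEX":
        if isinstance(weights, Mapping) or len(weights) != len(periods):
            raise ValueError("SPLIT_INPUT_SHAPE_INVALID: BY_INDEX expects one weight map per period")
        return {period: dict(w) for period, w in zip(periods, weights)}

    raise ValueError(f"unknown alignment: {alignment!r}")
```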
Output metadata MUST include:
- weights_policy_applied
- weights_sum
- normalized (boolean)
- remainder_handling
- map_ref_resolved + map version/build hash (if MAP_REF)
6. Validation & Error Model
Invariants
- weights must be finite numbers
- negative weights are invalid unless explicitly allowed (default: invalid)
- if MUST_SUM_1, enforce a tolerance on the weight sum
- total must be finite
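The finiteness and sign invariants could be checked up front, before any allocation. In this sketch the function name and the `allow_negative_weights` flag are hypothetical, and reusing SPLIT_INPUT_SHAPE_INVALID for invalid values is an assumption, since the suggested error codes do not name a dedicated code for this case.

```python
import math
from typing import Hashable, Mapping


def validate_invariants(
    totals: Mapping[Hashable, float],
    weights: Mapping[str, float],
    allow_negative_weights: bool = False,  # hypothetical option; default matches "invalid"
) -> None:
    """Raise ValueError if a total or weight violates the invariants."""
    for period, value in totals.items():
        if isinstance(value, bool) or not isinstance(value, (int, float)) or not math.isfinite(value):
            raise ValueError(f"SPLIT_INPUT_SHAPE_INVALID: non-finite total at period {period!r}")
    for key, weight in weights.items():
        if isinstance(weight, bool) or not isinstance(weight, (int, float)) or not math.isfinite(weight):
            raise ValueError(f"SPLIT_INPUT_SHAPE_INVALID: non-finite weight for bucket {key!r}")
        if weight < 0 and not allow_negative_weights:
            raise ValueError(f"SPLIT_INPUT_SHAPE_INVALID: negative weight for bucket {key!r}")
```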
Error codes (suggested)
- SPLIT_WEIGHTS_MISSING_FOR_BUCKET
- SPLIT_WEIGHTS_SUM_INVALID
- SPLIT_MAP_REF_NOT_FOUND
- SPLIT_REMAINDER_POLICY_ERROR
- SPLIT_INPUT_SHAPE_INVALID
Errors MUST include:
- engine cmi_short_code
- bucket key (where applicable)
- period/key causing failure (where applicable)
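One way to carry the required error fields is a small exception type, sketched below. The class name `SplitError`, its `to_dict` method, and the placeholder short code `EXAMPLE_CMI` used in the usage note are hypothetical; only the field list comes from the requirements above.

```python
from typing import Any, Dict, Hashable, Optional


class SplitError(Exception):
    """Error carrying the engine short code plus optional bucket and period context."""

    def __init__(
        self,
        code: str,
        cmi_short_code: str,
        bucket_key: Optional[str] = None,
        period: Optional[Hashable] = None,
    ) -> None:
        self.code = code
        self.cmi_short_code = cmi_short_code
        self.bucket_key = bucket_key
        self.period = period
        super().__init__(
            f"{code} (engine={cmi_short_code}, bucket={bucket_key}, period={period})"
        )

    def to_dict(self) -> Dict[str, Any]:
        """Serialize the error, omitting context fields that do not apply."""
        payload: Dict[str, Any] = {"code": self.code, "cmi_short_code": self.cmi_short_code}
        if self.bucket_key is not None:
            payload["bucket_key"] = self.bucket_key
        if self.period is not None:
            payload["period"] = self.period
        return payload
```

For example, `SplitError("SPLIT_WEIGHTS_MISSING_FOR_BUCKET", "EXAMPLE_CMI", bucket_key="gas")` serializes without a period field, since none applies.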
7. Dependencies
MEID_CALC_SPLIT depends on:
- time-series normalization utilities
- schema resolver (ZAR)
- allocation map resolver (registry/dataset), if MAP_REF is used
- (optional) unit propagation utilities
Declared in ZAR via dependencies (CMIs).
8. Federation & Audit Requirements
To reproduce a split externally, the bundle MUST include:
- Engine identity (cmi or zar_code)
- Engine build proof (execution_ref + build_hash)
- Schemas used (input/options/output) referenced by zar_code/cmi
- If MAP_REF used:
  - the exact allocation map artifact (or dataset snapshot) + its hash/version
  - the map_ref resolution metadata captured in output
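To make the "artifact + its hash" requirement concrete, the sketch below fingerprints an allocation map. The specification does not state which hash or serialization is used; SHA-256 over key-sorted compact JSON is an assumption chosen here so that the digest does not depend on key order.

```python
import hashlib
import json
from typing import Mapping


def allocation_map_hash(allocation_map: Mapping[str, float]) -> str:
    """Return a hex SHA-256 digest of the map's canonical JSON form (assumed scheme)."""
    canonical = json.dumps(
        dict(allocation_map), sort_keys=True, separators=(",", ":")
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()
```

An external party holding the same map artifact can recompute the digest and compare it with the one recorded in the bundle.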
Provenance MUST show:
… → MEID_CALC_SPLIT → …
with the engine cmi_short_code included in USO tail arrays.
9. Performance Notes
- Complexity: $O(T \cdot B)$ where $T$ = periods, $B$ = buckets
- Memory: $O(T \cdot B)$ for full output; streaming mode can emit bucket values per period
- Should remain efficient for typical bucket counts (2–50)
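The streaming mode mentioned above can be sketched with a generator that holds only one period's bucket values at a time. The function name `stream_split` and the (period, total) pair input are assumptions of this example.

```python
from typing import Dict, Hashable, Iterable, Iterator, Mapping, Tuple


def stream_split(
    totals: Iterable[Tuple[Hashable, float]],
    weights: Mapping[str, float],
) -> Iterator[Tuple[Hashable, Dict[str, float]]]:
    """Yield (period, {bucket_key: value}) one period at a time."""
    for period, total in totals:
        # Only the current period's bucket values are materialized.
        yield period, {key: total * weight for key, weight in weights.items()}
```

Because each period is emitted as soon as it is computed, memory use in this sketch grows with the number of buckets rather than with periods times buckets.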
10. Methods Served (v1)
- Energy.split_by_carrier
- Water.split_by_source
- Water.split_by_stress_area (when implemented as allocation buckets)
- GHG.scope3.cat.abs (when allocating totals into categories via map/weights)
- Waste splits (if enabled by method registry)