WST-CLASS
Waste Treatment Classification Micro Engine
1. Identity
Depends on module:
Purpose
Normalizes contractor/manifest treatment codes (often inconsistent across regions and vendors) into a canonical treatment taxonomy:
- recycled
- recovered
- disposed
- unknown
This engine makes MEID_CALC_WASTE_DIV audit-ready by ensuring:
- treatment definitions are explicit and versioned,
- vendor codes are mapped consistently,
- and each line item carries provenance (which mapping row/rule was used).
Typical usage
- Ingest waste contractor statements, manifests, tickets
- Enrich line items with canonical treatment categories
- Feed treatment-tagged flows into MEID_CALC_WASTE_DIV
2. Contract References (ZAR)
2.1 Input Schema
ZAR Address: schema.compute.waste.treatment_classify.inputs.v1_0_0
Required conceptual fields:
- items: list of waste treatment line items to classify
- mapping_ref: ZAR reference to treatment mapping dataset (defaults allowed)
- jurisdiction: optional (e.g., EU, NO, UK) for regional variations
- alignment: BY_YEAR | BY_INDEX (default BY_YEAR)
Each item conceptually includes:
- period
- value
- unit (e.g. kg, tonne)
- one or more treatment identifiers:
  - contractor_treatment_code (preferred)
  - manifest_treatment_code (e.g., R/D codes, EWC treatment, local codes)
  - treatment_text (free-text fallback)
- optional:
  - contractor_id
  - facility_id (treatment facility)
  - manifest_id / ticket_id
  - site_id
  - ewc_code / ewc_item (if available; helps routing rules)
  - hazard_class (if already classified upstream)
2.2 Options Schema
ZAR Address: schema.compute.waste.treatment_classify.options.v1_0_0
Common options:
- match_mode: CODE_ONLY | TEXT_ONLY | AUTO (default AUTO)
- normalize_codes: boolean (default true)
- use_eu_r_d_rules: boolean (default true); if an item has an EU-style R/D code, classify by deterministic rules
- unknown_code_policy: ERROR | FLAG | ASSIGN_UNKNOWN | ASSIGN_DISPOSED (default FLAG)
- regional_variations_policy: APPLY_IF_PRESENT | IGNORE (default APPLY_IF_PRESENT)
- unit_normalization: NO_CONVERT | CONVERT_TO_RECOMMENDED (default NO_CONVERT)
- rounding: optional digits
Unit conversion is delegated; this engine only requests conversion if enabled.
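The options and defaults listed above could be carried in a small immutable container. The following is a minimal sketch, not the engine's actual type: the class name `ClassifyOptions` and the validation behaviour are assumptions, and only the field names, allowed values, and defaults come from this section.

```python
from dataclasses import dataclass
from typing import Optional

# Allowed values per enum-like option, as listed in section 2.2.
_ALLOWED = {
    "match_mode": {"CODE_ONLY", "TEXT_ONLY", "AUTO"},
    "unknown_code_policy": {"ERROR", "FLAG", "ASSIGN_UNKNOWN", "ASSIGN_DISPOSED"},
    "regional_variations_policy": {"APPLY_IF_PRESENT", "IGNORE"},
    "unit_normalization": {"NO_CONVERT", "CONVERT_TO_RECOMMENDED"},
}


@dataclass(frozen=True)
class ClassifyOptions:
    """Hypothetical options holder; defaults mirror section 2.2."""

    match_mode: str = "AUTO"
    normalize_codes: bool = True
    use_eu_r_d_rules: bool = True
    unknown_code_policy: str = "FLAG"
    regional_variations_policy: str = "APPLY_IF_PRESENT"
    unit_normalization: str = "NO_CONVERT"
    rounding: Optional[int] = None

    def __post_init__(self) -> None:
        # Reject values outside the documented enumerations.
        for name, allowed in _ALLOWED.items():
            value = getattr(self, name)
            if value not in allowed:
                raise ValueError(f"{name}={value!r} not in {sorted(allowed)}")
        if self.rounding is not None and self.rounding < 0:
            raise ValueError("rounding must be a non-negative digit count")
```

Constructing `ClassifyOptions()` with no arguments yields the documented defaults; an unrecognized value such as `match_mode="FUZZY"` is rejected at construction time rather than during classification.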
2.3 Output Schema
ZAR Address: schema.compute.waste.treatment_classify.output.v1_0_0
Outputs include:
- items_classified: same items enriched with canonical treatment fields
- summary: totals and counts per treatment outcome
- metadata: mapping version/hash, unknown handling stats, jurisdiction used
Enriched fields per item:
- treatment_canonical: recycled | recovered | disposed | unknown
- treatment_family: DIVERTED | NOT_DIVERTED | UNKNOWN (derived)
- treatment_code_normalized
- recommended_unit (if provided by mapping)
- classification_confidence
- classification_provenance (mapping_ref + row id + rule path)
3. Canonical Treatment Definitions (Normative)
v1 canonical treatments:
- recycled: material recycling, reprocessing, composting (where treated as recycling by policy)
- recovered: energy recovery (e.g., incineration with energy recovery), other recovery operations
- disposed: landfill, incineration without recovery, permanent storage, deep well injection, etc.
- unknown: treatment not known or not classifiable under policies
Derived treatment family:
- DIVERTED = recycled or recovered
- NOT_DIVERTED = disposed
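The family derivation is a fixed lookup, sketched below. The function name `treatment_family` is hypothetical, and the `unknown` → `UNKNOWN` row is an assumption inferred from the `treatment_family` enumeration in the output schema (section 2.3); this section states only the two diverted/not-diverted rules.

```python
# Derived family per canonical treatment (section 3).
_FAMILY = {
    "recycled": "DIVERTED",
    "recovered": "DIVERTED",
    "disposed": "NOT_DIVERTED",
    # Assumption: not stated in section 3, inferred from the output enum.
    "unknown": "UNKNOWN",
}


def treatment_family(canonical: str) -> str:
    """Return the derived treatment family for a canonical treatment."""
    try:
        return _FAMILY[canonical]
    except KeyError:
        raise ValueError(f"not a canonical treatment: {canonical!r}") from None
```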
4. Mapping Dataset Contract (treatment_map)
The treatment mapping dataset (ZAR referenced) SHOULD include:
- provider / contractor_id (optional; if provider-specific codes)
- treatment_code (string)
- treatment_code_normalized (optional precomputed)
- treatment_text_match (optional regex/keywords)
- treatment_canonical (recycled | recovered | disposed | unknown)
- confidence_default (optional)
- regional_variations (optional json/text)
- recommended_unit (optional)
- notes (optional)
This dataset MUST be:
- versioned and immutable once released
- referenced via mapping_ref and recorded in output metadata
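One way to hold a mapping row in memory is a frozen dataclass, which matches the "immutable once released" requirement at the object level. This is an illustrative sketch: the class name `TreatmentMapRow` and the choice of Python types are assumptions; the field names follow the list above.

```python
from dataclasses import dataclass
from typing import Optional

CANONICAL = frozenset({"recycled", "recovered", "disposed", "unknown"})


@dataclass(frozen=True)
class TreatmentMapRow:
    """Hypothetical in-memory form of one treatment_map row."""

    treatment_code: str
    treatment_canonical: str
    contractor_id: Optional[str] = None
    treatment_code_normalized: Optional[str] = None
    treatment_text_match: Optional[str] = None  # regex or keywords
    confidence_default: Optional[float] = None
    regional_variations: Optional[str] = None   # json/text payload
    recommended_unit: Optional[str] = None
    notes: Optional[str] = None

    def __post_init__(self) -> None:
        # A row may only point at one of the four v1 canonical treatments.
        if self.treatment_canonical not in CANONICAL:
            raise ValueError(
                f"treatment_canonical={self.treatment_canonical!r} is not canonical"
            )
```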
5. Classification Semantics (Normative)
Let an input item be $i$.
5.1 Code normalization
If normalize_codes = true, normalize any code input:
- trim spaces
- uppercase
- remove common punctuation
- canonicalize common forms (implementation-specific but deterministic)
Call the normalized treatment code $c(i)$.
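The normalization steps above can be sketched as follows. The exact punctuation set and the zero-padding rule for R/D codes are assumptions (the spec leaves canonicalization implementation-specific); the only hard requirement taken from the text is that the result is deterministic.

```python
import re

# Assumption: whitespace and these separators count as "common punctuation".
_PUNCT = re.compile(r"[\s.\-_/:]+")
_RD_PADDED = re.compile(r"([RD])(\d+)")


def normalize_code(raw: str) -> str:
    """Trim, uppercase, strip punctuation, and canonicalize R/D padding."""
    code = _PUNCT.sub("", raw.strip().upper())
    # Assumed canonical form: drop leading zeros, so "R01" becomes "R1".
    match = _RD_PADDED.fullmatch(code)
    if match:
        code = f"{match.group(1)}{int(match.group(2))}"
    return code
```

Because punctuation is removed, a contractor code such as `REC-MAT` normalizes to `RECMAT`; under this scheme the mapping dataset keys would need the same normalization (for example via the precomputed `treatment_code_normalized` column).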
5.2 Matching strategy
If match_mode is:
- CODE_ONLY: use code-based matching only
- TEXT_ONLY: use text-based matching only
- AUTO:
  - if use_eu_r_d_rules and an R/D code is detected → apply deterministic rule (below)
  - else attempt mapping dataset code match
  - else attempt mapping dataset text/keyword match
  - else unknown policy
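The matching cascade can be read as an ordered list of lookups where the first hit wins. The sketch below assumes three hypothetical callables (`rd_rule`, `code_lookup`, `text_lookup`) that each return a canonical treatment or `None`; these names and the rule-path labels are illustrative, not part of the contract. Following the listing above, the R/D rule is tried only in AUTO mode.

```python
from typing import Callable, List, Optional, Tuple

Lookup = Callable[[str], Optional[str]]


def match_treatment(
    code: Optional[str],
    text: Optional[str],
    *,
    rd_rule: Lookup,
    code_lookup: Lookup,
    text_lookup: Lookup,
    match_mode: str = "AUTO",
    use_eu_r_d_rules: bool = True,
) -> Tuple[Optional[str], str]:
    """Return (canonical treatment or None, rule path) for one item."""
    if match_mode not in ("CODE_ONLY", "TEXT_ONLY", "AUTO"):
        raise ValueError(f"unsupported match_mode: {match_mode!r}")

    # Build the ordered cascade for this mode; the first non-None hit wins.
    steps: List[Tuple[str, Lookup, str]] = []
    if match_mode == "AUTO" and use_eu_r_d_rules and code:
        steps.append(("rd_rule", rd_rule, code))
    if match_mode in ("AUTO", "CODE_ONLY") and code:
        steps.append(("map_code", code_lookup, code))
    if match_mode in ("AUTO", "TEXT_ONLY") and text:
        steps.append(("map_text", text_lookup, text))

    for path, lookup, key in steps:
        hit = lookup(key)
        if hit is not None:
            return hit, path
    # No match: the caller applies unknown_code_policy (section 5.5).
    return None, "unmatched"
```

Returning the rule path alongside the result keeps the provenance requirement cheap to satisfy: the caller can copy it into `classification_provenance` without re-deriving which step matched.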
5.3 EU R/D deterministic rules (if enabled)
If item includes an EU-style operation code:
- R1–R13 → recovered by default (with an exception list: some Rs can be considered recycling if policy defines it)
- D1–D15 → disposed
v1 conservative defaults:
- R1 → recovered (energy recovery)
- R2–R9 → recycled (material recovery operations)
- R10 → recycled (land treatment beneficial to agriculture/ecology)
- R11 → recycled (use of wastes obtained from R1–R10)
- R12–R13 → recovered (exchange/storage pending recovery)
This split can be policy-controlled later; v1 should record which R-code mapping table was used in metadata.
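The v1 conservative defaults translate directly into a small rule function. This sketch assumes the code has already been normalized to the form `R<n>` or `D<n>` (no padding or separators); the function name `classify_rd` is hypothetical. Codes outside R1–R13 and D1–D15 return `None` so that the caller can fall through to dataset matching.

```python
import re
from typing import Optional

_RD = re.compile(r"([RD])(\d{1,2})")


def classify_rd(code: str) -> Optional[str]:
    """Apply the v1 default R/D table; None means 'not an R/D code'."""
    match = _RD.fullmatch(code)
    if match is None:
        return None
    letter, number = match.group(1), int(match.group(2))
    if letter == "D":
        return "disposed" if 1 <= number <= 15 else None
    # R codes: R1 and R12-R13 are recovery, R2-R11 count as recycling.
    if number in (1, 12, 13):
        return "recovered"
    if 2 <= number <= 11:
        return "recycled"
    return None
```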
5.4 Dataset mapping
If a mapping row m(i) is found for item i:
- assign treatment_canonical(i) = treatment_canonical(m(i))
- copy recommended unit if present
5.5 Unknown handling
If no match exists, apply unknown_code_policy:
- ERROR: fail
- FLAG (default): assign unknown but flag the item
- ASSIGN_UNKNOWN: assign unknown
- ASSIGN_DISPOSED: assign disposed (conservative)
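The four policies can be expressed as one function returning the assigned treatment plus a flag. This is a sketch under assumptions: the function and exception names are hypothetical, and the `(treatment, flagged)` return shape is one possible way to surface the FLAG behaviour; the error code string is the one suggested in section 8.

```python
from typing import Tuple


class UnknownTreatmentCodeError(Exception):
    """Raised under the ERROR policy; carries the suggested error code."""

    code = "WASTE_TREAT_UNKNOWN_CODE_ERROR"


def apply_unknown_policy(policy: str = "FLAG", sample: str = "") -> Tuple[str, bool]:
    """Return (treatment_canonical, flagged) for an unmatched item."""
    if policy == "ERROR":
        raise UnknownTreatmentCodeError(f"no mapping for {sample!r}")
    if policy == "FLAG":
        return "unknown", True
    if policy == "ASSIGN_UNKNOWN":
        return "unknown", False
    if policy == "ASSIGN_DISPOSED":
        # Conservative: unmatched waste counts against diversion.
        return "disposed", False
    raise ValueError(f"unsupported unknown_code_policy: {policy!r}")
```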
6. Outputs & Totals (Convenience)
This engine is primarily an enrichment transformer, but v1 may optionally return per-period totals for convenience:
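For each period $p$:
$$\mathrm{total}_k(p) = \sum_{i \,:\, \mathrm{period}(i) = p,\ \mathrm{treatment\_canonical}(i) = k} \mathrm{value}(i)$$
Where $k \in \{\mathrm{recycled}, \mathrm{recovered}, \mathrm{disposed}, \mathrm{unknown}\}$.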
These totals are directly consumable by MEID_CALC_WASTE_DIV.
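A minimal sketch of the per-period aggregation follows. It assumes all items already share one unit (the function raises otherwise, since unit conversion is delegated) and that each item carries `period`, `value`, `unit`, and `treatment_canonical`; the function name `period_totals` is hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Mapping

CANONICAL = ("recycled", "recovered", "disposed", "unknown")


def period_totals(items: Iterable[Mapping[str, Any]]) -> Dict[Any, Dict[str, float]]:
    """Sum item values per period and canonical treatment."""
    totals: Dict[Any, Dict[str, float]] = defaultdict(
        lambda: dict.fromkeys(CANONICAL, 0.0)
    )
    units = set()
    for item in items:
        units.add(item["unit"])
        if len(units) > 1:
            # Summing mixed units would be meaningless; convert upstream.
            raise ValueError(f"mixed units in totals: {sorted(units)}")
        totals[item["period"]][item["treatment_canonical"]] += item["value"]
    return dict(totals)
```

Applied to the quantities of Example A, and assuming the mapping dataset resolves `REC-MAT` to recycled (the dataset itself is not shown in this document), the 2025 totals would be 600 tonne recycled, 150 tonne recovered, 400 tonne disposed, and 0 unknown.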
7. Examples
Example A — Contractor code mapping
{
  "mapping_ref": "DATASET.WASTE.TREATMENT_MAP.v1",
  "jurisdiction": "EU",
  "items": [
    { "period": 2025, "value": 600, "unit": "tonne", "contractor_treatment_code": "REC-MAT" },
    { "period": 2025, "value": 150, "unit": "tonne", "contractor_treatment_code": "R1" },
    { "period": 2025, "value": 400, "unit": "tonne", "contractor_treatment_code": "D1" }
  ]
}
Example B — Free-text fallback
{
  "mapping_ref": "DATASET.WASTE.TREATMENT_MAP.v1",
  "items": [
    { "period": 2025, "value": 10, "unit": "tonne", "treatment_text": "sent to landfill" },
    { "period": 2025, "value": 5, "unit": "tonne", "treatment_text": "incineration with energy recovery" }
  ],
  "match_mode": "AUTO"
}
8. Validation & Error Model
Invariants
- items must be non-empty
- each item must provide at least one treatment identifier (code or text)
- mapping dataset must resolve if code/text matching is required
- values must be finite
Error codes (suggested)
- WASTE_TREAT_MAP_NOT_FOUND
- WASTE_TREAT_ITEM_MISSING_TREATMENT_KEY
- WASTE_TREAT_UNKNOWN_CODE_ERROR
- WASTE_TREAT_INVALID_CODE_FORMAT
- WASTE_TREAT_NON_FINITE_VALUE
Errors MUST include:
- engine cmi_short_code
- item index + manifest/ticket id + offending code/text sample
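The invariants and the required error context can be combined in a small validator, sketched below. Several names are assumptions: the exception class, the default `cmi_short_code` value `WST-CLASS` (taken from the document title), and the code `WASTE_TREAT_EMPTY_ITEMS`, which is hypothetical because the suggested list has no code for an empty item list.

```python
import math
from typing import Any, Mapping, Optional, Sequence

TREATMENT_KEYS = (
    "contractor_treatment_code",
    "manifest_treatment_code",
    "treatment_text",
)


class TreatmentClassifyError(Exception):
    """Error carrying the context fields that section 8 requires."""

    def __init__(self, code: str, *, cmi_short_code: str,
                 item_index: Optional[int] = None,
                 record_id: Optional[str] = None,
                 sample: Optional[str] = None) -> None:
        super().__init__(
            f"{code} (engine={cmi_short_code}, item={item_index}, "
            f"record={record_id}, sample={sample!r})"
        )
        self.code = code
        self.cmi_short_code = cmi_short_code
        self.item_index = item_index
        self.record_id = record_id
        self.sample = sample


def validate_items(items: Sequence[Mapping[str, Any]],
                   cmi_short_code: str = "WST-CLASS") -> None:
    """Check the section 8 invariants; raise on the first violation."""
    if not items:
        # Hypothetical code: not in the suggested list.
        raise TreatmentClassifyError("WASTE_TREAT_EMPTY_ITEMS",
                                     cmi_short_code=cmi_short_code)
    for index, item in enumerate(items):
        record_id = item.get("manifest_id") or item.get("ticket_id")
        context = dict(cmi_short_code=cmi_short_code,
                       item_index=index, record_id=record_id)
        if not any(item.get(key) for key in TREATMENT_KEYS):
            raise TreatmentClassifyError(
                "WASTE_TREAT_ITEM_MISSING_TREATMENT_KEY", **context)
        value = item.get("value")
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or not math.isfinite(value):
            raise TreatmentClassifyError(
                "WASTE_TREAT_NON_FINITE_VALUE", sample=repr(value), **context)
```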
9. Dependencies
MEID_TRANS_WASTE_TREATMENT_CLASSIFY depends on:
- schema resolver (ZAR)
- mapping dataset resolver (mapping_ref)
- optional unit conversion capability (delegated)
Declared via ZAR dependencies.
10. Federation & Audit Requirements
To reproduce treatment classification externally, the export MUST include:
- engine identity (cmi or zar_code)
- engine build proof (execution_ref + build_hash)
- mapping dataset reference/version/hash (mapping_ref)
- jurisdiction/options used (R/D rules, unknown handling)
- per-item provenance (matched row / rule path)
Provenance chain MUST show:
… → MEID_TRANS_WASTE_TREATMENT_CLASSIFY → MEID_CALC_WASTE_DIV → …
with cmi_short_code recorded in USO tail arrays.
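For the mapping dataset hash in the export, one possible approach is to hash a canonical JSON serialization of the rows. This document does not specify the hash algorithm or serialization, so both SHA-256 and the sorted-key JSON form below are assumptions, as is the function name.

```python
import hashlib
import json
from typing import Any, Dict, Sequence


def mapping_dataset_hash(rows: Sequence[Dict[str, Any]]) -> str:
    """Return a hex SHA-256 digest over a canonical JSON form of the rows."""
    # Sorted keys and fixed separators make the bytes independent of
    # dict key order; row order is kept because row ids feed provenance.
    canonical = json.dumps(
        list(rows), sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Recording this digest next to `mapping_ref` in the output metadata would let an external party confirm that the dataset they resolved is byte-for-byte equivalent to the one used at classification time.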
11. Performance Notes
- Complexity: $O(n)$ over $n$ items (hash-map lookup for code matches)
- Memory: $O(n)$ output size
- Mapping datasets should be cached by mapping_ref per worker
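Per-worker caching keyed by `mapping_ref` can be sketched with `functools.lru_cache`, which is safe here because released datasets are immutable. The resolver `resolve_mapping_dataset` is a hypothetical stand-in that returns a tiny hard-coded index; a real implementation would call the ZAR dataset resolver.

```python
from functools import lru_cache
from types import MappingProxyType
from typing import Mapping


def resolve_mapping_dataset(mapping_ref: str) -> dict:
    """Hypothetical stand-in for the ZAR mapping dataset resolver."""
    # Illustrative rows only; real data comes from the referenced dataset.
    return {"RECMAT": "recycled", "LANDFILL": "disposed"}


@lru_cache(maxsize=32)
def code_index(mapping_ref: str) -> Mapping[str, str]:
    """Resolve once per mapping_ref per process; return a read-only view."""
    return MappingProxyType(resolve_mapping_dataset(mapping_ref))
```

Wrapping the index in `MappingProxyType` prevents a caller from mutating the shared cached object, which would otherwise leak changes across items classified by the same worker.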
12. Methods Served (v1)
- Waste.item.treatment_classified
- Waste.treatment.recycled.abs
- Waste.treatment.recovered.abs
- Waste.treatment.disposed.abs
- Waste.treatment.unknown.abs