IoT-DI
IoT Device Inference
ZAYAZ IoT Device Inference – V1 Sketch
V1 Design Goal (non-negotiable)
Reduce IoT data-source onboarding friction by ≥50% while increasing provenance transparency.
Not “perfect classification”. Not “magic AI”. Faster, safer, auditable onboarding.
1. V1 Scope (what we deliberately include / exclude)
✅ Included in V1
- Probabilistic device category inference (not model-level)
- Entropy-based + metadata-based features only
- Prefill of limited, low-risk fields
- Explicit user confirmation / override
- Full provenance & audit trail
- Stateless inference engine (easy to scale)
❌ Explicitly excluded from V1
- Deep packet inspection
- Automatic compliance mapping
- Autonomous learning loops
- “Black box” ML
- Any inference driving reporting without confirmation
This keeps V1 safe, fast, and credible.
2. V1 Target Outputs (what the system actually produces)
Primary Output
IoT Device Profile – Draft
Fields prefilled with confidence:
| Field | Prefill? | Notes |
|---|---|---|
| Device category | ✅ | e.g. Sensor / Meter / Camera / Gateway |
| Sub-category | ⚠️ (Top-3) | e.g. Temperature / Energy / Occupancy |
| Expected data cadence | ✅ | periodic / event-driven / bursty |
| Expected unit family | ⚠️ | energy / environmental / binary events |
| Data risk flag | ✅ | low / medium / anomalous |
| AI confidence | ✅ | 0–1 |
| Evidence link | ✅ | mandatory |
Everything else remains manual in V1.
3. Core Component: MEID-IOT-V1 (Micro-Engine)
Purpose
Generate a provenance hypothesis for an unknown IoT data stream.
Inputs (minimal & realistic)
- Flow metadata (NetFlow-like)
- Timestamped packet sizes
- Destination domains / IPs
- Protocol/port hints (no payload parsing)
- Optional MAC OUI (if available)
Feature Set (V1)
Entropy Features
- Payload size entropy
- Inter-arrival time entropy
- Destination entropy
- Session duration variance
Structural Signals
- Periodicity score
- Burstiness index
- Endpoint stability score
- Avg bytes / minute
Light Identity Hints
- MAC OUI → vendor family (optional)
- Domain pattern match (vendor clouds)
⚠️ All features are non-PII and privacy-safe.
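To make the feature list concrete, here is a minimal sketch of how a few of these signals could be computed from flow metadata alone. The function names, the bucket widths (64 bytes, 1 second), and the definitions of `periodicity_score` and `endpoint_stability` are illustrative assumptions, not part of the V1 spec.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def shannon_entropy(values):
    """Shannon entropy (bits) of the empirical distribution of discrete values."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def bucketize(values, width):
    """Map continuous values to integer buckets so entropy is well defined."""
    return [int(v // width) for v in values]


def extract_features(timestamps, sizes, destinations):
    """Compute a few V1-style features from metadata only (no payload parsing)."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    gap_mean = mean(gaps) if gaps else 0.0
    # Coefficient of variation of the gaps: 0 for perfectly periodic traffic.
    gap_cv = (pstdev(gaps) / gap_mean) if gap_mean > 0 else 0.0
    top_count = Counter(destinations).most_common(1)[0][1] if destinations else 0
    return {
        "size_entropy": shannon_entropy(bucketize(sizes, 64)),   # assumed 64-byte buckets
        "iat_entropy": shannon_entropy(bucketize(gaps, 1.0)),    # assumed 1-second buckets
        "destination_entropy": shannon_entropy(destinations),
        "periodicity_score": 1.0 / (1.0 + gap_cv),               # assumed definition
        "endpoint_stability": top_count / len(destinations) if destinations else 0.0,
    }


# A device reporting 128 bytes every 60 s to one endpoint.
features = extract_features(
    timestamps=[0, 60, 120, 180, 240],
    sizes=[128] * 5,
    destinations=["telemetry.vendor.example"] * 5,
)
```

For this perfectly regular stream, all three entropies come out at zero and the periodicity and stability scores are at their maximum of 1.0, which is the pattern the sensor rule in the next section keys on.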
4. Inference Logic (V1 = transparent, not fancy)
Model choice (recommended)
- Rule-weighted Bayesian classifier
- Human-readable priors
- Easy to tune
- Explainable
Example (simplified):
IF low time entropy
AND low size entropy
AND single stable endpoint
→ P(sensor) ↑↑
IF high size variance
AND burst traffic
AND high destination entropy
→ P(camera/gateway) ↑↑
Output
{
  "predictions": [
    {"label": "Environmental Sensor", "p": 0.82},
    {"label": "Energy Meter", "p": 0.11},
    {"label": "Gateway", "p": 0.07}
  ],
  "confidence": 0.82,
  "model_version": "MEID-IOT-V1.0"
}
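One way to read "rule-weighted Bayesian classifier" is: start from human-readable priors, multiply in a per-label weight for every rule that fires, then normalise. The sketch below follows that reading. The priors, thresholds, weights, feature keys, and the extra `evidence` field are all hypothetical placeholders chosen for illustration; only the output shape and the `model_version` string come from the example above.

```python
import json

# Hypothetical priors; real values would come from tuning.
PRIORS = {"Environmental Sensor": 0.4, "Energy Meter": 0.3, "Camera": 0.15, "Gateway": 0.15}

# Each rule: (readable name, predicate over features, multiplicative weight per label).
# Thresholds and weights are illustrative assumptions.
RULES = [
    (
        "low entropy, single stable endpoint",
        lambda f: f["iat_entropy"] < 1.0
        and f["size_entropy"] < 1.0
        and f["endpoint_stability"] > 0.9,
        {"Environmental Sensor": 4.0, "Energy Meter": 2.0, "Camera": 0.2, "Gateway": 0.2},
    ),
    (
        "high size variance, bursty, many destinations",
        lambda f: f["size_variance"] > 1e4
        and f["burstiness"] > 0.5
        and f["destination_entropy"] > 2.0,
        {"Environmental Sensor": 0.2, "Energy Meter": 0.2, "Camera": 3.0, "Gateway": 4.0},
    ),
]


def infer(features, top_k=3):
    """Return Top-k label probabilities plus the names of the rules that fired."""
    scores = dict(PRIORS)
    fired = []
    for name, predicate, weights in RULES:
        if predicate(features):
            fired.append(name)
            for label, weight in weights.items():
                scores[label] *= weight
    total = sum(scores.values())
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    predictions = [{"label": label, "p": round(score / total, 2)} for label, score in ranked]
    return {
        "predictions": predictions,
        "confidence": predictions[0]["p"],
        "evidence": fired,  # hypothetical field: feeds the evidence summary
        "model_version": "MEID-IOT-V1.0",
    }


result = infer({
    "iat_entropy": 0.2, "size_entropy": 0.1, "endpoint_stability": 1.0,
    "size_variance": 0.0, "burstiness": 0.0, "destination_entropy": 0.0,
})
print(json.dumps(result, indent=2))
```

Because every weight and prior sits in a plain table, a reviewer can trace exactly why a label scored as it did, which is the point of choosing this model over an opaque one.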
5. FOGE Integration: Prefill with Guardrails
UI Behaviour (critical)
- Fields show “AI-suggested” badge
- Confidence shown inline
- “Why?” button reveals evidence summary
- “Change” opens dropdown with Top-3 + manual search
Hard Rule
Inference may prefill forms, but never auto-lock fields.
This aligns with ZAAM trust principles.
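The guardrails above can be expressed as a small data structure that the form layer consumes. This is a sketch only: `PrefilledField`, `prefill_device_category`, and the `evidence_summary` key are hypothetical names, not an existing FOGE API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PrefilledField:
    name: str
    suggested_value: str
    confidence: float              # shown inline next to the field
    alternatives: List[str]        # Top-3 offered by the "Change" dropdown
    evidence_summary: str          # revealed by the "Why?" button
    ai_suggested: bool = True      # drives the "AI-suggested" badge
    confirmed_value: Optional[str] = None

    @property
    def editable(self) -> bool:
        # Hard rule: inference may prefill, but never locks a field.
        return True

    @property
    def usable_for_reporting(self) -> bool:
        # No inference drives reporting without explicit confirmation.
        return self.confirmed_value is not None

    def confirm(self, value: Optional[str] = None) -> str:
        """Accept the suggestion as-is, or override it with a user-chosen value."""
        self.confirmed_value = value if value is not None else self.suggested_value
        return self.confirmed_value


def prefill_device_category(inference: dict) -> PrefilledField:
    """Build the draft field from an inference result shaped like the Output example."""
    predictions = inference["predictions"]
    top = predictions[0]
    return PrefilledField(
        name="device_category",
        suggested_value=top["label"],
        confidence=top["p"],
        alternatives=[p["label"] for p in predictions[:3]],
        evidence_summary=inference.get("evidence_summary", ""),
    )


draft = prefill_device_category({
    "predictions": [
        {"label": "Environmental Sensor", "p": 0.82},
        {"label": "Energy Meter", "p": 0.11},
        {"label": "Gateway", "p": 0.07},
    ],
})
```

Making `editable` a constant property rather than a stored flag means no code path can switch it off by accident.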
6. Override = First-Class Data Asset
Every override creates:
Inference Event
→ User Override
→ Confirmed Device Profile
Stored with:
- Old prediction
- New label
- Confidence delta
- Timestamp
- Tenant context
This becomes future training data, but:
- Not auto-used
- Only via governed retraining cycle
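A minimal sketch of the stored override record follows. The spec does not define "confidence delta"; here it is assumed to be the model probability of the label the user chose minus the probability of the original top prediction. `OverrideRecord`, `record_override`, and `approved_for_training` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class OverrideRecord:
    inference_id: str
    old_prediction: str
    new_label: str
    confidence_delta: float
    timestamp: datetime
    tenant_id: str
    # Overrides are never auto-used; a governed retraining cycle must opt them in.
    approved_for_training: bool = False


def record_override(inference_id, predictions, new_label, tenant_id):
    """Create an immutable override record from the Top-k predictions and the user's choice."""
    probs = {p["label"]: p["p"] for p in predictions}
    old_label = max(probs, key=probs.get)
    # Assumed definition: p(chosen label) - p(original top label).
    # A label outside the Top-k is treated as probability 0.
    delta = round(probs.get(new_label, 0.0) - probs[old_label], 4)
    return OverrideRecord(
        inference_id=inference_id,
        old_prediction=old_label,
        new_label=new_label,
        confidence_delta=delta,
        timestamp=datetime.now(timezone.utc),
        tenant_id=tenant_id,
    )


record = record_override(
    inference_id="inf-001",
    predictions=[
        {"label": "Environmental Sensor", "p": 0.82},
        {"label": "Energy Meter", "p": 0.11},
        {"label": "Gateway", "p": 0.07},
    ],
    new_label="Energy Meter",
    tenant_id="tenant-a",
)
```

The record is frozen so that an audit trail entry cannot be edited after the fact; approving it for training would mean writing a new record, not mutating this one.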
7. Data Model Additions (SSSR-aligned)
New entity (V1 minimal)
iot_source_inference
- inference_id
- source_id
- predicted_labels[]
- confidence
- evidence_refs[]
- model_version
- created_at
Extend existing source entity
- confirmed_device_category
- confirmation_method = MANUAL | AI_ASSISTED
- confirmation_timestamp
This is enough for audit and scale.
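As an illustration, the two additions could be modelled as follows. Field names mirror the lists above; the class names, the validation rules, and in particular the rule for choosing between `MANUAL` and `AI_ASSISTED` (AI-assisted when the confirmed label was among the predicted labels) are assumptions made for this sketch.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import List


class ConfirmationMethod(str, Enum):
    MANUAL = "MANUAL"
    AI_ASSISTED = "AI_ASSISTED"


@dataclass
class IotSourceInference:
    source_id: str
    predicted_labels: List[str]
    confidence: float
    evidence_refs: List[str]
    model_version: str
    inference_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self):
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be within 0-1")
        if not self.evidence_refs:
            raise ValueError("at least one evidence reference is mandatory")


@dataclass
class SourceConfirmation:
    """The three fields added to the existing source entity."""
    confirmed_device_category: str
    confirmation_method: ConfirmationMethod
    confirmation_timestamp: datetime


def confirm_source(inference: IotSourceInference, chosen_label: str) -> SourceConfirmation:
    # Assumption: picking any predicted label counts as AI-assisted;
    # a label found via manual search counts as manual.
    method = (
        ConfirmationMethod.AI_ASSISTED
        if chosen_label in inference.predicted_labels
        else ConfirmationMethod.MANUAL
    )
    return SourceConfirmation(chosen_label, method, datetime.now(timezone.utc))


inference = IotSourceInference(
    source_id="src-001",
    predicted_labels=["Environmental Sensor", "Energy Meter", "Gateway"],
    confidence=0.82,
    evidence_refs=["evidence/inf-001"],
    model_version="MEID-IOT-V1.0",
)
confirmation = confirm_source(inference, "Energy Meter")
```

Rejecting an inference without evidence references at construction time enforces the "Evidence link: mandatory" row of the output table at the data layer rather than in the UI.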
8. KPIs for V1 (decide success early)
The following should be measured from day one:
| KPI | Target |
|---|---|
| Avg onboarding time | −50% |
| % AI-assisted registrations |