AIIL-DIA
Architecture Diagrams & Sequence Flows
1. Canonical Diagrams and Sequence Flows
1.1. Purpose & Scope
This chapter provides canonical diagrams and sequence flows for ZAYAZ’s AI system. It’s the visual contract that binds:
- AI behavior → retrieval, adapters, compute, packaging
- Governance → version pinning, RBAC/ACL, audit & provenance
- Reliability → SLO enforcement, canary paths, rollback lanes
- Jurisdiction → EU-only routing, framework allowlists
Artifacts here are normative: if code and diagrams diverge, raise a change request.
1.2. High-Level System Map
Governance anchors
- Determinism: (method_id, version) on compute paths
- Format control: (regex_name, version) on validation paths
- Traceability: provenance_id across all logs and exchanges
1.3. End-to-End Q&A Flow (LLM → Disclosure)
SLO guard: if p95 latency exceeds its target in C or I, the SLO gate trips circuit breakers (fallback retrieval, cached answers, or structured refusal).
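The guard can be sketched as a small gate that keeps a sliding window of latencies for one stage and selects a fallback once the windowed p95 exceeds the target. This is a minimal illustration, not the platform's implementation: `SloGate`, `Fallback`, the window size, the nearest-rank percentile, and the order in which fallbacks are tried are all assumptions.

```python
from collections import deque
from enum import Enum


class Fallback(Enum):
    NONE = "none"
    FALLBACK_RETRIEVAL = "fallback_retrieval"
    CACHED_ANSWER = "cached_answer"
    STRUCTURED_REFUSAL = "structured_refusal"


class SloGate:
    """Hypothetical per-stage gate: trips when the windowed p95 exceeds the target."""

    def __init__(self, target_ms: float, window: int = 100):
        self.target_ms = target_ms
        self.samples = deque(maxlen=window)  # oldest samples fall out automatically

    def record(self, latency_ms: float) -> None:
        self.samples.append(latency_ms)

    def p95(self) -> float:
        if not self.samples:
            return 0.0
        ordered = sorted(self.samples)
        # Nearest-rank percentile, using integer arithmetic for ceil(0.95 * n).
        rank = (95 * len(ordered) + 99) // 100
        return ordered[rank - 1]

    def tripped(self) -> bool:
        return self.p95() > self.target_ms


def choose_fallback(gate: SloGate, has_cached_answer: bool,
                    fallback_retrieval_ok: bool) -> Fallback:
    """Assumed preference order: fallback retrieval, then cache, then refusal."""
    if not gate.tripped():
        return Fallback.NONE
    if fallback_retrieval_ok:
        return Fallback.FALLBACK_RETRIEVAL
    if has_cached_answer:
        return Fallback.CACHED_ANSWER
    return Fallback.STRUCTURED_REFUSAL
```

A structured refusal is the last resort here so that a degraded answer with provenance is preferred over no answer; a real deployment may order these differently.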
1.4. Retrieval & Behavior Adapter (RAG) Internals
Controls
- Context-only: LLM is instructed to answer from retrieved context with paragraph-level citations.
- Adapter: Standardizes tone (“compliance”), answer shape, and invokes compute methods where numbers are requested.
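The context-only control can be checked after generation. The sketch below assumes a citation marker of the form `[doc_id:paragraph_number]` and treats an answer as compliant when every answer paragraph carries at least one citation and every citation points at a paragraph that was actually retrieved. The marker format and the function name `citation_problems` are illustrative assumptions.

```python
import re

# Assumed citation marker format: [doc_id:paragraph_number], e.g. [esrs-e1:12].
CITATION = re.compile(r"\[([A-Za-z0-9_.-]+):(\d+)\]")


def citation_problems(answer: str, retrieved: set[tuple[str, int]]) -> list[str]:
    """Return a list of problems; an empty list means the answer passes."""
    problems = []
    # Answer paragraphs are assumed to be separated by blank lines.
    paragraphs = [p.strip() for p in answer.split("\n\n") if p.strip()]
    for i, para in enumerate(paragraphs, start=1):
        cites = [(doc, int(n)) for doc, n in CITATION.findall(para)]
        if not cites:
            problems.append(f"paragraph {i}: no citation")
        for doc, n in cites:
            if (doc, n) not in retrieved:
                problems.append(f"paragraph {i}: {doc}:{n} not in retrieved context")
    return problems
```

A non-empty result would typically lead to a re-prompt or a refusal rather than a silent pass.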
1.5. Compute Path (Deterministic)
Invariants
- Requests must include (method_id, version).
- Input/Output JSON Schemas are enforced pre/post.
- Audit row references dataset artifact hashes.
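The three invariants can be illustrated with a small registry keyed by `(method_id, version)`. This is a sketch under stated assumptions: the method `ghg.scope1`, its formula, and the registry layout are hypothetical, and key-set comparison stands in for real JSON Schema validation to keep the example dependency-free.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Method:
    fn: Callable[[dict], dict]
    input_keys: frozenset   # stand-in for an input JSON Schema
    output_keys: frozenset  # stand-in for an output JSON Schema


def _emissions(inputs: dict) -> dict:
    # Hypothetical method body, used only to have something deterministic to run.
    return {"co2e_kg": inputs["activity"] * inputs["factor"]}


# Hypothetical registry keyed by (method_id, version).
REGISTRY = {
    ("ghg.scope1", "1.1.0"): Method(
        _emissions, frozenset({"activity", "factor"}), frozenset({"co2e_kg"})
    ),
}


def run_compute(request: dict, dataset_artifacts: dict[str, bytes]) -> tuple[dict, dict]:
    # Invariant 1: both halves of the pin must be present.
    if "method_id" not in request or "version" not in request:
        raise ValueError("request must include (method_id, version)")
    key = (request["method_id"], request["version"])
    method = REGISTRY[key]
    inputs = request["inputs"]
    # Invariant 2: validate before and after execution.
    if set(inputs) != method.input_keys:
        raise ValueError("input does not match schema")
    output = method.fn(inputs)
    if set(output) != method.output_keys:
        raise ValueError("output does not match schema")
    # Invariant 3: the audit row references dataset artifacts by hash.
    audit = {
        "method_id": key[0],
        "version": key[1],
        "dataset_hashes": {
            name: hashlib.sha256(blob).hexdigest()
            for name, blob in sorted(dataset_artifacts.items())
        },
    }
    return output, audit
```

Because the pin is mandatory and the registry entry is immutable, replaying the same request against the same dataset hashes should reproduce the same output.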
1.6. Packaging & Exchange (AI Boundary)
Blockers
- Regex/Extractor FAIL → refusal or re-prompt; no package is produced.
- ACLs applied to export targets (e.g., ESRS-only tenant cannot submit SEC).
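Both blockers can be modeled as checks that run before any package object exists, so a failure never yields a partial package. The sketch below is illustrative only: the `iso_date` pattern, the tenant name, the ACL table, and `PackagingBlocked` are assumptions, and the caller is left to decide between refusal and re-prompt.

```python
import re

# Hypothetical validator library keyed by (regex_name, version).
REGEX_LIBRARY = {
    ("iso_date", "1.3.0"): re.compile(r"\d{4}-\d{2}-\d{2}"),
}

# Hypothetical tenant ACL: frameworks each tenant may export to.
TENANT_EXPORT_ACL = {
    "tenant-esrs-only": {"ESRS"},
}


class PackagingBlocked(Exception):
    """Raised when a blocker fires; no package is produced."""


def build_package(tenant: str, target_framework: str,
                  fields: dict, validators: dict) -> dict:
    """fields: name -> value; validators: name -> (regex_name, version)."""
    # Blocker: ACL on the export target.
    if target_framework not in TENANT_EXPORT_ACL.get(tenant, set()):
        raise PackagingBlocked(f"{tenant} may not export to {target_framework}")
    # Blocker: every validated field must fully match its pinned pattern.
    for name, pin in validators.items():
        pattern = REGEX_LIBRARY[pin]
        if not pattern.fullmatch(str(fields.get(name, ""))):
            raise PackagingBlocked(f"field {name!r} failed {pin[0]}@{pin[1]}")
    return {"tenant": tenant, "framework": target_framework, "fields": dict(fields)}
```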
1.7. Failure, Canary, and Rollback Flows
- Retrieval degradation
  - Symptom: ANN latency spike.
  - Action: switch to BM25-only fallback; cap context size; surface SLO tag in answer footer.
  - Audit: fallback_mode=RAG_BM25_ONLY, provenance preserved.
- Compute method hotfix
  - Release v1.1.1 alongside v1.1.0.
  - Canary 5% of tenants via Helm allowlist.
  - If Eval gates pass → promote compute_method_latest to 1.1.1.
  - Rollback: flip latest back; requests with explicit 1.1.0 remain deterministic.
- Regex library tightening
  - New pattern in regex_library v1.4.0.
  - Shadow runs on staging → approve → promote.
  - If refusal rate > threshold, auto-revert to v1.3.0 via feature flag.
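The hotfix and regex-tightening flows share one mechanism: versions are released side by side, and a "latest" alias is promoted or flipped back. A minimal sketch of that alias follows. `VersionPointer` and `auto_revert` are hypothetical names, and the canary allowlist and feature-flag plumbing are deliberately left out.

```python
from typing import Optional


class VersionPointer:
    """Hypothetical 'latest' alias over immutable, side-by-side versions."""

    def __init__(self, released: set[str], latest: str):
        self.released = set(released)
        self.latest = latest
        self.previous: Optional[str] = None

    def release(self, version: str) -> None:
        # New versions become available alongside existing ones.
        self.released.add(version)

    def resolve(self, requested: Optional[str] = None) -> str:
        # Explicit pins bypass the alias, so they stay deterministic.
        if requested is None:
            return self.latest
        if requested not in self.released:
            raise KeyError(requested)
        return requested

    def promote(self, version: str, gates_passed: bool) -> bool:
        if not gates_passed or version not in self.released:
            return False
        self.previous, self.latest = self.latest, version
        return True

    def rollback(self) -> bool:
        if self.previous is None:
            return False
        self.latest, self.previous = self.previous, None
        return True


def auto_revert(pointer: VersionPointer, refusal_rate: float, threshold: float) -> bool:
    """Flip the alias back when the observed refusal rate exceeds the threshold."""
    return pointer.rollback() if refusal_rate > threshold else False
```

For example, a request that pins 1.1.0 explicitly resolves to 1.1.0 before promotion, after promotion, and after rollback; only unpinned requests follow the alias.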
1.8. Jurisdiction & Framework Routing
- Router uses tenant allowlist table (frameworks, regions).
- Retrieval filters by pack IDs; adapters constrain prompts accordingly.
- Packaging blocks export outside approved frameworks.
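The routing rules can be sketched as a lookup against the tenant allowlist followed by a pack-ID filter on retrieval candidates. The table shape, the region label, the pack IDs, and the idea that pack IDs are stored per tenant are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantAllowlist:
    frameworks: frozenset
    regions: frozenset
    pack_ids: frozenset  # assumed: packs the tenant's retrieval may draw from


# Hypothetical allowlist table.
ALLOWLIST = {
    "tenant-a": TenantAllowlist(
        frameworks=frozenset({"ESRS"}),
        regions=frozenset({"eu-central"}),
        pack_ids=frozenset({"esrs-e1", "esrs-s1"}),
    ),
}


def route(tenant: str, framework: str, region: str) -> TenantAllowlist:
    """Return the tenant's allowlist entry, or raise if the request is out of bounds."""
    entry = ALLOWLIST.get(tenant)
    if entry is None:
        raise PermissionError(f"unknown tenant {tenant}")
    if framework not in entry.frameworks:
        raise PermissionError(f"{framework} not allowed for {tenant}")
    if region not in entry.regions:
        raise PermissionError(f"{region} not allowed for {tenant}")
    return entry


def filter_by_pack(candidates: list[dict], entry: TenantAllowlist) -> list[dict]:
    """Keep only retrieval candidates whose pack_id is on the tenant's allowlist."""
    return [c for c in candidates if c["pack_id"] in entry.pack_ids]
```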
2. Data Residency & Network Boundaries
- EU-only data plane: indices, datasets, execution logs in EU regions.
- Egress controls: regulator/verifier endpoints must be EU-hosted or pass a residency exception with governance approval.
- Key management: per-tenant keys; signatures performed by HSM-backed signer.
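The egress rule reduces to a two-step check: the endpoint is either known to be EU-hosted or covered by an approved residency exception. The sketch below assumes both facts are available as lookup tables; the hostnames and the exception ID format are hypothetical, and how hosting location is actually established is out of scope.

```python
from urllib.parse import urlparse

# Hypothetical registry of endpoints known to be EU-hosted.
EU_HOSTED = {"verifier.example.eu"}

# Hypothetical residency exceptions: host -> governance approval ID.
APPROVED_EXCEPTIONS = {"regulator.example.com": "GOV-EXC-001"}


def egress_allowed(url: str) -> tuple[bool, str]:
    """Return (allowed, reason) for an outbound regulator/verifier call."""
    host = urlparse(url).hostname
    if host is None:
        return False, "no host in URL"
    if host in EU_HOSTED:
        return True, "eu-hosted"
    approval = APPROVED_EXCEPTIONS.get(host)
    if approval is not None:
        return True, f"residency exception {approval}"
    return False, "not EU-hosted and no approved exception"
```

Returning the reason alongside the decision makes it straightforward to record why a call was allowed or blocked in the audit trail.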
3. Deployment Topology (K8s)
- Namespaces: rag, compute, packaging, governance, exchange
- Workloads: rag-gateway (ingress, authN/Z), retriever, ranker, compute-api (FastAPI), compute-exec (workers, HPA), packager (JSON/XBRL), signer (HSM sidecar), rbac-svc, audit-writer, eval-harness
- Autoscaling: HPA on compute-exec (p95 latency), retriever (QPS)
4. Observability Hooks (quick refs)
- RED metrics: requests, errors, duration (per service)
- Golden signals: retrieval_p95, compute_p95, packaging_error_rate
- SLIs:
  - answers_with_all_citations / total_answers
  - answers_passing_regex / total_answers
  - answers_with_provenance_id / total_answers
- Alerts:
  - regex_fail_rate > 2% over 15m
  - compute_p95 > 600ms over 10m
  - packaging_sign_failures > 0 over 5m
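The SLIs and alert thresholds can be expressed as plain ratios and comparisons. The sketch below assumes the counters and metric values have already been aggregated over each rule's window by the metrics backend; `WindowCounts`, `slis`, and `firing` are illustrative names, and in practice the rules would normally live in the alerting system rather than in application code.

```python
from dataclasses import dataclass


@dataclass
class WindowCounts:
    total_answers: int
    answers_with_all_citations: int
    answers_passing_regex: int
    answers_with_provenance_id: int


def slis(c: WindowCounts) -> dict[str, float]:
    """Compute the three ratio SLIs; returns {} when there were no answers."""
    if c.total_answers == 0:
        return {}
    return {
        "citation_coverage": c.answers_with_all_citations / c.total_answers,
        "regex_pass_rate": c.answers_passing_regex / c.total_answers,
        "provenance_coverage": c.answers_with_provenance_id / c.total_answers,
    }


# (metric name, threshold, window in minutes), taken from the alert list.
ALERT_RULES = [
    ("regex_fail_rate", 0.02, 15),
    ("compute_p95", 600.0, 10),       # milliseconds
    ("packaging_sign_failures", 0, 5),
]


def firing(metrics: dict[str, float]) -> list[str]:
    """metrics: values already aggregated over each rule's window."""
    return [name for name, threshold, _window in ALERT_RULES
            if metrics.get(name, 0) > threshold]
```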
5. Reference Legends (for all diagrams)
- Determinism: (method_id, version) required on compute
- Format control: (regex_name, version) required on validator
- Traceability: provenance_id stitched through retrieval → compute → packaging → exchange
- Jurisdiction: acl_tags enforced at compute and export