ASK-ZARA
Internal AI assistant
Ask ZARA is the internal AI assistant for the ZAYAZ platform.
It helps developers, architects, and auditors understand the platform by answering questions using the full semantic index of the ZAYAZ documentation, code examples, registries, schemas, and relationships.
Ask ZARA is designed specifically for complex architecture reasoning, impact analysis, and developer debugging workflows.
Note: Ask ZARA v2 introduced architecture-aware retrieval mechanisms that expand documents based on metadata relationships, clusters, and dependency signals. These capabilities dramatically improve architecture explanation and impact analysis.
Ask ZARA System Architecture
1. Purpose
The ZAYAZ platform contains:
- micro-engines (MICE)
- schema registries
- signal registries
- Excel specifications
- documentation
- associated code examples
- table relationships
- governance pipelines
Ask ZARA provides a conversational interface that can reason across all of these sources.
Typical use cases:
- understanding how engines interact
- tracing data flows across the platform
- debugging architecture issues
- performing impact analysis before changing registries or engines
- exploring platform subsystems
- supporting internal audits and verification
Ask ZARA is intended for:
- ZAYAZ developers
- system architects
- platform maintainers
- auditors / verifiers
It is not intended for end users or clients.
2. Models
Ask ZARA supports multiple reasoning modes.
| Mode | Model | Purpose |
|---|---|---|
| Fast | gpt-4.1-mini | Default. Quick answers and documentation lookup |
| Deep reasoning | gpt-4.1 | Cross-document synthesis |
| Architect reasoning | gpt-5.x | Complex architecture analysis |
The interface allows switching model mode temporarily.
Fast mode remains the default for cost and speed.
3. How Ask ZARA Works
3.1 Ask ZARA Reasoning Pipeline
Ask ZARA transforms raw search results into structured architecture context before invoking the reasoning model. This allows the system to explain subsystems, trace dependencies, and perform impact analysis across the ZAYAZ platform.
Ask ZARA uses a retrieval-augmented architecture.
Pipeline:
Ask ZARA does not reason over flat search hits; it builds structured architecture paths from retrieved documentation, code, schema, and registry context before invoking the selected reasoning model.
The system retrieves documentation and system data from the ZAYAZ search index.
3.2. The ZAYAZ Knowledge Graph inside Ask ZARA
The ZAYAZ platform forms a knowledge graph of interconnected system primitives including engines, signals, tables, schemas, registries, documentation, and code.
Ask ZARA indexes these elements and reconstructs architecture paths that allow the reasoning model to analyze subsystem relationships and perform impact analysis across the platform.
3.3. Dependency-Aware Retrieval (Architecture Graph Expansion)
3.4. The ZAYAZ Index Graph Inside index.jsonl
What You’re Looking At
Ask ZARA does not treat search hits as isolated documents; it follows the metadata links between them, so the index behaves like an architecture graph.
Although the index is stored as a flat index.jsonl file, each entry represents a node in the ZAYAZ architecture ecosystem: documentation sections, engines, signals, tables, schema fields, and code fragments.
Metadata such as used_by_engines, table_name, classification, and architecture_layer creates relationships between these nodes, forming a lightweight knowledge graph.
This allows Ask ZARA to follow architecture paths like:
engine → signal → table → schema → downstream engine
Because of this structure, Ask ZARA can answer questions like:
- Which engines depend on this signal?
- What breaks if this API contract changes?
- Where is this schema field used?
Instead of retrieving isolated text fragments, Ask ZARA performs architecture-aware retrieval and reasoning across the ZAYAZ platform.
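As a rough illustration of how flat index entries can be linked into a graph, the sketch below derives edges from the `used_by_engines` and `table_name` metadata named above. The `IndexEntry` shape, its `id` field, and the `buildEdges` helper are assumptions made for this example, not the indexer's actual types.

```typescript
// Hypothetical shape of one index.jsonl entry. Only the metadata names
// used_by_engines and table_name come from the text; `id` is assumed.
interface IndexEntry {
  id: string;
  kind: string;
  used_by_engines?: string[];
  table_name?: string;
}

type Edge = { from: string; to: string; via: "used_by_engines" | "table_name" };

// Derive edges from metadata: entry -> each engine that uses it,
// and entry -> the table it belongs to.
function buildEdges(entries: IndexEntry[]): Edge[] {
  const edges: Edge[] = [];
  for (const entry of entries) {
    for (const engine of entry.used_by_engines ?? []) {
      edges.push({ from: entry.id, to: engine, via: "used_by_engines" });
    }
    if (entry.table_name && entry.table_name !== entry.id) {
      edges.push({ from: entry.id, to: entry.table_name, via: "table_name" });
    }
  }
  return edges;
}

// Example entries (illustrative values only).
const sampleEdges = buildEdges([
  { id: "assurance_score", kind: "signal", used_by_engines: ["aae"] },
  { id: "assurance_registry.score", kind: "schema_field", table_name: "assurance_registry" },
]);
console.log(sampleEdges);
```

Keeping the edges as plain records means the same list can later feed a traversal or a diagram without changing the index format.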
3.4.1. Diagram Reading Guide
The diagram above illustrates how the Ask ZARA search index behaves as a lightweight architecture knowledge graph. Although the underlying storage format is a flat index.jsonl file, each indexed entry represents a node in the ZAYAZ architecture ecosystem.
Node Types
Each box in the diagram represents a type of indexed node.
| Node Type | Description |
|---|---|
| Engine / Doc Section | Sections from system documentation or engine specifications. |
| Associated Code | Code snippets or implementation files linked to a documentation section. |
| Signal | System signals used for communication between engines and modules. |
| Table | Data tables derived from registries, Excel sheets, or schemas. |
| Schema Field | Individual fields inside schemas or tables. |
| Relationship | Structural links between tables such as foreign keys or logical associations. |
These nodes form the core architecture entities that Ask ZARA reasons about.
Structural Links
Solid arrows in the diagram represent structural relationships extracted during indexing.
Examples include:
- an engine produces or consumes signals
- a document describes a signal or table
- a table contains schema fields
- tables are connected via relationships
These links allow Ask ZARA to trace architecture paths such as:
engine → signal → table → schema field
or
table → relationship → downstream tables
Metadata Overlays
The dashed arrows represent metadata overlays that enrich each node.
Examples include:
- architecture_cluster
- architecture_layer
- classification
- tags
These metadata attributes help Ask ZARA:
- group related subsystems
- filter results by architectural layer
- prioritize relevant entities during reranking
- build higher-level explanations of platform components
Why This Matters
Because the indexed data includes both content and relationships, Ask ZARA can go beyond simple document retrieval.
Instead of answering questions by quoting text alone, it can analyze how components interact across the ZAYAZ platform.
This enables advanced queries such as:
- Which engines depend on this signal?
- What breaks if this table schema changes?
- Which modules consume this API contract?
In effect, the index becomes a queryable architecture map that the reasoning model can explore.
4. Indexed Sources
The search index contains multiple structured sources.
| Source | Description |
|---|---|
| MDX documentation | Platform design documentation |
| Associated code | Code examples linked to documentation |
| Explicit snippets | Code files referenced directly in docs |
| JSON registries | signals, schema fields, relationships |
| Excel registries | signal registries and other structured specifications |
Each indexed document includes metadata such as:
- kind
- link_group
- architecture_cluster
- architecture_layer
- classification
- tags
This metadata enables architectural reasoning.
5. Document Kinds
The index assigns a kind to each document.
| Kind | Meaning |
|---|---|
| doc_page | MDX documentation page |
| doc_section | MDX section |
| associated_code | Code example linked to documentation |
| excel_sheet | Excel specification sheet |
| schema_field | Database schema field |
| signal | Signal definition |
| table_relationship | Relationship between tables |
This allows ZARA to distinguish between:
- design documentation
- implementation examples
- structured system data
6. Architecture Clustering
Ask ZARA groups related documents using architecture clusters.
Cluster sources:
- link_group (engine or subsystem)
- table name
- documentation slug root
Example:
architecture_cluster: aae
This groups together:
- engine documentation
- associated code
- related signals
- schema definitions
- table relationships
This enables ZARA to reason about entire subsystems instead of isolated snippets.
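A minimal sketch of how a cluster could be derived from the three sources listed above. The precedence order, the lowercase normalization, and the reading of "slug root" as the first path segment are all assumptions for illustration; `ClusterInput` and `deriveCluster` are hypothetical names.

```typescript
// Hypothetical subset of document metadata used for clustering.
interface ClusterInput {
  link_group?: string;
  table_name?: string;
  slug?: string; // e.g. "/system-info/overview"
}

// Pick a cluster from link_group, then table name, then slug root.
// The order of precedence is an assumption of this sketch.
function deriveCluster(doc: ClusterInput): string | undefined {
  if (doc.link_group) return doc.link_group.toLowerCase();
  if (doc.table_name) return doc.table_name.toLowerCase();
  // "Slug root" is taken here to be the first non-empty path segment.
  return doc.slug?.split("/").filter(Boolean)[0];
}

console.log(deriveCluster({ link_group: "AAE", slug: "/micro-engines/aae" }));
console.log(deriveCluster({ slug: "/system-info/overview" }));
```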
7. Architecture Layers
ZARA also extracts architecture layers from documentation tags.
Examples:
architecture_layer: tier-0, assurance, governance
These layers describe conceptual placement within the platform.
Typical layers include:
- tier-0 / tier-1 / tier-2 / tier-3
- governance
- assurance
- validation
- trust
- mice
- meta-signal
Layers help ZARA explain how subsystems fit into the overall platform architecture.
8. Engine Classification
Micro-engines are classified by architectural role.
Examples:
classification: Assurance-Engine
classification: Contract-Engine
classification: Validation-Engine
Classification helps ZARA reason about:
- deterministic engines
- transformation engines
- validation logic
- orchestration engines
9. Architecture Path Context
Ask ZARA groups retrieved documents into architecture paths before passing them to the LLM.
Example context structure:
Architecture path: aae
- Documentation
- Implementation
- Tables and schema
- Signals
- Relationships
This allows the model to understand end-to-end subsystem structure.
The result is significantly better architecture explanations.
10. Conversation-Aware Retrieval
Ask ZARA also uses recent conversation history when retrieving documents.
This allows multi-step debugging conversations like:
User: Explain the Autonomous Assurance Engine
User: How does it interact with validation rules?
User: What would break if we modify those rules?
The retrieval query incorporates earlier user questions to preserve context.
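One simple way to fold earlier questions into the retrieval query is to concatenate the last few user turns with the new question. This is a sketch under that assumption; the `Turn` type, `buildRetrievalQuery`, and the default of two prior turns are illustrative, not the actual implementation.

```typescript
interface Turn {
  role: "user" | "assistant";
  content: string;
}

// Combine the most recent user questions with the new one so that
// follow-ups such as "those rules" still retrieve the right subsystem.
function buildRetrievalQuery(history: Turn[], question: string, maxPrior = 2): string {
  const userTurns = history.filter((t) => t.role === "user");
  // slice(-0) would return everything, so guard the zero case explicitly.
  const prior = maxPrior > 0 ? userTurns.slice(-maxPrior) : [];
  return [...prior.map((t) => t.content.trim()), question.trim()].join("\n");
}

const retrievalQuery = buildRetrievalQuery(
  [
    { role: "user", content: "Explain the Autonomous Assurance Engine" },
    { role: "assistant", content: "(earlier answer)" },
    { role: "user", content: "How does it interact with validation rules?" },
  ],
  "What would break if we modify those rules?",
);
console.log(retrievalQuery);
```

Assistant turns are left out here so that long answers do not dominate the embedding of the query.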
11. Impact Analysis
Ask ZARA is optimized for debugging and change analysis.
When users ask questions like:
What breaks if we change compute_method_registry?
ZARA attempts to trace:
- upstream inputs
- dependent tables
- signals
- downstream consumers
- affected engines
This helps developers safely evolve the platform.
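The tracing described above amounts to a graph traversal. The sketch below uses a breadth-first search over a hypothetical dependency map; `DependencyMap`, `traceDownstream`, and every node name other than `compute_method_registry` are made up for the example.

```typescript
// Hypothetical dependency map: key -> nodes that directly consume it.
type DependencyMap = Record<string, string[]>;

// Breadth-first traversal that returns every node reachable from `start`,
// in the order it is first discovered. Cycles are handled via `seen`.
function traceDownstream(graph: DependencyMap, start: string): string[] {
  const seen = new Set<string>([start]);
  const queue: string[] = [start];
  const order: string[] = [];
  while (queue.length > 0) {
    const node = queue.shift() as string;
    for (const next of graph[node] ?? []) {
      if (!seen.has(next)) {
        seen.add(next);
        order.push(next);
        queue.push(next);
      }
    }
  }
  return order;
}

// Illustrative graph; engine, signal, and table names are placeholders.
const exampleGraph: DependencyMap = {
  compute_method_registry: ["engine_a", "engine_b"],
  engine_a: ["signal_x"],
  engine_b: ["signal_x", "table_y"],
  signal_x: ["report_pipeline"],
};
console.log(traceDownstream(exampleGraph, "compute_method_registry"));
// → ["engine_a", "engine_b", "signal_x", "table_y", "report_pipeline"]
```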
12. Example Questions
Typical developer questions include:
How does PEF-ME connect to other micro-engines in the computation hub?
Explain the assurance pipeline and how AAE interacts with validation.
What breaks if compute_method_registry changes?
Which tables are produced by the GHG aggregation engines?
How does the Tagged Accounting Crawler feed the reporting pipeline?
13. Interface
The Ask ZARA interface provides:
- conversational chat
- model selection
- source links for every answer
- TODO tag search
- architecture-aware responses
Each answer displays:
answered with Deep reasoning (gpt-4.1)
and includes links to source documentation.
14. Future Improvements
Potential future upgrades include:
- architecture graph traversal
- dependency chain visualization
- subsystem diagrams
- registry diff analysis
- automated architecture validation
15. Version
Current version:
Ask ZARA v2
Key capabilities:
- linked retrieval
- architecture path grouping
- architecture layers
- subsystem clustering
- conversation-aware retrieval
- model switching
- impact analysis prompting
Ask ZARA is designed to make the extremely complex ZAYAZ architecture understandable and navigable for engineers.
Ask ZARA Architecture Overview
Ask ZARA is built as a retrieval-augmented architecture reasoning system tailored specifically for the ZAYAZ platform.
It combines:
- structured platform documentation
- indexed code examples
- registry and schema metadata
- semantic search
- architecture-aware context generation
- LLM reasoning
The goal is to allow developers to explore and debug the ZAYAZ architecture conversationally.
High-Level Architecture
The architecture pipeline performs multiple steps before generating a response.
System Components
- Documentation Indexer
The ZAYAZ Search Indexer scans and processes platform documentation and structured registries.
Sources indexed include:
- MDX documentation
- associated code examples
- explicit snippet files
- JSON registries
- Excel registries
- schema definitions
- table relationships
Each indexed document receives metadata such as:
kind
link_group
architecture_cluster
architecture_layer
classification
tags
These metadata fields enable architecture-aware retrieval.
- Embedding Generation
Each indexed document is converted into a semantic embedding using:
text-embedding-3-small
Embeddings are stored in the search index and used for similarity search.
- Vector Search
When a developer asks a question:
- The question is embedded
- The vector index is searched
- The most relevant documents are retrieved
Example retrieval results might include:
- documentation sections
- code examples
- signals
- schema fields
- table relationships
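The vector search step can be sketched as a brute-force cosine-similarity ranking over stored embeddings. This assumes the index is small enough to scan in memory; `EmbeddedDoc`, `cosineSimilarity`, and `topK` are illustrative names, and the two-dimensional vectors stand in for real embeddings.

```typescript
interface EmbeddedDoc {
  id: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] as number;
    const y = b[i] as number;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0; // avoid division by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every document against the query and keep the k best.
function topK(query: number[], docs: EmbeddedDoc[], k: number): { id: string; score: number }[] {
  return docs
    .map((d) => ({ id: d.id, score: cosineSimilarity(query, d.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

const nearest = topK(
  [1, 0],
  [
    { id: "a", embedding: [1, 0] },
    { id: "b", embedding: [0, 1] },
    { id: "c", embedding: [1, 1] },
  ],
  2,
);
console.log(nearest.map((h) => h.id)); // → ["a", "c"]
```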
- Linked Retrieval Expansion
Retrieved documents are expanded using structural metadata.
Expansion sources include:
- link_group (engine / subsystem)
- table_name
- schema_field
- signal
- table_relationship
This step ensures that ZARA sees complete architectural slices, not isolated snippets.
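A minimal sketch of the expansion step, assuming documents are pulled in when they share a `link_group` or `table_name` with an initial hit. The `Doc` shape, `expandLinked`, and the `maxExtra` cap are assumptions for illustration; the real expansion may use more metadata fields.

```typescript
interface Doc {
  id: string;
  link_group?: string;
  table_name?: string;
}

// Add index documents that share a link_group or table_name with any hit.
// Initial hits keep their order; expanded documents are appended after them.
function expandLinked(hits: Doc[], index: Doc[], maxExtra = 10): Doc[] {
  const isString = (v: string | undefined): v is string => typeof v === "string";
  const groups = new Set(hits.map((h) => h.link_group).filter(isString));
  const tables = new Set(hits.map((h) => h.table_name).filter(isString));
  const seen = new Set(hits.map((h) => h.id));
  const extra: Doc[] = [];
  for (const doc of index) {
    if (extra.length >= maxExtra) break;
    if (seen.has(doc.id)) continue;
    const linked =
      (doc.link_group !== undefined && groups.has(doc.link_group)) ||
      (doc.table_name !== undefined && tables.has(doc.table_name));
    if (linked) {
      seen.add(doc.id);
      extra.push(doc);
    }
  }
  return [...hits, ...extra];
}

// Illustrative index; ids are placeholders.
const smallIndex: Doc[] = [
  { id: "aae-overview", link_group: "aae" },
  { id: "aae-validator-code", link_group: "aae" },
  { id: "assurance_registry.score", table_name: "assurance_registry" },
  { id: "unrelated-doc", link_group: "other" },
];
const firstHit = smallIndex[0] as Doc;
console.log(expandLinked([firstHit], smallIndex).map((d) => d.id));
```

The cap keeps one very large subsystem from crowding everything else out of the context window.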
- Lightweight Reranking
Retrieved documents are reranked based on:
- semantic similarity
- document kind
- architectural relevance
For example:
- documentation sections rank above snippets
- explicit code examples rank above loosely associated code
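One lightweight way to express these preferences is to add a small per-kind boost to the similarity score before sorting. The boost values below and the `explicit_snippet` kind are assumptions made for this sketch; only the relative ordering reflects the text above.

```typescript
interface Hit {
  id: string;
  kind: string;
  similarity: number; // semantic similarity from the vector search
}

// Assumed boost values: documentation sections above snippets,
// explicit snippets above loosely associated code.
const KIND_BOOST: Record<string, number> = {
  doc_section: 0.1,
  explicit_snippet: 0.05, // hypothetical kind name
  associated_code: 0,
};

// Sort by similarity plus kind boost, highest first. Input is not mutated.
function rerank(hits: Hit[]): Hit[] {
  const score = (h: Hit): number => h.similarity + (KIND_BOOST[h.kind] ?? 0);
  return [...hits].sort((a, b) => score(b) - score(a));
}

const reranked = rerank([
  { id: "a", kind: "associated_code", similarity: 0.8 },
  { id: "b", kind: "doc_section", similarity: 0.78 },
  { id: "c", kind: "explicit_snippet", similarity: 0.79 },
]);
console.log(reranked.map((h) => h.id)); // → ["b", "c", "a"]
```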
- Architecture Path Builder
Documents are grouped into architecture paths before being sent to the LLM.
Example:
Architecture path: aae
Documentation
Implementation
Tables and schema
Signals
Relationships
This step allows ZARA to reason about entire subsystems.
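The grouping can be sketched as a two-level bucket: first by `architecture_cluster`, then by a section derived from the document kind. The kind-to-section mapping, the `PathDoc` shape, and `buildArchitecturePaths` are assumptions for illustration; kinds not in the mapping are simply skipped here.

```typescript
interface PathDoc {
  title: string;
  kind: string;
  architecture_cluster: string;
}

// Assumed mapping from document kind to path section.
const SECTION_BY_KIND: Record<string, string> = {
  doc_page: "Documentation",
  doc_section: "Documentation",
  associated_code: "Implementation",
  schema_field: "Tables and schema",
  signal: "Signals",
  table_relationship: "Relationships",
};
const SECTION_ORDER = ["Documentation", "Implementation", "Tables and schema", "Signals", "Relationships"];

// Group documents per cluster and section, then render plain-text context.
function buildArchitecturePaths(docs: PathDoc[]): string {
  const byCluster = new Map<string, Map<string, string[]>>();
  for (const doc of docs) {
    const section = SECTION_BY_KIND[doc.kind];
    if (!section) continue; // unmapped kinds are ignored in this sketch
    const sections = byCluster.get(doc.architecture_cluster) ?? new Map<string, string[]>();
    byCluster.set(doc.architecture_cluster, sections);
    const titles = sections.get(section) ?? [];
    titles.push(doc.title);
    sections.set(section, titles);
  }
  const lines: string[] = [];
  byCluster.forEach((sections, cluster) => {
    lines.push(`Architecture path: ${cluster}`);
    for (const name of SECTION_ORDER) {
      const titles = sections.get(name);
      if (!titles) continue;
      lines.push(name);
      for (const title of titles) lines.push(`- ${title}`);
    }
  });
  return lines.join("\n");
}

const pathContext = buildArchitecturePaths([
  { title: "assurance_score", kind: "signal", architecture_cluster: "aae" },
  { title: "Autonomous Assurance Engine overview", kind: "doc_section", architecture_cluster: "aae" },
  { title: "aae/validator.ts", kind: "associated_code", architecture_cluster: "aae" },
]);
console.log(pathContext);
```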
- LLM Reasoning
The structured architecture context is sent to the selected model.
Typical models:
| Mode | Model |
|---|---|
| Fast | gpt-4.1-mini |
| Deep reasoning | gpt-4.1 |
| Architect reasoning | GPT-5 |
The prompt instructs the model to:
- prioritize ZAYAZ documentation
- analyze architecture paths
- perform impact analysis
- reference sources explicitly
- Response Generation
The final response includes:
- a structured explanation
- architecture reasoning
- relevant documentation references
- source links
Example:
answered with Deep reasoning (gpt-4.1)
Sources appear directly beneath the answer.
Key Design Principles
Ask ZARA is built around several architectural principles.
Architecture First
ZARA is optimized for understanding system architecture, not just answering documentation queries.
Linked Knowledge
Documentation, code, schemas, and registries are linked into a unified knowledge graph.
Structured Context
Architecture paths and metadata allow the LLM to reason about system structure.
Developer-Oriented Reasoning
ZARA prioritizes:
- debugging workflows
- architecture exploration
- impact analysis
- cross-engine reasoning
Example Developer Workflow
A typical debugging session may look like:
Developer:
Explain the Autonomous Assurance Engine.
Developer:
How does it interact with validation rules?
Developer:
What breaks if we change compute_method_registry?
Ask ZARA retrieves relevant architecture paths and traces:
- upstream dependencies
- downstream consumers
- related engines
- schema relationships
This allows developers to safely evolve the platform.
System Summary
Ask ZARA v2 combines:
- semantic search
- architecture clustering
- subsystem path grouping
- conversation-aware retrieval
- multi-model reasoning
This architecture allows developers to navigate and debug the complex ZAYAZ platform through natural language.
Indexing Architecture
Before Ask ZARA can answer questions, the ZAYAZ platform documentation and registries are converted into a semantic architecture index.
This indexing process runs offline and produces the index.jsonl file used by the search API.
Indexing Steps
- Source Collection
The indexer scans the ZAYAZ documentation repository and collects structured sources:
- MDX documentation
- associated code examples
- explicit code snippets
- JSON registries
- Excel registries
- schema definitions
- table relationships
These represent the design, implementation, and data model layers of the platform.
- Document Chunking
Large documents are split into smaller semantic chunks such as:
- documentation sections
- schema field definitions
- signal definitions
- code example blocks
Chunking ensures that semantic search retrieves precise architecture fragments rather than entire documents.
- Metadata Enrichment
Each chunk receives structured metadata, including:
kind
slug
title
heading
link_group
architecture_cluster
architecture_layer
classification
tags
This metadata allows the retrieval engine to reason about system structure.
- Embedding Generation
Each chunk is converted into a vector embedding using:
text-embedding-3-small
Embeddings capture semantic meaning and enable similarity search.
- Index Generation
All documents and embeddings are written to the index:
data/index.jsonl
Each line contains a fully indexed document:
content
metadata
embedding
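A sketch of how such a file could be parsed, one JSON object per line. The three field names come from the list above; their exact types, the `IndexRecord` interface, and `parseIndexJsonl` are assumptions, and the real records may carry more fields.

```typescript
// Assumed record shape based on the three fields listed above.
interface IndexRecord {
  content: string;
  metadata: Record<string, unknown>;
  embedding: number[];
}

// Parse JSONL text into records, skipping blank lines and rejecting
// lines that lack a string `content` or an array `embedding`.
function parseIndexJsonl(text: string): IndexRecord[] {
  const records: IndexRecord[] = [];
  text.split(/\r?\n/).forEach((line, i) => {
    if (line.trim() === "") return;
    const value = JSON.parse(line) as Partial<IndexRecord> | null;
    if (!value || typeof value.content !== "string" || !Array.isArray(value.embedding)) {
      throw new Error(`line ${i + 1}: missing content or embedding`);
    }
    records.push({
      content: value.content,
      metadata: value.metadata ?? {},
      embedding: value.embedding,
    });
  });
  return records;
}

const parsedIndex = parseIndexJsonl(
  '{"content":"AAE overview","metadata":{"kind":"doc_section"},"embedding":[0.1,0.2]}\n' +
    '{"content":"assurance_score","embedding":[0.3,0.4]}\n',
);
console.log(parsedIndex.length); // → 2
```

In a Node.js service the text would typically come from reading data/index.jsonl once at startup.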
- Search API
The ZAYAZ Search API loads the index and provides endpoints for:
- semantic search
- TODO tag discovery
- architecture-aware retrieval
- Ask ZARA queries
Relationship to Ask ZARA
The index provides the knowledge layer for Ask ZARA.
Ask ZARA then adds:
- linked retrieval
- architecture path grouping
- conversation-aware retrieval
- LLM reasoning
Together, these components allow ZARA to perform deep architectural reasoning across the entire ZAYAZ platform.
Future Extensions — Tool Invocation
When we talk about “tools” for Ask ZARA we mean:
Server-side capabilities that the model can invoke as functions to do things for the user — not just answer from static context.
Examples:
- Search tool: searchDocs(query, filters) (even richer than what we already do implicitly).
- Spec explorer tool: getSpecBySlug(slug) or listEngines() (e.g. “show me all micro-engines that use PEF-ME”).
- Engine introspection tool: getEngineSpec(engineId) (parse the engine’s spec file into structured metadata).
- Template/tooling tool: e.g. getRFCTemplate(type) (return a template for change requests, data contracts, etc).
Under the hood that means:
- We keep using the same logs, but:
- We tag tool calls in logs for observability.
- We may add structured tool-call logs for debugging.
- Tools typically operate over:
- The existing search index (fast, stable).
- The file system / repo (for structured spec parsing or JSON registries).
- Optional metadata stores (e.g. per-engine metadata, memory store, etc).
Architecture Context Construction
A key capability of Ask ZARA is its ability to transform raw search results into architecture-aware context before sending them to the LLM.
Instead of presenting a flat list of snippets, ZARA builds architecture paths that reflect the structure of the ZAYAZ platform.
This allows the model to reason about subsystems, engines, schemas, and signals as connected components.
Architecture Path Structure
Each architecture path groups documents that belong to the same subsystem.
Example structure:
Architecture path: aae
Documentation
- Autonomous Assurance Engine overview
- Validation interaction documentation
Implementation
- aae/validator.ts
- governance-check.ts
Tables and schema
- assurance_registry
- validation_rule_registry
Signals
- assurance_score
- governance_violation_flag
Relationships
- validation_rule_registry → assurance_registry
This representation allows the model to see how components interact inside the subsystem.
Context Metadata
Each document block in the architecture path contains structured metadata:
slug: /micro-engines/aae
kind: doc_section
architecture_cluster: aae
architecture_layer: tier-0, assurance, governance
classification: Assurance-Engine
tags: mice, assurance, validation, governance, trust
These signals allow ZARA to reason about:
- subsystem boundaries
- architectural layers
- engine roles
- schema dependencies
Why Architecture Paths Matter
Without architecture paths, the model would see something like:
doc snippet
code snippet
schema field
signal
documentation paragraph
This flat structure makes it difficult for the model to understand system relationships.
With architecture paths, ZARA sees:
Subsystem: Autonomous Assurance Engine
documentation
implementation
tables
signals
relationships
This dramatically improves the model’s ability to perform:
- architecture explanations
- cross-engine reasoning
- debugging analysis
- impact analysis
Example Developer Question
Example debugging query:
What breaks if we modify compute_method_registry?
Ask ZARA will attempt to trace:
- engines that reference the registry
- signals produced by those engines
- tables storing those signals
- downstream reporting pipelines
The architecture path structure makes this reasoning possible.
Summary
Architecture paths are the core mechanism that allows Ask ZARA to move beyond simple document search and perform architecture-aware reasoning across the ZAYAZ platform.
Combined with linked retrieval, subsystem clustering, and architecture layers, this approach enables developers to explore the platform in a way that mirrors how engineers think about complex systems.
Design Philosophy of Ask ZARA
Ask ZARA was designed to support developers working inside a large, deeply interconnected platform. Traditional documentation search is insufficient for such systems because the most important information is often distributed across multiple documents, code examples, schema definitions, and registries.
Instead of treating documentation as isolated pages, Ask ZARA treats the ZAYAZ platform as an architecture graph.
The system therefore focuses on three core principles:
Architecture-aware retrieval
ZARA retrieves related information across:
- documentation
- implementation examples
- schema definitions
- signal registries
- table relationships
This ensures that answers reflect how components interact, not just how they are described individually.
Subsystem context
Rather than presenting flat search results, ZARA constructs architecture paths that group information belonging to the same subsystem or micro-engine cluster.
This allows the model to reason about:
- engine responsibilities
- upstream and downstream dependencies
- shared primitives and tier relationships
- signal and table flows
Retrieval before reasoning
ZARA is intentionally built as a retrieval-first system.
The LLM is not expected to know the ZAYAZ platform.
Instead, it is given structured architectural context extracted directly from the platform documentation and registries.
The model's role is therefore to:
- synthesize the retrieved architecture
- explain system behavior
- trace dependencies
- assist with debugging and design reasoning
Designed for engineers
Ask ZARA is not a general chatbot.
It is an engineering reasoning assistant for the ZAYAZ platform.
Typical use cases include:
- understanding complex engine interactions
- tracing signal or table dependencies
- evaluating the impact of architectural changes
- identifying potential overlap between engines
- assisting with debugging and refactoring
By combining semantic retrieval with architecture-aware context construction, Ask ZARA enables developers to explore the platform in the same structural way that the system itself is designed.
APPENDIX A - Ask ZARA Capability Roadmap
A.1. Phase 1 — Tool Capabilities
A.1.1 Goals
- Turn Ask ZARA from “RAG QA bot” → developer assistant that can:
- Look up specs in more targeted ways.
- Traverse relationships between engines, modules, signals.
- Propose actions (e.g. create an RFC skeleton).
A.1.2 Tooling concept
We define a tool layer in the API that the model can call via function-calling.
Example tool interface (conceptual):
type ZaraTool =
  | { type: 'search_docs'; args: { query: string; topK?: number; filterSlugPrefix?: string } }
  | { type: 'get_engine_spec'; args: { engineId: string } }
  | { type: 'list_engines'; args: { hub?: string; module?: string } }
  | { type: 'get_template'; args: { kind: 'rfc' | 'data_contract' | 'engine_spec' } };
The model gets a tool schema in its system prompt and can respond with:
{
  "tool_call": {
    "type": "search_docs",
    "args": { "query": "COSE risk scoring micro engine", "topK": 5 }
  }
}
The backend:
- Parses the tool call.
- Executes the corresponding function.
- Feeds the result back to the model as additional context.
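The parse-and-execute steps above could look roughly like the following. The handlers are stubs that return placeholder data, validation of `args` is deliberately shallow, and `parseToolCall` and `executeTool` are hypothetical names rather than the actual backend functions.

```typescript
type ZaraTool =
  | { type: "search_docs"; args: { query: string; topK?: number; filterSlugPrefix?: string } }
  | { type: "get_engine_spec"; args: { engineId: string } }
  | { type: "list_engines"; args: { hub?: string; module?: string } }
  | { type: "get_template"; args: { kind: "rfc" | "data_contract" | "engine_spec" } };

const KNOWN_TOOLS = ["search_docs", "get_engine_spec", "list_engines", "get_template"];

// Parse the model's JSON reply; return null if it is not a known tool call.
// Only the envelope is checked here; per-tool argument checks are omitted.
function parseToolCall(raw: string): ZaraTool | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null;
  }
  const call = (parsed as { tool_call?: { type?: unknown; args?: unknown } } | null)?.tool_call;
  if (!call || typeof call.type !== "string") return null;
  if (typeof call.args !== "object" || call.args === null) return null;
  return KNOWN_TOOLS.indexOf(call.type) !== -1 ? (call as unknown as ZaraTool) : null;
}

// Dispatch to a stub handler per tool type. Real handlers would query the
// search index or registries; these return placeholder structures.
function executeTool(call: ZaraTool): unknown {
  switch (call.type) {
    case "search_docs":
      return { hits: [], query: call.args.query, topK: call.args.topK ?? 5 };
    case "get_engine_spec":
      return { engineId: call.args.engineId };
    case "list_engines":
      return { engines: [] };
    case "get_template":
      return { kind: call.args.kind, sections: [] };
  }
}

const modelReply =
  '{"tool_call":{"type":"search_docs","args":{"query":"COSE risk scoring micro engine","topK":5}}}';
const toolCall = parseToolCall(modelReply);
console.log(toolCall ? executeTool(toolCall) : "not a tool call");
```

Returning null for anything unrecognized lets the backend fall back to treating the reply as a normal answer.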
A.1.3 Concrete tools for ZAYAZ
A.1.3.1 search_docs
- Inputs:
  - query: string
  - topK?: number
  - filterSlugPrefix?: string (/micro-engines, /system-info, /under-development…)
- Output:
  - { hits: SearchResult[] } (we already have this shape)
A.1.3.2 get_engine_spec
- Inputs:
  - engineId: string (e.g. pef-me, sem, carbie)
- Implementation (v1):
  - Map engine IDs → canonical spec slugs (e.g. /micro-engines/pef-me).
  - Use search_docs with a strong slug filter.
- Implementation (v2):
  - Maintain a small JSON registry:
    - config/system/engine_registry.json keyed by engineId.
    - Contains slug, hub, module, owners, status, tags.
- Output:
  - engineMeta (ID, title, slug, hub, module, status).
  - coreSections (overview, API, tables, inputs/outputs).
  - relationships (signals, tables, other engines).
This powers queries like:
- “Show me everything about PEF-ME.”
- “Which engines depend on ZHIF?”
…and gives ZARA structured data to reason over.
A.1.3.3 list_engines
- Inputs: filters on hub, module, status, family.
- Output: array of engineMeta from the registry.
Good for:
- “What micro-engines live in the Computation Hub?”
- “Which engines are under development for EPDEX?”
A.1.3.4 get_template
- Inputs: kind ('rfc' | 'data_contract' | 'engine_spec' etc.).
- Implementation:
  - Store templates in static MD/JSON files under config/system/templates.
  - Tools read them and return a normalized structure: name, description, sections, fields.
- Output: a template + guidance that ZARA can fill in / customize.
Use cases:
- “Draft an RFC for changing PEF-ME to support multi-tenant setup.”
- “Give me a proto data contract for dim_suppliers.”
A.2. Phase 2 — Conversation Persistence
A.2.1 Goals
- ZARA should remember context across turns:
- Current engine and module(s) being discussed.
- Decisions already made in the conversation.
- Follow-ups like “do the same for COSE.”
A.2.2 Architecture options
We can keep the design storage-agnostic initially:
- Option A (simple): store recent turns in the browser only, but:
  - Include them in each API call as history.
  - Server does not persist anything long-term.
- Option B (internal): add a small session store:
  - Minimal DB (SQLite, LiteFS, Postgres, Redis).
  - Keyed by sessionId passed from frontend (cookie or URL param).
  - Each Ask call:
    - Loads last N turns.
    - Appends new turn + answer.
    - Saves back.
We can start with Option A and leave B as a config flag.
A.2.3 API changes
Current request body is something like:
interface AskRequestBody {
  question: string;
}
Proposed:
interface AskHistoryTurn {
  role: 'user' | 'assistant';
  content: string;
}
interface AskRequestBody {
  sessionId?: string; // optional, for future persistence
  question: string;
  history?: AskHistoryTurn[]; // recent turns
}
Backend:
- Validates question.
- Trims / compresses history to a sane token budget (e.g. last 4–6 messages).
- Builds a combined chat prompt with:
- System + developer messages (ZAYAZ guardrails).
- Relevant index context.
- History.
- Latest question.
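The backend steps above can be sketched as follows. Trimming by message count is a stand-in for a real token budget, and `ChatMessage`, `trimHistory`, and `buildMessages` are hypothetical names; the message order follows the list above.

```typescript
interface AskHistoryTurn {
  role: "user" | "assistant";
  content: string;
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Keep only the most recent turns. A production version would count
// tokens rather than messages; this sketch uses a simple message cap.
function trimHistory(history: AskHistoryTurn[], maxTurns = 6): AskHistoryTurn[] {
  return maxTurns > 0 ? history.slice(-maxTurns) : [];
}

// Assemble the prompt: guardrails, index context, history, latest question.
function buildMessages(
  systemPrompt: string,
  context: string,
  history: AskHistoryTurn[],
  question: string,
): ChatMessage[] {
  if (question.trim() === "") throw new Error("question must not be empty");
  return [
    { role: "system", content: systemPrompt },
    { role: "system", content: `Context:\n${context}` },
    ...trimHistory(history),
    { role: "user", content: question },
  ];
}

// Eight alternating turns; only the last six should survive trimming.
const longHistory: AskHistoryTurn[] = Array.from({ length: 8 }, (_, i) => ({
  role: i % 2 === 0 ? ("user" as const) : ("assistant" as const),
  content: `turn ${i}`,
}));
const messages = buildMessages("ZAYAZ guardrails", "(index context)", longHistory, "What breaks?");
console.log(messages.length); // → 9
```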
Response remains:
interface AskResponseBody {
  answer: string;
  sources: { slug: string; score: number }[];
}
Later we can add:
memorySummary?: string; // internal summary for long sessions
A.3. Phase 3 — Live Engine Introspection
A.3.1 Concept
ZARA should be able to:
“Look into an engine spec the way a reviewer or architect would, not just search around it.”
Examples:
- “Explain the full I/O contract for PEF-ME.”
- “Compare PEF-ME and SEM in terms of uncertainty propagation.”
- “Which tables and signals does COSE depend on?”
A.3.2 Data sources
We already have:
- Engine specs under computation-hub/mice-micro-engines/*.mdx.
- Signals registry under config/system/signal_registry.json.
- Table relationships under config/system/table_relationships.json.
- Schemas under docusaurus/static/schemas/*.
Engine introspection will:
- Use engine_registry to resolve engineId → paths.
- Parse or retrieve those docs with a mix of:
- Targeted search (for specific sections by heading).
- Direct file read (if needed, e.g. Node/TS snippet).
- Build structured metadata for the engine: inputs, outputs, tables, signals, dependencies.
A.3.3 Tools for introspection
New tools that complement get_engine_spec:
A.3.3.1 get_engine_io
- Input: engineId
- Output: structured view like:
interface EngineIO {
  engineId: string;
  tablesIn: string[];
  tablesOut: string[];
  signalsIn: string[];
  signalsOut: string[];
  schemas: string[]; // schema filenames / slugs
}
Implementation strategy:
- Prefer JSON registries (if we maintain one).
- Else, do targeted search:
  - Query engineId with filters.
  - Look for “All Tables”, “Inputs”, “Outputs” headings.
  - Apply heuristics or mini-parsers on those sections.
A.3.3.2 get_related_engines
- Input: engineId
- Output: engines that:
  - Share tables.
  - Share signals.
  - Are marked as upstream/downstream in relationships JSON.
Use cases:
- “What’s upstream of PEF-ME?”
- “If we change dim_suppliers, which engines are impacted?”
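The shared-table and shared-signal part of get_related_engines could be computed from EngineIO records roughly as below. This sketch ignores the upstream/downstream markers from the relationships JSON and treats table and signal names as one namespace; `getRelatedEngines` and the sample data (including the `pef_score` signal) are illustrative assumptions.

```typescript
interface EngineIO {
  engineId: string;
  tablesIn: string[];
  tablesOut: string[];
  signalsIn: string[];
  signalsOut: string[];
  schemas: string[];
}

// Collect every table and signal an engine touches.
function resources(io: EngineIO): string[] {
  return [...io.tablesIn, ...io.tablesOut, ...io.signalsIn, ...io.signalsOut];
}

// Return engines that share at least one table or signal with the target,
// together with the de-duplicated list of shared names.
function getRelatedEngines(target: EngineIO, all: EngineIO[]): { engineId: string; shared: string[] }[] {
  const mine = new Set(resources(target));
  const related: { engineId: string; shared: string[] }[] = [];
  for (const other of all) {
    if (other.engineId === target.engineId) continue;
    const theirs = resources(other);
    const shared = theirs.filter((name, i) => mine.has(name) && theirs.indexOf(name) === i);
    if (shared.length > 0) related.push({ engineId: other.engineId, shared });
  }
  return related;
}

// Illustrative data only; the real I/O of these engines is not shown here.
const pefMe: EngineIO = {
  engineId: "pef-me",
  tablesIn: ["dim_suppliers"],
  tablesOut: [],
  signalsIn: [],
  signalsOut: ["pef_score"],
  schemas: [],
};
const sem: EngineIO = { ...pefMe, engineId: "sem", signalsOut: [], signalsIn: ["pef_score"] };
const cose: EngineIO = { ...pefMe, engineId: "cose", tablesIn: [], signalsOut: [] };
console.log(getRelatedEngines(pefMe, [pefMe, sem, cose]));
```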
A.3.3.3 diff_engines
- Inputs: engineIdA, engineIdB, focus (e.g. 'io' | 'risk' | 'uncertainty').
- Output: a structured diff summary for the model to turn into prose.
A.4. Phase 4 — Hardening the API (Production-ready)
A.4.1 Security & access
- Auth layer:
  - API key or internal SSO for developers.
  - Later: per-user auth with scopes (read / propose / execute tools).
- Rate limits:
  - Per IP/key.
  - Per session.
  - Distinct budgets for /search vs /ask vs heavy tools.
A.4.2 Observability
- Structured logs for:
  - Incoming requests (anonymized question & meta).
  - Tool invocations (type, duration, success/failure).
  - Model latency & token usage.
- Dashboard:
  - Volume of queries.
  - Top endpoints.
  - Error & timeout rates.
A.4.3 Reliability & infra
- Fly.io:
  - Multiple regions (later).
  - min_machines_running tuned for responsiveness.
  - Health checks for /health.
- Index management:
  - Stable process to regenerate index (CI job / manual script).
  - Versioned index file (e.g. index.vN.jsonl with symlink index.jsonl).
  - Optional: store index in an object store (R2, S3) and sync.
A.5. Roadmap Summary
v0.1 (today)
- Ask ZARA:
- Single-turn Q&A with citations.
- RAG over ZAYAZ docs.
- Search:
- Semantic search with filters, citations, dedicated page.
v0.2 — Tools
- Implement backend tool layer: search_docs, get_engine_spec, list_engines, get_template.
- Expose limited tool calling to the model.
- Keep logging & CORS simple.
v0.3 — Memory
- Extend /ask to accept history and optional sessionId.
- Keep in-browser history for now.
- Add simple truncation/summary logic on the backend.
v0.4 — Engine Introspection
- Introduce:
  - engine_registry.json.
  - Introspection tools: get_engine_io, get_related_engines, diff_engines.
- Teach ZARA to answer “pipeline” and “impact” questions.
v1.0 — Production-grade
- Add auth, rate limiting, stronger observability.
- Consider moving memory & metadata to a managed DB.
- Tune deployment strategy (regions, scaling) based on usage.
APPENDIX B - Early Ask ZARA Architecture (v0.1)
B.1. Current State (v0.1)
B.1.1 High-level flow
- Frontend
  - Docusaurus page /ask renders the Ask ZARA chat UI.
  - User enters a question → frontend POSTs to {ASK_API_BASE}/ask.
  - We already:
    - Maintain a local chat transcript in the browser.
    - Display citations as [n] source /path/to/doc.
- Backend
  - zayaz-search-api exposes:
    - POST /search: semantic search over the docs index (used by /search).
    - POST /ask: Ask ZARA chat endpoint.
  - The API:
    - Loads embedded search index from /app/data/index.jsonl.
    - Calls OpenAI:
      - Embeddings: text-embedding-3-small
      - Chat model: gpt-4.1-mini (configurable via ZAYAZ_ASK_MODEL).
    - Implements a RAG loop:
      - Take the user question.
      - Query our vector index.
      - Build a system + context prompt with top-K search hits.
      - Ask the model for an answer.
      - Return: answer text + the list of sources used.
B.1.2 What we already have
- Vector search over:
- MDX pages (specs, system info, guides).
- JSON registries (signals, table relationships, schemas).
- Future Excel-derived docs, if we ingest them.
- Citations and hyperlinks in the UI.
- CORS set up so both GitHub Codespaces and specs.zayaz.io can call it.