
ASK-ZARA

Internal AI assistant

Ask ZARA is the internal AI assistant for the ZAYAZ platform.

It helps developers, architects, and auditors understand the platform by answering questions using the full semantic index of the ZAYAZ documentation, code examples, registries, schemas, and relationships.

Ask ZARA is designed specifically for complex architecture reasoning, impact analysis, and developer debugging workflows.

Note: Ask ZARA v2 introduced architecture-aware retrieval mechanisms that expand documents based on metadata relationships, clusters, and dependency signals. These capabilities dramatically improve architecture explanation and impact analysis.

Ask ZARA System Architecture

1. Purpose

The ZAYAZ platform contains:

micro-engines (MICE)
schema registries
signal registries
Excel specifications
documentation
associated code examples
table relationships
governance pipelines

Ask ZARA provides a conversational interface that can reason across all of these sources.

Typical use cases:

understanding how engines interact
tracing data flows across the platform
debugging architecture issues
performing impact analysis before changing registries or engines
exploring platform subsystems
supporting internal audits and verification

Ask ZARA is intended for:

ZAYAZ developers
system architects
platform maintainers
auditors / verifiers

It is not intended for end users or clients.

2. Models

Ask ZARA supports multiple reasoning modes.

Mode	Model	Purpose
Fast	`gpt-4.1-mini`	Default. Quick answers and documentation lookup
Deep reasoning	`gpt-4.1`	Cross-document synthesis
Architect reasoning	`gpt-5.x`	Complex architecture analysis

The interface allows switching model mode temporarily.

Fast mode remains the default for cost and speed.
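As a minimal sketch, the mode-to-model mapping above could be expressed as a lookup with Fast as the fallback. The `ModelMode` type and `resolveModel` function are hypothetical names, and `gpt-5.x` is kept as the placeholder the table uses, not a concrete model ID.

```typescript
// Hypothetical mode identifiers; the model names come from the table above.
type ModelMode = 'fast' | 'deep' | 'architect';

const MODEL_BY_MODE: Record<ModelMode, string> = {
  fast: 'gpt-4.1-mini', // default: quick answers and documentation lookup
  deep: 'gpt-4.1',      // cross-document synthesis
  architect: 'gpt-5.x', // placeholder as written in the table, not a concrete model ID
};

// Falls back to Fast mode when the user has not switched modes.
function resolveModel(mode?: ModelMode): string {
  return MODEL_BY_MODE[mode ?? 'fast'];
}
```

Calling `resolveModel()` with no argument returns the Fast model, which matches the default described above.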

3. How Ask ZARA Works

3.1 Ask ZARA Reasoning Pipeline

Ask ZARA transforms raw search results into structured architecture context before invoking the reasoning model. This allows the system to explain subsystems, trace dependencies, and perform impact analysis across the ZAYAZ platform.

Ask ZARA uses a retrieval-augmented architecture.

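Pipeline: question embedding → vector search → linked retrieval expansion → lightweight reranking → architecture path building → LLM reasoning → response with source links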

Ask ZARA does not reason over flat search hits; it builds structured architecture paths from retrieved documentation, code, schema, and registry context before invoking the selected reasoning model.

The system retrieves documentation and system data from the ZAYAZ search index.

3.2. The ZAYAZ Knowledge Graph inside Ask ZARA

The ZAYAZ platform forms a knowledge graph of interconnected system primitives including engines, signals, tables, schemas, registries, documentation, and code.

Ask ZARA indexes these elements and reconstructs architecture paths that allow the reasoning model to analyze subsystem relationships and perform impact analysis across the platform.

3.3. Dependency-Aware Retrieval (Architecture Graph Expansion)

3.4. The ZAYAZ Index Graph Inside index.jsonl

Note: What You’re Looking At

Ask ZARA does not search documents — it navigates an architecture graph.

Although the index is stored as a flat index.jsonl file, each entry represents a node in the ZAYAZ architecture ecosystem: documentation sections, engines, signals, tables, schema fields, and code fragments.

Metadata such as used_by_engines, table_name, classification, and architecture_layer creates relationships between these nodes, forming a lightweight knowledge graph.

This allows Ask ZARA to follow architecture paths like: engine → signal → table → schema → downstream engine

Because of this structure, Ask ZARA can answer questions like:

Which engines depend on this signal?
What breaks if this API contract changes?
Where is this schema field used?

Instead of retrieving isolated text fragments, Ask ZARA performs architecture-aware retrieval and reasoning across the ZAYAZ platform.
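To illustrate how flat entries can behave like graph nodes, the sketch below answers "Which engines depend on this signal?" from metadata alone. It assumes each entry carries an `id` and a `kind` alongside `used_by_engines`; the `IndexNode` shape and `enginesDependingOn` function are hypothetical, not the actual index schema.

```typescript
// Hypothetical shape of one index.jsonl entry. Only the metadata names
// used_by_engines and table_name come from the text; id and kind are assumed.
interface IndexNode {
  id: string;
  kind: string;
  used_by_engines?: string[];
  table_name?: string;
}

// Answers "Which engines depend on this signal?" from node metadata alone.
function enginesDependingOn(nodes: IndexNode[], signalId: string): string[] {
  const engines = new Set<string>();
  for (const node of nodes) {
    if (node.kind !== 'signal' || node.id !== signalId) continue;
    for (const engine of node.used_by_engines ?? []) engines.add(engine);
  }
  return Array.from(engines).sort();
}
```

The same pattern extends to tables and schema fields by following `table_name` instead of `used_by_engines`.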

3.4.1. Diagram Reading Guide

The diagram above illustrates how the Ask ZARA search index behaves as a lightweight architecture knowledge graph. Although the underlying storage format is a flat index.jsonl file, each indexed entry represents a node in the ZAYAZ architecture ecosystem.

Node Types

Each box in the diagram represents a type of indexed node.

Node Type	Description
Engine / Doc Section	Sections from system documentation or engine specifications.
Associated Code	Code snippets or implementation files linked to a documentation section.
Signal	System signals used for communication between engines and modules.
Table	Data tables derived from registries, Excel sheets, or schemas.
Schema Field	Individual fields inside schemas or tables.
Relationship	Structural links between tables such as foreign keys or logical associations.

These nodes form the core architecture entities that Ask ZARA reasons about.

Structural Links

Solid arrows in the diagram represent structural relationships extracted during indexing.

Examples include:

an engine produces or consumes signals
a document describes a signal or table
a table contains schema fields
tables are connected via relationships

These links allow Ask ZARA to trace architecture paths such as:

engine → signal → table → schema field

table → relationship → downstream tables

Metadata Overlays

The dashed arrows represent metadata overlays that enrich each node.

Examples include:

architecture_cluster
architecture_layer
classification
tags

These metadata attributes help Ask ZARA:

group related subsystems
filter results by architectural layer
prioritize relevant entities during reranking
build higher-level explanations of platform components

Why This Matters

Because the indexed data includes both content and relationships, Ask ZARA can go beyond simple document retrieval.

Instead of answering questions by quoting text alone, it can analyze how components interact across the ZAYAZ platform.

This enables advanced queries such as:

Which engines depend on this signal?
What breaks if this table schema changes?
Which modules consume this API contract?

In effect, the index becomes a queryable architecture map that the reasoning model can explore.

4. Indexed Sources

The search index contains multiple structured sources.

Source	Description
MDX documentation	Platform design documentation
Associated code	Code examples linked to documentation
Explicit snippets	Code files referenced directly in docs
JSON registries	signals, schema fields, relationships
Excel registries	signal registries and other structured specifications

Each indexed document includes metadata such as:

kind
link_group
architecture_cluster
architecture_layer
classification
tags

This metadata enables architectural reasoning.

5. Document Kinds

The index assigns a kind to each document.

Kind	Meaning
`doc_page`	MDX documentation page
`doc_section`	MDX section
`associated_code`	Code example linked to documentation
`excel_sheet`	Excel specification sheet
`schema_field`	Database schema field
`signal`	Signal definition
`table_relationship`	Relationship between tables

This allows ZARA to distinguish between:

design documentation
implementation examples
structured system data
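A small sketch of that distinction, using the kinds from the table. The three category names and the assignment of `excel_sheet` to structured data are assumptions made for illustration.

```typescript
// Kinds as listed in the table above.
type DocKind =
  | 'doc_page'
  | 'doc_section'
  | 'associated_code'
  | 'excel_sheet'
  | 'schema_field'
  | 'signal'
  | 'table_relationship';

// Hypothetical category names for the three groups described in the text.
type DocCategory = 'design' | 'implementation' | 'structured_data';

function categorize(kind: DocKind): DocCategory {
  switch (kind) {
    case 'doc_page':
    case 'doc_section':
      return 'design';
    case 'associated_code':
      return 'implementation';
    case 'excel_sheet': // assumed to count as structured system data
    case 'schema_field':
    case 'signal':
    case 'table_relationship':
      return 'structured_data';
  }
}
```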

6. Architecture Clustering

Ask ZARA groups related documents using architecture clusters.

Cluster sources:

link_group (engine or subsystem)
table name
documentation slug root

Example:

architecture_cluster: aae

This groups together:

engine documentation
associated code
related signals
schema definitions
table relationships

This enables ZARA to reason about entire subsystems instead of isolated snippets.
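One way the cluster could be derived from the three sources is sketched below. The precedence order (link group, then table name, then slug root), the lowercasing, and the `deriveCluster` name are all assumptions; the actual indexer may combine these sources differently.

```typescript
// Hypothetical subset of document metadata used for clustering.
interface ClusterInput {
  link_group?: string;
  table_name?: string;
  slug?: string; // e.g. '/micro-engines/aae'
}

// Assumed precedence: link_group, then table_name, then the first slug segment.
function deriveCluster(doc: ClusterInput): string | undefined {
  if (doc.link_group) return doc.link_group.toLowerCase();
  if (doc.table_name) return doc.table_name.toLowerCase();
  const root = doc.slug?.split('/').filter(Boolean)[0];
  return root?.toLowerCase();
}
```

With this ordering, a page with `link_group: 'AAE'` lands in cluster `aae` regardless of where its slug lives.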

7. Architecture Layers

ZARA also extracts architecture layers from documentation tags.

Examples:

architecture_layer: tier-0, assurance, governance

These layers describe conceptual placement within the platform.

Typical layers include:

tier-0 / tier-1 / tier-2 / tier-3
governance
assurance
validation
trust
mice
meta-signal

Layers help ZARA explain how subsystems fit into the overall platform architecture.

8. Engine Classification

Micro-engines are classified by architectural role.

Examples:

classification: Assurance-Engine
classification: Contract-Engine
classification: Validation-Engine

Classification helps ZARA reason about:

deterministic engines
transformation engines
validation logic
orchestration engines

9. Architecture Path Context

Ask ZARA groups retrieved documents into architecture paths before passing them to the LLM.

Example context structure:

Architecture path: aae

- Documentation
- Implementation
- Tables and schema
- Signals
- Relationships

This allows the model to understand end-to-end subsystem structure.

The result is significantly better architecture explanations.

10. Conversation-Aware Retrieval

Ask ZARA also uses recent conversation history when retrieving documents.

This allows multi-step debugging conversations like:

User: Explain the Autonomous Assurance Engine
User: How does it interact with validation rules?
User: What would break if we modify those rules?

The retrieval query incorporates earlier user questions to preserve context.
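A minimal sketch of how earlier user questions might be folded into the retrieval query. The `Turn` shape, the `buildRetrievalQuery` name, and the default of two prior questions are assumptions, not the documented behavior.

```typescript
interface Turn {
  role: 'user' | 'assistant';
  content: string;
}

// Prepends the most recent user questions so follow-ups such as
// "those rules" still retrieve the subsystem under discussion.
function buildRetrievalQuery(history: Turn[], question: string, maxPriorQuestions = 2): string {
  const userQuestions = history
    .filter((turn) => turn.role === 'user')
    .map((turn) => turn.content.trim());
  // slice(-0) would return everything, so guard the zero case explicitly.
  const prior = maxPriorQuestions > 0 ? userQuestions.slice(-maxPriorQuestions) : [];
  return [...prior, question.trim()].join('\n');
}
```

Only user turns are reused here; including assistant answers would add noise to the embedding query, though that trade-off is a design choice, not something the source specifies.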

11. Impact Analysis

Ask ZARA is optimized for debugging and change analysis.

When users ask questions like:

What breaks if we change compute_method_registry?

ZARA attempts to trace:

upstream inputs
dependent tables
signals
downstream consumers
affected engines

This helps developers safely evolve the platform.
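The tracing idea can be pictured as a breadth-first walk over a dependency graph. Ask ZARA v2 relies on retrieval and prompting for this, and explicit graph traversal is listed under future improvements, so the sketch below is illustrative only. The `DependencyGraph` shape and `traceDownstream` function are hypothetical.

```typescript
// Edges point from a node to the nodes that consume it (assumed data shape).
type DependencyGraph = Record<string, string[]>;

// Breadth-first walk that collects everything reachable downstream of `start`.
function traceDownstream(graph: DependencyGraph, start: string): string[] {
  const visited = new Set<string>([start]);
  const queue: string[] = [start];
  const affected: string[] = [];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const next of graph[current] ?? []) {
      if (visited.has(next)) continue; // guards against cycles
      visited.add(next);
      affected.push(next);
      queue.push(next);
    }
  }
  return affected;
}
```

Given a graph where `compute_method_registry` feeds an engine, which emits a signal stored in a table, the function returns those three nodes in discovery order.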

12. Example Questions

Typical developer questions include:

How does PEF-ME connect to other micro-engines in the computation hub?

Explain the assurance pipeline and how AAE interacts with validation.

What breaks if compute_method_registry changes?

Which tables are produced by the GHG aggregation engines?

How does the Tagged Accounting Crawler feed the reporting pipeline?

13. Interface

The Ask ZARA interface provides:

conversational chat
model selection
source links for every answer
TODO tag search
architecture-aware responses

Each answer displays:

answered with Deep reasoning (gpt-4.1)

and includes links to source documentation.

14. Future Improvements

Potential future upgrades include:

architecture graph traversal
dependency chain visualization
subsystem diagrams
registry diff analysis
automated architecture validation

15. Version

Current version:

Ask ZARA v2

Key capabilities:

linked retrieval
architecture path grouping
architecture layers
subsystem clustering
conversation-aware retrieval
model switching
impact analysis prompting

Ask ZARA is designed to make the extremely complex ZAYAZ architecture understandable and navigable for engineers.

Here is a clean MDX section you can append to the previous page. It explains the architecture clearly and includes a diagram-friendly layout suitable for Docusaurus.

You can place this after section 3 or at the end of the page.

Ask ZARA Architecture Overview

Ask ZARA is built as a retrieval-augmented architecture reasoning system tailored specifically for the ZAYAZ platform.

It combines:

structured platform documentation
indexed code examples
registry and schema metadata
semantic search
architecture-aware context generation
LLM reasoning

The goal is to allow developers to explore and debug the ZAYAZ architecture conversationally.

High-Level Architecture

The architecture pipeline performs multiple steps before generating a response.

System Components

Documentation Indexer

The ZAYAZ Search Indexer scans and processes platform documentation and structured registries.

Sources indexed include:

MDX documentation
associated code examples
explicit snippet files
JSON registries
Excel registries
schema definitions
table relationships

Each indexed document receives metadata such as:

kind
link_group
architecture_cluster
architecture_layer
classification
tags

These metadata fields enable architecture-aware retrieval.

Embedding Generation

Each indexed document is converted into a semantic embedding using:

text-embedding-3-small

Embeddings are stored in the search index and used for similarity search.

Vector Search

When a developer asks a question:

The question is embedded
The vector index is searched
The most relevant documents are retrieved

Example retrieval results might include:

documentation sections
code examples
signals
schema fields
table relationships
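The search step can be sketched as a brute-force top-K lookup. Cosine similarity is an assumption here, since the text does not name the similarity metric, and `EmbeddedDoc`, `cosineSimilarity`, and `topK` are hypothetical names.

```typescript
interface EmbeddedDoc {
  id: string;
  embedding: number[];
}

// Cosine similarity for two vectors of equal length; returns 0 for zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Scores every document against the query embedding and keeps the best k.
function topK(query: number[], docs: EmbeddedDoc[], k: number): { id: string; score: number }[] {
  return docs
    .map((doc) => ({ id: doc.id, score: cosineSimilarity(query, doc.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

A linear scan like this is usually adequate for an index that fits in a single JSONL file; larger indexes would typically move to an approximate nearest-neighbor structure.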

Linked Retrieval Expansion

Retrieved documents are expanded using structural metadata.

Expansion sources include:

link_group (engine / subsystem)
table_name
schema_field
signal
table_relationship

This step ensures that ZARA sees complete architectural slices, not isolated snippets.
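A sketch of the expansion step under simplifying assumptions: it pulls in any indexed document that shares a `link_group` or `table_name` with an initial hit, capped at `maxExtra` additions. The `LinkedDoc` shape, the cap, and the `expandLinked` name are hypothetical.

```typescript
interface LinkedDoc {
  id: string;
  link_group?: string;
  table_name?: string;
}

// Adds documents that share a link_group or table_name with any initial hit.
function expandLinked(hits: LinkedDoc[], index: LinkedDoc[], maxExtra = 20): LinkedDoc[] {
  const groups = new Set<string>();
  const tables = new Set<string>();
  const seen = new Set<string>();
  for (const hit of hits) {
    seen.add(hit.id);
    if (hit.link_group) groups.add(hit.link_group);
    if (hit.table_name) tables.add(hit.table_name);
  }
  const expanded = [...hits];
  for (const doc of index) {
    if (expanded.length - hits.length >= maxExtra) break; // keep the context bounded
    if (seen.has(doc.id)) continue;
    const sameGroup = doc.link_group !== undefined && groups.has(doc.link_group);
    const sameTable = doc.table_name !== undefined && tables.has(doc.table_name);
    if (sameGroup || sameTable) {
      seen.add(doc.id);
      expanded.push(doc);
    }
  }
  return expanded;
}
```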

Lightweight Reranking

Retrieved documents are reranked based on:

semantic similarity
document kind
architectural relevance

For example:

documentation sections rank above snippets
explicit code examples rank above loosely associated code
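The reranking rule could be approximated by adding small boosts to the similarity score. The weights, the `explicitSnippet` flag, and the function names below are assumptions chosen only to reproduce the two ordering examples above.

```typescript
interface Candidate {
  id: string;
  kind: string;
  similarity: number;        // semantic similarity from vector search
  explicitSnippet?: boolean; // assumed flag: code file referenced directly in docs
}

// Assumed boosts; the real weights are not documented here.
const KIND_BOOST: Record<string, number> = {
  doc_section: 0.1,
  doc_page: 0.08,
  associated_code: 0.02,
};
const EXPLICIT_SNIPPET_BOOST = 0.05;

function rerankScore(doc: Candidate): number {
  const kindBoost = KIND_BOOST[doc.kind] ?? 0;
  const snippetBoost = doc.explicitSnippet ? EXPLICIT_SNIPPET_BOOST : 0;
  return doc.similarity + kindBoost + snippetBoost;
}

// Returns a new array sorted by boosted score, highest first.
function rerank(docs: Candidate[]): Candidate[] {
  return [...docs].sort((a, b) => rerankScore(b) - rerankScore(a));
}
```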

Architecture Path Builder

Documents are grouped into architecture paths before being sent to the LLM.

Example:

Architecture path: aae

Documentation
Implementation
Tables and schema
Signals
Relationships

This step allows ZARA to reason about entire subsystems.
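The grouping can be sketched as a two-level map: cluster first, then section. The kind-to-section mapping and the `buildArchitecturePaths` and `renderPath` names are assumptions; only the section labels and the `Architecture path:` header come from the example above.

```typescript
type Section = 'Documentation' | 'Implementation' | 'Tables and schema' | 'Signals' | 'Relationships';

interface RetrievedDoc {
  title: string;
  kind: string;
  architecture_cluster: string;
}

// Assumed mapping from document kind to path section.
const SECTION_BY_KIND: Record<string, Section> = {
  doc_page: 'Documentation',
  doc_section: 'Documentation',
  associated_code: 'Implementation',
  excel_sheet: 'Tables and schema',
  schema_field: 'Tables and schema',
  signal: 'Signals',
  table_relationship: 'Relationships',
};

function buildArchitecturePaths(docs: RetrievedDoc[]): Map<string, Map<Section, string[]>> {
  const paths = new Map<string, Map<Section, string[]>>();
  for (const doc of docs) {
    const section = SECTION_BY_KIND[doc.kind];
    if (!section) continue; // unknown kinds are skipped in this sketch
    let path = paths.get(doc.architecture_cluster);
    if (!path) {
      path = new Map<Section, string[]>();
      paths.set(doc.architecture_cluster, path);
    }
    const titles = path.get(section) ?? [];
    titles.push(doc.title);
    path.set(section, titles);
  }
  return paths;
}

// Renders one path as the text block passed to the model.
function renderPath(cluster: string, path: Map<Section, string[]>): string {
  const lines: string[] = [`Architecture path: ${cluster}`];
  path.forEach((titles, section) => {
    lines.push('', section, ...titles.map((title) => `  - ${title}`));
  });
  return lines.join('\n');
}
```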

LLM Reasoning

The structured architecture context is sent to the selected model.

Typical models:

Mode	Model
Fast	gpt-4.1-mini
Deep reasoning	gpt-4.1
Architect reasoning	gpt-5.x

The prompt instructs the model to:

prioritize ZAYAZ documentation
analyze architecture paths
perform impact analysis
reference sources explicitly

Response Generation

The final response includes:

a structured explanation
architecture reasoning
relevant documentation references
source links

Example:

answered with Deep reasoning (gpt-4.1)

Sources appear directly beneath the answer.

Key Design Principles

Ask ZARA is built around several architectural principles.

Architecture First

ZARA is optimized for understanding system architecture, not just answering documentation queries.

Linked Knowledge

Documentation, code, schemas, and registries are linked into a unified knowledge graph.

Structured Context

Architecture paths and metadata allow the LLM to reason about system structure.

Developer-Oriented Reasoning

ZARA prioritizes:

debugging workflows
architecture exploration
impact analysis
cross-engine reasoning

Example Developer Workflow

A typical debugging session may look like:

Developer:
Explain the Autonomous Assurance Engine.

Developer:
How does it interact with validation rules?

Developer:
What breaks if we change compute_method_registry?

Ask ZARA retrieves relevant architecture paths and traces:

upstream dependencies
downstream consumers
related engines
schema relationships

This allows developers to safely evolve the platform.

System Summary

Ask ZARA v2 combines:

semantic search
architecture clustering
subsystem path grouping
conversation-aware retrieval
multi-model reasoning

This architecture allows developers to navigate and debug the complex ZAYAZ platform through natural language.

Indexing Architecture

Before Ask ZARA can answer questions, the ZAYAZ platform documentation and registries are converted into a semantic architecture index.

This indexing process runs offline and produces the index.jsonl file used by the search API.

Indexing Steps

Source Collection

The indexer scans the ZAYAZ documentation repository and collects structured sources:

MDX documentation
associated code examples
explicit code snippets
JSON registries
Excel registries
schema definitions
table relationships

These represent the design, implementation, and data model layers of the platform.

Document Chunking

Large documents are split into smaller semantic chunks such as:

documentation sections
schema field definitions
signal definitions
code example blocks

Chunking ensures that semantic search retrieves precise architecture fragments rather than entire documents.
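For documentation pages, chunking might look like a split on Markdown headings. This sketch assumes ATX-style headings (`#` to `######`) and ignores edge cases such as headings inside fenced code blocks; `Chunk` and `chunkByHeading` are hypothetical names.

```typescript
interface Chunk {
  heading: string; // empty for content before the first heading
  content: string;
}

// Splits a Markdown/MDX string into one chunk per heading.
function chunkByHeading(mdx: string): Chunk[] {
  const chunks: Chunk[] = [];
  let current: Chunk = { heading: '', content: '' };
  const flush = (): void => {
    if (current.heading || current.content.trim()) {
      chunks.push({ heading: current.heading, content: current.content.trim() });
    }
  };
  for (const line of mdx.split('\n')) {
    const match = /^#{1,6}\s+(.*)$/.exec(line);
    if (match) {
      flush(); // close the previous section before starting a new one
      current = { heading: (match[1] ?? '').trim(), content: '' };
    } else {
      current.content += line + '\n';
    }
  }
  flush();
  return chunks;
}
```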

Metadata Enrichment

Each chunk receives structured metadata, including:

kind
slug
title
heading
link_group
architecture_cluster
architecture_layer
classification
tags

This metadata allows the retrieval engine to reason about system structure.

Embedding Generation

Each chunk is converted into a vector embedding using:

text-embedding-3-small

Embeddings capture semantic meaning and enable similarity search.

Index Generation

All documents and embeddings are written to the index:

data/index.jsonl

Each line contains a fully indexed document:

content
metadata
embedding
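Reading such a file back could look like the sketch below. It assumes `content`, `metadata`, and `embedding` are top-level keys of each JSON line, which the text implies but does not state; `IndexRecord` and `parseIndexJsonl` are hypothetical names.

```typescript
// Assumed record layout: one JSON object per line with these three keys.
interface IndexRecord {
  content: string;
  metadata: Record<string, unknown>;
  embedding: number[];
}

// Parses JSONL text, skipping blank lines and rejecting malformed records.
function parseIndexJsonl(text: string): IndexRecord[] {
  return text
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line, i) => {
      const parsed = JSON.parse(line) as Partial<IndexRecord>;
      if (typeof parsed.content !== 'string' || !Array.isArray(parsed.embedding)) {
        throw new Error(`Invalid index record ${i + 1}`);
      }
      return {
        content: parsed.content,
        metadata: parsed.metadata ?? {},
        embedding: parsed.embedding,
      };
    });
}
```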

Search API

The ZAYAZ Search API loads the index and provides endpoints for:

semantic search
TODO tag discovery
architecture-aware retrieval
Ask ZARA queries

Relationship to Ask ZARA

The index provides the knowledge layer for Ask ZARA.

Ask ZARA then adds:

linked retrieval
architecture path grouping
conversation-aware retrieval
LLM reasoning

Together, these components allow ZARA to perform deep architectural reasoning across the entire ZAYAZ platform.

Future Extensions — Tool Invocation

When we talk about “tools” for Ask ZARA we mean:

Server-side capabilities that the model can invoke as functions to do things for the user — not just answer from static context.

Examples:

Search tool — searchDocs(query, filters) (even richer than what we already do implicitly).
Spec explorer tool — getSpecBySlug(slug) or listEngines() (e.g. “show me all micro-engines that use PEF-ME”).
Engine introspection tool — getEngineSpec(engineId) (parse the engine’s spec file into structured metadata).
Template/tooling tool — e.g. getRFCTemplate(type) (return a template for change requests, data contracts, etc).

Under the hood that means:

We keep using the same logs, but:
- We tag tool calls in logs for observability.
- We may add structured tool-call logs for debugging.
Tools typically operate over:
- The existing search index (fast, stable).
- The file system / repo (for structured spec parsing or JSON registries).
- Optional metadata stores (e.g. per-engine metadata, memory store, etc).

Architecture Context Construction

A key capability of Ask ZARA is its ability to transform raw search results into architecture-aware context before sending them to the LLM.

Instead of presenting a flat list of snippets, ZARA builds architecture paths that reflect the structure of the ZAYAZ platform.

This allows the model to reason about subsystems, engines, schemas, and signals as connected components.

Architecture Path Structure

Each architecture path groups documents that belong to the same subsystem.

Example structure:

Architecture path: aae

Documentation
  - Autonomous Assurance Engine overview
  - Validation interaction documentation

Implementation
  - aae/validator.ts
  - governance-check.ts

Tables and schema
  - assurance_registry
  - validation_rule_registry

Signals
  - assurance_score
  - governance_violation_flag

Relationships
  - validation_rule_registry → assurance_registry

This representation allows the model to see how components interact inside the subsystem.

Context Metadata

Each document block in the architecture path contains structured metadata:

slug: /micro-engines/aae
kind: doc_section
architecture_cluster: aae
architecture_layer: tier-0, assurance, governance
classification: Assurance-Engine
tags: mice, assurance, validation, governance, trust

These signals allow ZARA to reason about:

subsystem boundaries
architectural layers
engine roles
schema dependencies

Why Architecture Paths Matter

Without architecture paths, the model would see something like:

doc snippet
code snippet
schema field
signal
documentation paragraph

This flat structure makes it difficult for the model to understand system relationships.

With architecture paths, ZARA sees:

Subsystem: Autonomous Assurance Engine
  documentation
  implementation
  tables
  signals
  relationships

This dramatically improves the model’s ability to perform:

architecture explanations
cross-engine reasoning
debugging analysis
impact analysis

Example Developer Question

Example debugging query:

What breaks if we modify compute_method_registry?

Ask ZARA will attempt to trace:

engines that reference the registry
signals produced by those engines
tables storing those signals
downstream reporting pipelines

The architecture path structure makes this reasoning possible.

Summary

Architecture paths are the core mechanism that allows Ask ZARA to move beyond simple document search and perform architecture-aware reasoning across the ZAYAZ platform.

Combined with linked retrieval, subsystem clustering, and architecture layers, this approach enables developers to explore the platform in a way that mirrors how engineers think about complex systems.

Design Philosophy of Ask ZARA

Ask ZARA was designed to support developers working inside a large, deeply interconnected platform. Traditional documentation search is insufficient for such systems because the most important information is often distributed across multiple documents, code examples, schema definitions, and registries.

Instead of treating documentation as isolated pages, Ask ZARA treats the ZAYAZ platform as an architecture graph.

The system therefore focuses on three core principles:

Architecture-aware retrieval

ZARA retrieves related information across:

documentation
implementation examples
schema definitions
signal registries
table relationships

This ensures that answers reflect how components interact, not just how they are described individually.

Subsystem context

Rather than presenting flat search results, ZARA constructs architecture paths that group information belonging to the same subsystem or micro-engine cluster.

This allows the model to reason about:

engine responsibilities
upstream and downstream dependencies
shared primitives and tier relationships
signal and table flows

Retrieval before reasoning

ZARA is intentionally built as a retrieval-first system.

The LLM is not expected to know the ZAYAZ platform.
Instead, it is given structured architectural context extracted directly from the platform documentation and registries.

The model's role is therefore to:

synthesize the retrieved architecture
explain system behavior
trace dependencies
assist with debugging and design reasoning

Designed for engineers

Ask ZARA is not a general chatbot.
It is an engineering reasoning assistant for the ZAYAZ platform.

Typical use cases include:

understanding complex engine interactions
tracing signal or table dependencies
evaluating the impact of architectural changes
identifying potential overlap between engines
assisting with debugging and refactoring

By combining semantic retrieval with architecture-aware context construction, Ask ZARA enables developers to explore the platform in the same structural way that the system itself is designed.

APPENDIX A - Ask ZARA Capability Roadmap

A.1. Phase 1 — Tool Capabilities

A.1.1 Goals

Turn Ask ZARA from a “RAG QA bot” into a developer assistant that can:
- Look up specs in more targeted ways.
- Traverse relationships between engines, modules, signals.
- Propose actions (e.g. create an RFC skeleton).

A.1.2 Tooling concept

We define a tool layer in the API that the model can call via function-calling.

Example tool interface (conceptual):

type ZaraTool =
  | { type: 'search_docs'; args: { query: string; topK?: number; filterSlugPrefix?: string } }
  | { type: 'get_engine_spec'; args: { engineId: string } }
  | { type: 'list_engines'; args: { hub?: string; module?: string } }
  | { type: 'get_template'; args: { kind: 'rfc' | 'data_contract' | 'engine_spec' } };

The model gets a tool schema in its system prompt and can respond with:

{
  "tool_call": {
    "type": "search_docs",
    "args": { "query": "COSE risk scoring micro engine", "topK": 5 }
  }
}

The backend:

Parses the tool call.
Executes the corresponding function.
Feeds the result back to the model as additional context.
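The three backend steps might be sketched as a parser plus a dispatcher over the conceptual `ZaraTool` union. The handlers below are stubs that return placeholder data, and `parseToolCall` and `executeTool` are hypothetical names; the real tool layer is not specified here.

```typescript
type ZaraTool =
  | { type: 'search_docs'; args: { query: string; topK?: number; filterSlugPrefix?: string } }
  | { type: 'get_engine_spec'; args: { engineId: string } }
  | { type: 'list_engines'; args: { hub?: string; module?: string } }
  | { type: 'get_template'; args: { kind: 'rfc' | 'data_contract' | 'engine_spec' } };

const KNOWN_TOOLS = ['search_docs', 'get_engine_spec', 'list_engines', 'get_template'];

// Step 1: parse the model output; returns undefined when it is not a valid tool call.
function parseToolCall(raw: string): ZaraTool | undefined {
  let parsed: { tool_call?: { type?: unknown; args?: unknown } };
  try {
    parsed = JSON.parse(raw);
  } catch {
    return undefined;
  }
  const call = parsed?.tool_call;
  if (!call || typeof call.type !== 'string' || KNOWN_TOOLS.indexOf(call.type) === -1) return undefined;
  if (typeof call.args !== 'object' || call.args === null) return undefined;
  return call as unknown as ZaraTool; // argument fields are not validated in this sketch
}

// Step 2: execute the matching function (stubbed with placeholder results).
function executeTool(call: ZaraTool): Record<string, unknown> {
  switch (call.type) {
    case 'search_docs':
      return { hits: [], query: call.args.query, topK: call.args.topK ?? 5 };
    case 'get_engine_spec':
      return { engineId: call.args.engineId, coreSections: [] };
    case 'list_engines':
      return { engines: [], hub: call.args.hub ?? null };
    case 'get_template':
      return { kind: call.args.kind, sections: [] };
  }
}

// Step 3: serialize the result so it can be appended to the model context.
function toolResultAsContext(call: ZaraTool): string {
  return `Tool ${call.type} returned: ${JSON.stringify(executeTool(call))}`;
}
```

Rejecting unknown tool names before dispatch keeps the model from invoking anything outside the declared tool schema.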

A.1.3 Concrete tools for ZAYAZ

A.1.3.1 search_docs

Inputs:
- query: string
- topK?: number
- filterSlugPrefix?: string (/micro-engines, /system-info, /under-development…)
Output:
- { hits: SearchResult[] } (we already have this shape)

A.1.3.2 get_engine_spec

Inputs:
- engineId: string (e.g. pef-me, sem, carbie)
Implementation (v1):
- Map engine IDs → canonical spec slugs (e.g. /micro-engines/pef-me).
- Use search_docs with a strong slug filter.
Implementation (v2):
- Maintain a small JSON registry:
  - config/system/engine_registry.json keyed by engineId.
  - Contains slug, hub, module, owners, status, tags.
Output:
- engineMeta (ID, title, slug, hub, module, status).
- coreSections (overview, API, tables, inputs/outputs).
- relationships (signals, tables, other engines).

This powers queries like:

“Show me everything about PEF-ME.”
“Which engines depend on ZHIF?”

…and gives ZARA structured data to reason over.

A.1.3.3 list_engines

Inputs: filters on hub, module, status, family.
Output: array of engineMeta from the registry.

Good for:

“What micro-engines live in the Computation Hub?”
“Which engines are under development for EPDEX?”

A.1.3.4 get_template

Inputs: kind ('rfc' | 'data_contract' | 'engine_spec' etc.).
Implementation:
- Store templates in static MD/JSON files under config/system/templates.
- Tools read them and return a normalized structure: name, description, sections, fields.
Output: a template + guidance that ZARA can fill in / customize.

Use cases:

“Draft an RFC for changing PEF-ME to support multi-tenant setup.”
“Give me a proto data contract for dim_suppliers.”

A.2. Phase 2 — Conversation Persistence

A.2.1 Goals

ZARA should remember context across turns:
- Current engine and module(s) being discussed.
- Decisions already made in the conversation.
- Follow-ups like “do the same for COSE.”

A.2.2 Architecture options

We can keep the design storage-agnostic initially:

Option A (simple): store recent turns in the browser only, but:
- Include them in each API call as history.
- Server does not persist anything long-term.
Option B (internal): add a small session store:
- Minimal DB (SQLite, LiteFS, Postgres, Redis).
- Keyed by sessionId passed from frontend (cookie or URL param).
- Each Ask call:
  1. Loads last N turns.
  2. Appends new turn + answer.
  3. Saves back.

We can start with Option A and leave B as a config flag.

A.2.3 API changes

Current request body is something like:

interface AskRequestBody {
  question: string;
}

Proposed:

interface AskHistoryTurn {
  role: 'user' | 'assistant';
  content: string;
}

interface AskRequestBody {
  sessionId?: string;          // optional, for future persistence
  question: string;
  history?: AskHistoryTurn[];  // recent turns
}

Backend:

Validates question.
Trims / compresses history to a sane token budget (e.g. last 4–6 messages).
Builds a combined chat prompt with:
- System + developer messages (ZAYAZ guardrails).
- Relevant index context.
- History.
- Latest question.
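A sketch of the prompt assembly, assuming a simple message-count cut-off instead of true token counting. `ChatMessage`, `buildMessages`, and the default of six history messages are assumptions for illustration.

```typescript
interface AskHistoryTurn {
  role: 'user' | 'assistant';
  content: string;
}

interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Assembles the prompt in the order listed above:
// guardrails, index context, trimmed history, latest question.
function buildMessages(
  systemPrompt: string,
  indexContext: string,
  history: AskHistoryTurn[],
  question: string,
  maxHistory = 6,
): ChatMessage[] {
  // Count-based trimming; a token-based budget would be more precise.
  const recent = maxHistory > 0 ? history.slice(-maxHistory) : [];
  return [
    { role: 'system', content: systemPrompt },
    { role: 'system', content: `Context:\n${indexContext}` },
    ...recent,
    { role: 'user', content: question },
  ];
}
```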

Response remains:

interface AskResponseBody {
  answer: string;
  sources: { slug: string; score: number }[];
}

Later we can add:

memorySummary?: string;   // internal summary for long sessions

A.3. Phase 3 — Live Engine Introspection

A.3.1 Concept

ZARA should be able to:

“Look into an engine spec the way a reviewer or architect would, not just search around it.”

Examples:

“Explain the full I/O contract for PEF-ME.”
“Compare PEF-ME and SEM in terms of uncertainty propagation.”
“Which tables and signals does COSE depend on?”

A.3.2 Data sources

We already have:

Engine specs under computation-hub/mice-micro-engines/*.mdx.
Signals registry under config/system/signal_registry.json.
Table relationships under config/system/table_relationships.json.
Schemas under docusaurus/static/schemas/*.

Engine introspection will:

Use engine_registry to resolve engineId → paths.
Parse or retrieve those docs with a mix of:
- Targeted search (for specific sections by heading).
- Direct file read (if needed, e.g. Node/TS snippet).
Build structured metadata for the engine:
- inputs, outputs, tables, signals, dependencies.

A.3.3 Tools for introspection

New tools that complement get_engine_spec:

A.3.3.1 get_engine_io

Input: engineId
Output: structured view like:

interface EngineIO {
  engineId: string;
  tablesIn: string[];
  tablesOut: string[];
  signalsIn: string[];
  signalsOut: string[];
  schemas: string[];          // schema filenames / slugs
}

Implementation strategy:

Prefer JSON registries (if we maintain one).
Else, do targeted search:
- Query engineId with filters.
- Look for “All Tables”, “Inputs”, “Outputs” headings.
- Apply heuristics or mini-parsers on those sections.

A.3.3.2 get_related_engines

Input: engineId
Output: engines that:
- Share tables.
- Share signals.
- Are marked as upstream/downstream in relationships JSON.

Use cases:

“What’s upstream of PEF-ME?”
“If we change dim_suppliers, which engines are impacted?”
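The shared-table and shared-signal part of this tool could be computed directly from `EngineIO` records, as sketched below. The `RelatedEngine` shape and function names are hypothetical, and the upstream/downstream markers from the relationships JSON are not covered.

```typescript
// EngineIO mirrors the interface proposed in A.3.3.1.
interface EngineIO {
  engineId: string;
  tablesIn: string[];
  tablesOut: string[];
  signalsIn: string[];
  signalsOut: string[];
  schemas: string[];
}

interface RelatedEngine {
  engineId: string;
  sharedTables: string[];
  sharedSignals: string[];
}

// Values present in both lists, without duplicates.
function intersect(a: string[], b: string[]): string[] {
  const inB = new Set(b);
  return Array.from(new Set(a.filter((value) => inB.has(value))));
}

// Engines that touch at least one table or signal the target also touches.
function getRelatedEngines(target: EngineIO, all: EngineIO[]): RelatedEngine[] {
  const targetTables = [...target.tablesIn, ...target.tablesOut];
  const targetSignals = [...target.signalsIn, ...target.signalsOut];
  return all
    .filter((engine) => engine.engineId !== target.engineId)
    .map((engine) => ({
      engineId: engine.engineId,
      sharedTables: intersect(targetTables, [...engine.tablesIn, ...engine.tablesOut]),
      sharedSignals: intersect(targetSignals, [...engine.signalsIn, ...engine.signalsOut]),
    }))
    .filter((related) => related.sharedTables.length > 0 || related.sharedSignals.length > 0);
}
```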

A.3.3.3 diff_engines

Inputs: engineIdA, engineIdB, focus (e.g. 'io' | 'risk' | 'uncertainty').
Output: a structured diff summary for the model to turn into prose.

A.4. Phase 4 — Hardening the API (Production-ready)

A.4.1 Security & access

Auth layer:
- API key or internal SSO for developers.
- Later: per-user auth with scopes (read / propose / execute tools).
Rate limits:
- Per IP/key.
- Per session.
- Distinct budgets for /search vs /ask vs heavy tools.

A.4.2 Observability

Structured logs for:
- Incoming requests (anonymized question & meta).
- Tool invocations (type, duration, success/failure).
- Model latency & token usage.
Dashboard:
- Volume of queries.
- Top endpoints.
- Error & timeout rates.

A.4.3 Reliability & infra

Fly.io:
- Multiple regions (later).
- min_machines_running tuned for responsiveness.
- Health checks for /health.
Index management:
- Stable process to regenerate index (CI job / manual script).
- Versioned index file (e.g. index.vN.jsonl with symlink index.jsonl).
- Optional: store index in an object store (R2, S3) and sync.

A.5. Roadmap Summary

v0.1 (today)

Ask ZARA:
- Single-turn Q&A with citations.
- RAG over ZAYAZ docs.
Search:
- Semantic search with filters, citations, dedicated page.

v0.2 — Tools

Implement backend tool layer:
- search_docs, get_engine_spec, list_engines, get_template.
Expose limited tool calling to the model.
Keep logging & CORS simple.

v0.3 — Memory

Extend /ask to accept history and optional sessionId.
Keep in-browser history for now.
Add simple truncation/summary logic on the backend.

v0.4 — Engine Introspection

Introduce:
- engine_registry.json.
- Introspection tools: get_engine_io, get_related_engines, diff_engines.
Teach ZARA to answer “pipeline” and “impact” questions.

v1.0 — Production-grade

Add auth, rate limiting, stronger observability.
Consider moving memory & metadata to a managed DB.
Tune deployment strategy (regions, scaling) based on usage.

APPENDIX B - Early Ask ZARA Architecture (v0.1)

B.1. Current State (v0.1)

B.1.1 High-level flow

Frontend
- Docusaurus page /ask renders the Ask ZARA chat UI.
- User enters a question → frontend sends POST {ASK_API_BASE}/ask.
- We already:
  - Maintain a local chat transcript in the browser.
  - Display citations as [n] source /path/to/doc.
Backend
- zayaz-search-api exposes:
  - POST /search — semantic search over the docs index (used by /search).
  - POST /ask — Ask ZARA chat endpoint.
- The API:
  - Loads embedded search index from /app/data/index.jsonl.
  - Calls OpenAI:
    - Embeddings: text-embedding-3-small
    - Chat model: gpt-4.1-mini (configurable via ZAYAZ_ASK_MODEL).
  - Implements a RAG loop:
    1. Take the user question.
    2. Query our vector index.
    3. Build a system + context prompt with top-K search hits.
    4. Ask the model for an answer.
    5. Return: answer text + the list of sources used.

B.1.2 What we already have

Vector search over:
- MDX pages (specs, system info, guides).
- JSON registries (signals, table relationships, schemas).
- Future Excel-derived docs, if we ingest them.
Citations and hyperlinks in the UI.
CORS set up so both GitHub Codespaces and specs.zayaz.io can call it.
