iConsult Architecture Review

Financial Research Agent — Architecture Consultation

Based on Agentic Architectural Patterns for Building Multi-Agent Systems (Arsanjani & Bustos, Packt 2026). Consultation date: 2026-03-24.

Executive Brief
This multi-agent financial research pipeline demonstrates strong architectural foundations: a clean supervisor pattern, effective agent-as-tool delegation, parallel fan-out for search, and structured Pydantic outputs. The most impactful opportunity is closing the verification feedback loop: the VerifierAgent audits the report but cannot trigger a rewrite, which leaves the Self-Correction pattern unrealized. Two further changes would elevate this system from a well-structured L2 pipeline to a resilient, self-correcting L4 architecture: adding retry logic with watchdog timeouts to the search fan-out (the failure recovery chain currently has 0% coverage), and introducing shared epistemic memory for cross-agent context.

System Under Review

OpenAI Agents SDK Multi-Agent Pipeline

Architecture

Centralized Supervisor with sequential pipeline: Plan → Search (parallel fan-out) → Write (with sub-agent tools) → Verify

Tech Stack

Python, OpenAI Agents SDK, asyncio, Pydantic, Rich (console UI), GPT-5.4 / o3-mini

Coordination

FinancialResearchManager orchestrator with Runner.run() invocations, asyncio.create_task for parallel search, agent.as_tool() for sub-analyst delegation

Security

No explicit security controls — no auth, no input validation, no rate limiting

Agent Roster

📋 PlannerAgent

Decomposes user queries into 5-15 financial search terms with reasoning

Pydantic structured output (FinancialSearchPlan)

🔍 SearchAgent

Executes web searches and produces 300-word financial summaries

WebSearchTool

📊 FundamentalsAnalystAgent

Analyzes company financials: revenue, margins, growth trajectory

Exposed as tool via as_tool()

⚠️ RiskAnalystAgent

Identifies risk factors: competitive threats, regulatory issues, supply chain

Exposed as tool via as_tool()

✍️ WriterAgent

Synthesizes search results + analyst outputs into markdown report with executive summary

fundamentals_analysis, risk_analysis

VerifierAgent

Audits report for internal consistency, sourcing, and unsupported claims

Pydantic structured output (VerificationResult)

Maturity Assessment

Coordination & Planning: Established
Explainability & Compliance: Not Started
Robustness & Fault Tolerance: Not Started
Human-Agent Interaction: Emerging
Agent-Level Capabilities: Emerging
System-Level Infrastructure: Not Started
Continuous Improvement: Not Started

Maturity Scorecard

Each pattern is assessed against the Ch. 12 rubric.

Coordination & Planning

✓ Established
| Pattern | Level | Status |
| --- | --- | --- |
| Supervisor Architecture | Basic | Implemented |
| Multi-Agent Planning | Basic | Implemented |
| Hybrid Delegation Framework | Intermediate | Not Assessed |
| Shared Epistemic Memory | Intermediate | Missing |
| Consensus, Negotiation & Conflict Resolution | Advanced | Not Assessed |

Explainability & Compliance

— Not Started
| Pattern | Level | Status |
| --- | --- | --- |
| Basic Audit Logging | Basic | Not Assessed |
| Instruction Fidelity Auditing | Intermediate | Missing |
| Persistent Instruction Anchoring | Intermediate | Missing |
| Causal Dependency Graph | Intermediate | Not Assessed |
| Fractal CoT Embedding | Advanced | Missing |

Robustness & Fault Tolerance

— Not Started
| Pattern | Level | Status |
| --- | --- | --- |
| Watchdog Timeout | Basic | Not Assessed |
| Simple Retry Mechanism | Basic | Missing |
| Adaptive Retry with Prompt Mutation | Intermediate | Not Assessed |
| Auto-Healing Agent Resuscitation | Intermediate | Not Assessed |
| Incremental Checkpointing | Intermediate | Missing |
| Fallback Model Invocation | Intermediate | Not Assessed |
| Rate-Limited Invocation | Intermediate | Not Assessed |
| Majority Voting Across Agents | Advanced | Not Assessed |
| Trust Decay & Scoring | Advanced | Not Assessed |
| Canary Agent Testing | Advanced | Not Assessed |

Human-Agent Interaction

▲ Emerging
| Pattern | Level | Status |
| --- | --- | --- |
| Human Calls Agent | Basic | Not Assessed |
| Agent Calls Human | Basic | Not Assessed |
| Agent Delegates to Agent | Intermediate | Implemented |
| Agent Calls Proxy Agent | Intermediate | Not Assessed |

Agent-Level Capabilities

▲ Emerging
| Pattern | Level | Status |
| --- | --- | --- |
| Single Agent Baseline | Basic | Implemented |
| Agent-Specific Memory (Short-Term) | Basic | Not Assessed |
| Context-Aware Retrieval (Simple RAG) | Basic | Not Assessed |
| Advanced RAG | Intermediate | Not Assessed |
| Agentic RAG & Graph-Vector Hybrid Retrieval | Advanced | Not Assessed |

System-Level Infrastructure

— Not Started
| Pattern | Level | Status |
| --- | --- | --- |
| Agent Authentication & Authorization | Basic | Not Assessed |
| Tool & Agent Registry | Intermediate | Not Assessed |
| Event-Driven Reactivity | Intermediate | Missing |

Continuous Improvement

— Not Started
| Pattern | Level | Status |
| --- | --- | --- |
| Hybrid Workflow Agent Architecture (Planner + Scorer) | Advanced | Not Assessed |
| Coevolved Agent Training | Advanced | N/A |
| Preference-Controlled Synthetic Data Generation | Advanced | Not Assessed |
| Custom Evaluation Metrics | Advanced | Missing |

Architecture: Current & Target

Blue = existing, red dashed = opportunities for growth, green = new additions in target state.

Current Architecture
flowchart TD
    User([User Query]) --> Manager[FinancialResearchManager]
    Manager --> Planner[PlannerAgent\no3-mini]
    Planner --> |FinancialSearchPlan| FanOut{Parallel Fan-Out}
    FanOut --> S1[SearchAgent 1]
    FanOut --> S2[SearchAgent 2]
    FanOut --> SN[SearchAgent N]
    S1 --> Collect[Collect Results]
    S2 --> Collect
    SN --> Collect
    Collect --> Writer[WriterAgent\ngpt-5.4]
    Writer -.-> |as_tool| Fundamentals[FundamentalsAnalystAgent]
    Writer -.-> |as_tool| Risk[RiskAnalystAgent]
    Fundamentals -.-> Writer
    Risk -.-> Writer
    Writer --> |FinancialReportData| Verifier[VerifierAgent\ngpt-5.4]
    Verifier --> Output([Print Report])

    classDef existing fill:#4A90D9,stroke:#333,color:white
    classDef tool fill:#2ECC71,stroke:#333,color:white
    classDef io fill:#95a5a6,stroke:#333,color:white

    class Manager,Planner,S1,S2,SN,Writer,Verifier existing
    class Fundamentals,Risk tool
    class User,Output,FanOut,Collect io
    
Target Architecture
flowchart TD
    User([User Query]) --> Manager[FinancialResearchManager]
    Manager --> Planner[PlannerAgent\no3-mini]
    Planner --> |FinancialSearchPlan| FanOut{Parallel Fan-Out}
    FanOut --> S1[SearchAgent 1]
    FanOut --> S2[SearchAgent 2]
    FanOut --> SN[SearchAgent N]
    S1 --> Collect[Collect Results]
    S2 --> Collect
    SN --> Collect

    FanOut -.-> WD[Watchdog Timeout\nSupervisor]:::opportunity
    S1 -.-> RT[Adaptive Retry\n+ Prompt Mutation]:::opportunity
    S2 -.-> RT
    SN -.-> RT

    Collect --> CP1[Checkpoint\nSearch Results]:::opportunity
    CP1 --> SharedMem[(Shared Epistemic\nMemory)]:::newpattern
    SharedMem --> Writer[WriterAgent\ngpt-5.4]
    Writer -.-> |as_tool| Fundamentals[FundamentalsAnalystAgent]
    Writer -.-> |as_tool| Risk[RiskAnalystAgent]
    Fundamentals -.-> Writer
    Risk -.-> Writer

    Writer --> |FinancialReportData| Verifier[VerifierAgent\ngpt-5.4\n+ Scoring Rubric]:::newpattern
    Verifier --> |Pass| Output([Print Report])
    Verifier --> |Fail + Feedback| Writer
    Verifier -.-> Metrics[Custom Evaluation\nMetrics]:::opportunity

    classDef existing fill:#4A90D9,stroke:#333,color:white
    classDef tool fill:#2ECC71,stroke:#333,color:white
    classDef io fill:#95a5a6,stroke:#333,color:white
    classDef opportunity fill:none,stroke:#E74C3C,stroke-dasharray:5 5,color:#E74C3C
    classDef newpattern fill:#27AE60,stroke:#333,color:white

    class Manager,Planner,S1,S2,SN,Writer existing
    class Fundamentals,Risk tool
    class User,Output,FanOut,Collect io
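The Pass and Fail + Feedback edges between VerifierAgent and WriterAgent in the target diagram amount to a bounded revise loop. The sketch below shows one way that loop could look. The `Verdict` type and the `write` and `verify` callables are hypothetical stand-ins for the real agent invocations, not the SDK's API or the project's actual types.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Verdict:
    """Hypothetical stand-in for the verifier's structured output."""
    verified: bool
    issues: str = ""


def write_until_verified(
    write: Callable[[Optional[str]], str],
    verify: Callable[[str], Verdict],
    max_rounds: int = 3,
) -> Tuple[str, Verdict]:
    """Re-run the writer with verifier feedback until it passes or rounds run out."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        report = write(feedback)      # first round has no feedback
        verdict = verify(report)
        if verdict.verified:
            break
        feedback = verdict.issues     # feed the audit findings back to the writer
    return report, verdict
```

Capping the rounds keeps cost bounded. If the final verdict is still a failure, the caller can decide whether to print the report with a warning or escalate.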
    

Implementation Recommendations

Phase 1: Explainability & Compliance (currently Not Started)

Basic Audit Logging High

Basic pattern

Key capabilities: Agent actions are logged with timestamps, agent IDs, and outcome status to a file or console output; Log entries are generated before and after critical agent operations or decisions; A centralized logging mechanism captures audit trails across multiple agents in the system.
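The capabilities above can be approximated with a small decorator around each agent call. This is a sketch only: the `audited` decorator, the `audit` logger name, and the `plan_searches` example are assumptions for illustration, not code from the reviewed project.

```python
import functools
import logging
from datetime import datetime, timezone

# One shared logger gives a single audit trail across all agents.
audit_log = logging.getLogger("audit")


def _emit(agent_id: str, operation: str, status: str) -> None:
    timestamp = datetime.now(timezone.utc).isoformat()
    audit_log.info("ts=%s agent=%s op=%s status=%s",
                   timestamp, agent_id, operation, status)


def audited(agent_id: str):
    """Log an entry before and after the wrapped async agent operation."""
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            _emit(agent_id, func.__name__, "started")
            try:
                result = await func(*args, **kwargs)
            except Exception:
                _emit(agent_id, func.__name__, "failed")
                raise
            _emit(agent_id, func.__name__, "ok")
            return result
        return wrapper
    return decorator


@audited("PlannerAgent")
async def plan_searches(query: str) -> list:
    # Hypothetical placeholder for the real planner invocation.
    return [f"{query} revenue", f"{query} risks"]
```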

Instruction Fidelity Auditing Critical

Intermediate pattern

A verification pattern that introduces a specialized auditor agent as an automated checkpoint to compare a worker agent's output against its original instructions, ensuring all constraints and goals are met before actions are finalized, thereby enforcing accountability and preventing instruction drift.

Persistent Instruction Anchoring High

Intermediate pattern

A pattern that uses semantically significant tags (e.g., <CRITICAL_INSTRUCTION>, [GOAL]) to embed high-priority goals or constraints within agent prompts, ensuring critical instructions remain salient as they are passed down hierarchical agent chains and are not lost to the LLM's 'lost in the middle' problem.
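As an illustration of the tagging idea, the sketch below extracts anchored instructions from a parent prompt and repeats them at the start and end of a child prompt. Placing the anchors at both ends is an assumption made here to counter the "lost in the middle" effect, and the helper names are hypothetical.

```python
import re

OPEN_TAG = "<CRITICAL_INSTRUCTION>"
CLOSE_TAG = "</CRITICAL_INSTRUCTION>"
_ANCHOR_RE = re.compile(
    re.escape(OPEN_TAG) + r"(.*?)" + re.escape(CLOSE_TAG), re.DOTALL
)


def anchor(instruction: str) -> str:
    """Wrap one high-priority instruction in the anchoring tag."""
    return f"{OPEN_TAG}{instruction.strip()}{CLOSE_TAG}"


def extract_anchors(prompt: str) -> list:
    """Return the text of every anchored instruction found in a prompt."""
    return [match.strip() for match in _ANCHOR_RE.findall(prompt)]


def build_child_prompt(parent_prompt: str, task: str) -> str:
    """Carry the parent's anchored instructions into a sub-agent prompt."""
    anchors = [anchor(text) for text in extract_anchors(parent_prompt)]
    if not anchors:
        return task
    block = "\n".join(anchors)
    # Repeat the anchors before and after the task so they stay salient.
    return f"{block}\n\n{task}\n\n{block}"
```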

Causal Dependency Graph High

Intermediate pattern

An auditability pattern that creates a structured, machine-readable record of an entire workflow's data and decision lineage by logging each agent's inputs, outputs, and dependencies as nodes in a traversable graph, enabling root-cause analysis, debugging, and regulatory compliance.
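A minimal in-memory version of such a lineage record might look like the following. The `StepRecord` and `LineageGraph` names and fields are illustrative assumptions; a production version would persist the graph rather than hold it in a dict.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StepRecord:
    step_id: str
    agent: str
    inputs: str
    output: str
    depends_on: List[str] = field(default_factory=list)


class LineageGraph:
    """Stores each agent step as a node keyed by step_id."""

    def __init__(self) -> None:
        self._steps: Dict[str, StepRecord] = {}

    def record(self, step: StepRecord) -> None:
        self._steps[step.step_id] = step

    def ancestors(self, step_id: str) -> List[str]:
        """Return all upstream step ids, nearest first (breadth-first)."""
        seen, order = set(), []
        queue = list(self._steps[step_id].depends_on)
        while queue:
            current = queue.pop(0)
            if current in seen:
                continue
            seen.add(current)
            order.append(current)
            queue.extend(self._steps[current].depends_on)
        return order
```

For root-cause analysis, calling `ancestors` on the step that produced a suspect claim lists every search and planning step that fed into it.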

Fractal CoT Embedding High

Advanced pattern

Structured reasoning techniques such as ReAct, Reflexion, and Tree-of-Thought that guide an agent's LLM core through iterative reasoning, self-reflection, and multi-path exploration for complex problem-solving.

Phase 2: Robustness & Fault Tolerance (currently Not Started)

Watchdog Timeout High

Basic pattern

A reliability pattern where an orchestrator wraps agent calls with a timed execution block; if the agent becomes unresponsive or hangs past the timeout period, the supervisor cancels the task and triggers a fallback, preventing silent stalls from freezing entire workflows.
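Since the pipeline already runs on asyncio, the timed execution block can be sketched with `asyncio.wait_for`, which cancels the awaited call when the deadline passes. The `with_watchdog` helper and its parameters are hypothetical names for illustration.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def with_watchdog(
    make_call: Callable[[], Awaitable[T]],
    timeout_s: float,
    fallback: T,
) -> T:
    """Run one agent call under a deadline; return the fallback on timeout."""
    try:
        # wait_for cancels the underlying call if it exceeds timeout_s.
        return await asyncio.wait_for(make_call(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return fallback
```

In the search fan-out, each search task could be wrapped this way so that one hung search yields an empty result instead of stalling the whole gather.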

Simple Retry Mechanism High

Basic pattern

A robustness pattern that resends the same request after a transient failure (for example a 503, a timeout, or a rate-limit response), waiting with exponential backoff between a bounded number of attempts. It suits failures that are likely to clear on their own; deterministic failures, where the same prompt keeps failing, call for the prompt mutation described under Adaptive Retry with Prompt Mutation.
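A minimal sketch of a plain retry for transient failures: the same call is repeated with exponential backoff, as named in Step 1 of the failure recovery chain. The `TransientError` class and `retry_with_backoff` helper are assumptions; real code would map the SDK's own timeout and HTTP errors onto the retryable case.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for retryable failures (503, timeout, rate limit)."""


async def retry_with_backoff(
    make_call: Callable[[], Awaitable[T]],
    max_attempts: int = 3,
    base_delay_s: float = 0.5,
) -> T:
    """Retry the same call, doubling the wait after each transient failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await make_call()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: let the orchestrator handle it
            await asyncio.sleep(base_delay_s * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```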

Adaptive Retry with Prompt Mutation High

Intermediate pattern

An intelligent retry mechanism that modifies the prompt after a deterministic failure by rephrasing instructions, adding few-shot examples, requesting chain-of-thought reasoning, or tightening output constraints, guiding a confused LLM out of a cognitive failure loop rather than simply resending the same failing request.
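The difference from a plain retry is that each attempt sends a changed prompt. The sketch below assumes a hypothetical `call_model` callable and an `is_valid` check supplied by the caller; the specific mutation strings are examples, not recommended wording.

```python
from typing import Callable, List

# Ordered prompt mutations: the first attempt sends the prompt unchanged.
MUTATIONS: List[Callable[[str], str]] = [
    lambda p: p,
    lambda p: p + "\nThink step by step before answering.",
    lambda p: p + "\nRespond with valid JSON only, with no extra text.",
]


def adaptive_retry(
    prompt: str,
    call_model: Callable[[str], str],
    is_valid: Callable[[str], bool],
) -> str:
    """Try each prompt mutation in turn until the output passes validation."""
    for mutate in MUTATIONS:
        output = call_model(mutate(prompt))
        if is_valid(output):
            return output
    raise ValueError("all prompt mutations failed validation")
```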

Auto-Healing Agent Resuscitation High

Intermediate pattern

A pattern for building self-healing systems where a supervisor continuously monitors worker agents through health checks and heartbeats, automatically restarting agents that crash or become unresponsive, often combined with checkpointing for stateful recovery.

Incremental Checkpointing High

Intermediate pattern

After _perform_searches completes, serialize search_results to disk (JSON or pickle). Before _write_report, check for a checkpoint file — if found, load it instead of re-searching. This is the cheapest resilience win: one file write protects against all downstream crashes. (Ch.7 p.221)
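The load-or-run check described above could be sketched as follows, using JSON. It assumes the search results are a list of summary strings, and the `load_or_run_searches` helper and `run_searches` callable are hypothetical wrappers around the existing search step.

```python
import json
from pathlib import Path
from typing import Callable, List


def load_or_run_searches(
    checkpoint: Path,
    run_searches: Callable[[], List[str]],
) -> List[str]:
    """Reuse checkpointed search results if present; otherwise search and save."""
    if checkpoint.exists():
        return json.loads(checkpoint.read_text(encoding="utf-8"))
    results = run_searches()
    # Write to a temp file first so a crash mid-write cannot leave
    # a truncated checkpoint behind.
    tmp_path = checkpoint.with_suffix(".tmp")
    tmp_path.write_text(json.dumps(results), encoding="utf-8")
    tmp_path.replace(checkpoint)
    return results
```

JSON is used here rather than pickle because the file stays human-readable and is safe to load from an untrusted location.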

Fallback Model Invocation High

Intermediate pattern

A resilience pattern providing automatic runtime switching from a failing primary LLM to a reliable backup model when the primary experiences outages, performance degradation, or produces invalid outputs, ensuring continuous service availability through graceful degradation.
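A sketch of the runtime switch: models are tried in priority order, and an exception or an invalid output moves on to the next one. The `invoke_with_fallback` helper and the model callables are hypothetical; the default validity check (non-empty output) is an assumption.

```python
from typing import Callable, List, Tuple

ModelCall = Callable[[str], str]


def invoke_with_fallback(
    prompt: str,
    models: List[Tuple[str, ModelCall]],
    is_valid: Callable[[str], bool] = lambda out: bool(out.strip()),
) -> Tuple[str, str]:
    """Return (model_name, output) from the first model that succeeds."""
    errors = []
    for name, call in models:
        try:
            output = call(prompt)
        except Exception as exc:  # outage, timeout, etc.
            errors.append(f"{name}: {exc}")
            continue
        if is_valid(output):
            return name, output
        errors.append(f"{name}: invalid output")
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

Returning the model name alongside the output lets the audit trail record when a degraded model produced part of a report.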

Rate-Limited Invocation High

Intermediate pattern

A defensive pattern that wraps critical agent tool calls with a rate limiter tracking request timestamps to control call frequency within a time window, preventing agents from overwhelming external APIs or shared services, avoiding service denials, and enabling predictable cost management.
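The timestamp-tracking limiter can be sketched as a sliding window over a deque. The class name and the injectable `clock` parameter are assumptions added for illustration and testability.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most max_calls within any window_s-second window."""

    def __init__(
        self,
        max_calls: int,
        window_s: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_calls = max_calls
        self.window_s = window_s
        self._clock = clock
        self._stamps: Deque[float] = deque()

    def try_acquire(self) -> bool:
        """Record a call and return True if it fits in the current window."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_s:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            self._stamps.append(now)
            return True
        return False
```

A search tool wrapper would call `try_acquire()` before each web search and wait or skip when it returns False.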

Majority Voting Across Agents High

Advanced pattern

A validation pattern using three or more independent agents to perform the same task in parallel, with an orchestrator tallying their outputs via majority vote; if a clear majority exists the result is finalized, otherwise the system escalates for human review, providing extreme reliability for high-stakes decisions.
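The tally step reduces to a strict-majority count. In this sketch, a return value of `None` stands for "no clear majority, escalate for human review"; the `majority_vote` name and the use of exact string equality to compare outputs are simplifying assumptions.

```python
from collections import Counter
from typing import List, Optional


def majority_vote(outputs: List[str]) -> Optional[str]:
    """Return the output chosen by a strict majority, or None to escalate."""
    if not outputs:
        return None
    winner, count = Counter(outputs).most_common(1)[0]
    return winner if count > len(outputs) / 2 else None
```

Free-text outputs rarely match exactly, so in practice the vote would run over a normalized field such as a rating or a yes/no verdict.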

Trust Decay & Scoring High

Advanced pattern

A pattern implementing dynamic trust scores for each worker agent, updated based on task outcomes (success increases score, failure decreases it) with gradual decay, enabling orchestrators to self-optimize by routing tasks to the most reliable agents and gracefully sidelining degrading ones.
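One way to sketch the scoring rules: outcomes move a score up or down by a fixed step, and periodic decay pulls every score back toward the neutral starting value so old history fades. The class name, step size, and decay factor below are illustrative assumptions.

```python
from typing import Dict, List


class TrustTracker:
    """Per-agent trust scores in [0, 1] with gradual decay toward neutral."""

    def __init__(self, initial: float = 0.5, step: float = 0.1,
                 decay: float = 0.9) -> None:
        self.initial = initial
        self.step = step
        self.decay = decay
        self.scores: Dict[str, float] = {}

    def record(self, agent: str, success: bool) -> None:
        """Raise the score on success, lower it on failure, clamp to [0, 1]."""
        score = self.scores.get(agent, self.initial)
        score += self.step if success else -self.step
        self.scores[agent] = min(1.0, max(0.0, score))

    def decay_all(self) -> None:
        """Shrink each score's distance from the neutral value."""
        for agent, score in self.scores.items():
            self.scores[agent] = self.initial + (score - self.initial) * self.decay

    def best(self, candidates: List[str]) -> str:
        """Pick the candidate with the highest current trust score."""
        return max(candidates, key=lambda a: self.scores.get(a, self.initial))
```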

Canary Agent Testing High

Advanced pattern

A deployment safety pattern (also known as Shadow Mode Deployment) that deploys a new agent version alongside the stable version, routing live traffic to the stable agent while simultaneously sending copies of requests to the canary for background testing, enabling data-driven validation of updates without impacting production users.

Phase 3: System-Level Infrastructure (currently Not Started)

Agent Authentication & Authorization Critical

Basic pattern

Security measures ensuring that model access, API keys, and tool credentials are managed securely with least privilege principles and regular rotation to prevent unauthorized access.

Tool & Agent Registry Critical

Intermediate pattern

A centralized registry service acting as a dynamic 'yellow pages' for an agentic ecosystem, where tools and agents register their capabilities, endpoints, and schemas so that other agents can discover and invoke them at runtime without hardcoded knowledge.

Event-Driven Reactivity High

Intermediate pattern

The capability of agents to perceive their operational environment and respond to changes or events within it, a key characteristic distinguishing true AI agents from basic LLM interactions.

Phase 4: Continuous Improvement (currently Not Started)

Hybrid Workflow Agent Architecture (Planner + Scorer) High

Advanced pattern

Key capabilities: A planner agent generates workflow plans or task sequences that are passed to other components; A scorer agent evaluates or rates the quality of generated workflows or plans; The system contains feedback loops where scorer outputs influence planner behavior or workflow refinement.

Preference-Controlled Synthetic Data Generation High

Advanced pattern

Key capabilities: Synthetic data generation functions include explicit preference or control parameters in their function signatures; Data generation processes have configured offline evaluation benchmarks or validation datasets; Human-in-the-loop validation checkpoints are implemented with manual approval gates or review workflows.

Custom Evaluation Metrics High

Advanced pattern

A pattern for codifying domain-specific quality criteria into automated, repeatable scoring functions (such as STEPScore) that measure dimensions like factual correctness, logical step ordering, and business rule adherence, going far beyond generic NLP metrics like BLEU or BERTScore.

Phase 5: Human-Agent Interaction (currently Emerging)

Human Calls Agent High

Basic pattern

A foundational interaction pattern structuring a direct, transactional request-response cycle where the user provides a specific query or command and the agent quickly classifies intent, selects the best tool, executes it, and returns a concise result.

Agent Calls Human High

Basic pattern

A human-agent interaction pattern in which an agent autonomously pauses its operation to escalate a decision to a human expert when its confidence falls below a threshold, when ambiguity is detected, or when company policy mandates human approval for high-stakes actions.
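The escalation rule can be sketched as a simple gate in front of any agent decision. The `Decision` type, the 0.8 threshold, and the `ask_human` callable are hypothetical; ambiguity detection is omitted and would need its own signal.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Decision:
    action: str
    confidence: float          # agent's self-reported confidence, 0..1
    high_stakes: bool = False  # policy flag: always needs human approval


def needs_human(decision: Decision, threshold: float = 0.8) -> bool:
    """Escalate on low confidence or when policy mandates approval."""
    return decision.high_stakes or decision.confidence < threshold


def resolve(
    decision: Decision,
    ask_human: Callable[[Decision], str],
    threshold: float = 0.8,
) -> str:
    """Return the agent's action, or the human's answer when escalated."""
    if needs_human(decision, threshold):
        return ask_human(decision)
    return decision.action
```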

Agent Calls Proxy Agent Critical

Intermediate pattern

A security pattern introducing a specialized intermediary proxy agent that acts as a secure gateway to external systems, centralizing credentials and API logic so that primary agents never handle sensitive external access directly.

Phase 6: Agent-Level Capabilities (currently Emerging)

Agent-Specific Memory (Short-Term) High

Basic pattern

Key capabilities: The agent maintains a conversation_history or session_history data structure that persists across multiple interactions; The agent implements memory summarization or context window management to handle growing conversation state; Agent responses incorporate or reference information from previous turns in the same session.

Context-Aware Retrieval (Simple RAG) High

Basic pattern

Key capabilities: The agent retrieves external documents or data before generating responses using similarity search or database queries; Retrieved context is explicitly inserted into the LLM prompt alongside the user query; The system includes a vector database or knowledge base that stores preprocessed document chunks or embeddings.

Advanced RAG High

Intermediate pattern

Key capabilities: The RAG pipeline includes a re-ranking component that reorders retrieved documents before generation; Query transformation or reformulation occurs before the initial retrieval step; Multiple retrieval iterations are performed with refined queries based on initial results.

Agentic RAG & Graph-Vector Hybrid Retrieval High

Advanced pattern

Key capabilities: The system maintains and queries a knowledge graph database alongside vector embeddings for retrieval; Retrieval queries combine both semantic vector search and graph traversal operations; The agent can dynamically build or update knowledge graph relationships based on retrieved information.

Phase 7: Coordination & Planning (currently Established)

Hybrid Delegation Framework High

Intermediate pattern

A market-based agent composition topology implementing a Contract-Net Protocol where a solicitor broadcasts task announcements to bidder agents, who respond with formal bids containing capability, cost, ETA, and confidence scores; the solicitor then awards the task to the agent with the highest utility score for dynamic, runtime task assignment.
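The award step of the protocol can be illustrated with a bid record and a utility function. The bid fields follow the description above, but the utility formula and its weights are assumptions made for this sketch; the source does not specify how utility is computed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bid:
    agent: str
    capability: float   # 0..1 fit for the announced task
    cost: float         # estimated cost in arbitrary units
    eta_s: float        # estimated completion time in seconds
    confidence: float   # 0..1 bidder's confidence in its estimate


def utility(bid: Bid, cost_weight: float = 0.01,
            eta_weight: float = 0.001) -> float:
    """Assumed scoring: reward capable, confident bids; penalize cost and delay."""
    return (bid.capability * bid.confidence
            - cost_weight * bid.cost
            - eta_weight * bid.eta_s)


def award(bids: List[Bid]) -> str:
    """Return the name of the agent whose bid has the highest utility."""
    if not bids:
        raise ValueError("no bids received")
    return max(bids, key=utility).agent
```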

Shared Epistemic Memory High

Intermediate pattern

Replace the flat string concatenation in manager.py:126 with a typed SharedMemory class backed by Redis or in-memory dict. Expose via typed tools like get_search_results(topic) rather than passing raw strings. Add TTL timestamps so agents can assess data freshness. (Ch.6 p.197-203)
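A minimal in-memory sketch of that recommendation follows. Only the `SharedMemory` name and `get_search_results(topic)` come from the text above; the `SearchResult` fields, `put_search_result`, the default TTL, and the injectable clock are assumptions. A Redis-backed version would keep the same interface.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class SearchResult:
    topic: str
    summary: str
    stored_at: float    # lets agents judge freshness
    expires_at: float   # stored_at + TTL


class SharedMemory:
    """Typed, TTL-aware store shared by the agents in one pipeline run."""

    def __init__(self, ttl_s: float = 900.0,
                 clock: Callable[[], float] = time.time) -> None:
        self._ttl_s = ttl_s
        self._clock = clock
        self._entries: Dict[str, SearchResult] = {}

    def put_search_result(self, topic: str, summary: str) -> None:
        now = self._clock()
        self._entries[topic] = SearchResult(topic, summary, now, now + self._ttl_s)

    def get_search_results(self, topic: str) -> Optional[SearchResult]:
        """Return the entry for a topic, or None if missing or expired."""
        entry = self._entries.get(topic)
        if entry is None or self._clock() >= entry.expires_at:
            return None
        return entry
```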

Consensus, Negotiation & Conflict Resolution High

Advanced pattern

A pattern where agents with conflicting data engage in iterative debate to converge on a shared understanding in advanced multi-agent systems.

Failure Recovery Chain (Ch. 7)

The recommended 5-step recovery chain. 0 of 5 steps covered.

0% Chain Coverage

| Step | Pattern | Status | Action |
| --- | --- | --- | --- |
| 1 | Simple Retry | Missing | Retry with exponential backoff |
| 2 | Auto-Healing Agent Resuscitation | Missing | Automatically restart the agent process |
| 3 | Fallback Model Invocation | Missing | Switch to fallback model or redundant agent |
| 4 | Delayed Escalation / Agent Calls Human | Missing | Escalate to human operator with full context |
| 5 | Watchdog Timeout Supervisor | Missing | Timeout terminates unresponsive agent, alerts system |

Stress Test: Resilience Considerations

Concrete scenarios illustrating how strengthening partial patterns protects the system.

CRITICAL Without Simple Retry Mechanism → Transient API failure (503, timeout, rate limit)
1. API returns 503 (manager.py:106)
2. No retry logic (manager.py:108)
3. Exception propagates
4. Orchestrator receives unhandled error
5. Pipeline stops
6. User sees error; task requires manual restart

Recovery: Implement Simple Retry Mechanism (target: Recovery rate (%); Ch. 7, Table 7.2)

Ch. 7 — Adaptive Retry
CRITICAL Without Instruction Fidelity Auditing → Agent deviates from system instructions or policy
1. Agent drifts from safety guardrails
2. Produces non-compliant output
3. Drift goes undetected until production review
4. Potential regulatory or reputational risk

Recovery: Implement Instruction Fidelity Auditing (Ch. 6)

Ch. 6 — Instruction Fidelity Auditing
WARNING Without Shared Epistemic Memory → Agent B needs context from Agent A's earlier work
1. Agent B repeats Agent A's work (manager.py:126)
2. Duplicated effort and token usage
3. Potential inconsistencies between agents
4. Output may lack coherence

Recovery: Implement Shared Epistemic Memory (Ch. 5)

Ch. 5 — Shared Epistemic Memory
WARNING Without Event-Driven Reactivity → System state changes that require immediate response
1. Important system event occurs
2. No event bus to propagate signal
3. Agents continue normal operation
4. Delayed awareness and response

Recovery: Implement Event-Driven Reactivity (Ch. 10)

Ch. 10 — Event-Driven Reactivity
INFO Without Fractal CoT Embedding → Agent produces incorrect reasoning on first attempt
1. Agent generates initial analysis (manager.py:49)
2. No verification step (manager.py:62)
3. Unreviewed output passed as final
4. Downstream decisions rely on unverified results

Recovery: Implement Fractal CoT Embedding (target: Self-correction trigger rate / reduction in final errors; Ch. 9, Table 9.3)

Ch. 9 — Structured Reasoning