Introduction
The system accepts data from receipts, bank statements, and third-party applications, then produces a consistent internal representation that supports analysis and advisory queries: the kinds of questions people usually answer through manual reconciliation or with the help of a human advisor. The full project is available at https://github.com/capabl-ai/financial_advisor.
The Ingestion Layer
The foundation of the system is the ingestion layer. Financial data in practice is inconsistent by default. Receipts are poorly printed, bank exports vary across institutions, and APIs encode business logic that does not map cleanly to accounting semantics. From the start, the ingestion layer assumes three things:
- Input formats are unreliable.
- AI output is probabilistic.
- Every data point must remain traceable to its source.
All ingestion paths converge into a single normalized transaction schema backed by SQL. No source is trusted enough to bypass validation.
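That shared target can be sketched as a single dataclass; the field names here are illustrative, not the project's exact model:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Sketch of the canonical schema every ingestion path converges on.
# Field names are assumptions for illustration, not the exact model.
@dataclass
class NormalizedTransaction:
    amount: float
    transaction_type: str         # "expense" or "income"
    transaction_date: datetime
    description: str
    category: str
    external_source: str          # "receipt", "bank_csv", "splitwise", ...
    external_id: Optional[str] = None  # provenance: id in the source system
```

Keeping provenance fields (`external_source`, `external_id`) in the schema itself is what makes every downstream number traceable to its origin.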
Visual Bill Ingestion
Visual bill ingestion is designed for convenience without compromising correctness. Traditional OCR produces text, which still requires downstream parsing and fragile heuristics. Instead, a multimodal model extracts structured fields directly from receipt images. The prompt strictly constrains the output to a predefined schema and rejects explanatory text entirely.
def generate_structured_bill(image, mime_type):
    prompt = """
    You are an expert financial document analyzer.
    Extract the following fields from the bill image provided:
    - amount (numeric)
    - transaction_type (expense or income)
    - transaction_date (ISO 8601 timestamp)
    - description (short summary)
    - category (food, travel, utilities)
    Return only a JSON object with these keys.
    """
    response = gemini.generate_with_image(prompt, image, mime_type)
    return response
Staging and Validation
Extraction quality depends on image clarity, merchant formatting, and layout. To prevent silent ledger corruption, extracted data never enters the transaction table directly. Every bill first lands in a staging table where validation checks ensure type correctness and format consistency.
try:
    raw_json_string = generate_structured_bill(image_bytes, mime_type)
    structured_data = json.loads(raw_json_string)
except Exception as e:  # covers JSONDecodeError and extraction failures
    pending_bill = PendingBill(
        user_id=current_user.id,
        image_path=image_filename,
        filename=file.filename,
        error_message=str(e),
        extracted_data=None
    )
    db.add(pending_bill)
    return {"message": "Bill uploaded for manual review"}
This staging approach creates a hard boundary between probabilistic extraction and deterministic accounting. Human intervention is part of the expected workflow rather than an exception.
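The gate between staging and the ledger can be sketched as a hypothetical `promote_if_valid` helper; the field names follow the extraction schema above, but the helper itself is illustrative:

```python
from datetime import datetime

REQUIRED_FIELDS = {"amount", "transaction_type",
                   "transaction_date", "description", "category"}

def promote_if_valid(extracted):
    """Return True only if a staged bill passes type and format
    checks; anything suspect stays in staging for human review."""
    if not REQUIRED_FIELDS <= set(extracted):
        return False
    try:
        float(extracted["amount"])                          # numeric amount
        datetime.fromisoformat(extracted["transaction_date"])  # ISO 8601 date
    except (TypeError, ValueError):
        return False
    return extracted["transaction_type"] in ("expense", "income")
```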
Third-Party and Bank Integrations
Expense-Sharing Platforms
Platforms such as Splitwise represent social obligations, not financial reality. A direct import would misclassify user behavior. When a user pays for a shared expense, only their portion represents personal spending; the remainder is a receivable. The ingestion logic separates three concepts: cash outflow, personal expense, and settlement balance.
user_paid_share = 0
user_net_balance = 0
for user in expense.users:
    if user.id == splitwise_user_id:
        user_paid_share = float(user.paid_share)
        user_net_balance = float(user.net_balance)
        break
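The three concepts can then be separated with a small helper; `paid_share` and `owed_share` mirror Splitwise's field names, but the function itself is an illustrative sketch:

```python
def classify_shared_expense(paid_share, owed_share):
    """Separate one shared expense into the three quantities
    the ledger tracks independently."""
    cash_outflow = paid_share              # what actually left the account
    personal_expense = owed_share          # the user's own share of the cost
    receivable = paid_share - owed_share   # owed back by (or to) others
    return cash_outflow, personal_expense, receivable
```

Paying a 90-unit bill split three ways yields a 90 cash outflow, a 30 personal expense, and a 60 receivable; only the 30 should ever appear as spending.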
Bank Statement Normalization
Bank statements introduce variability in format, column names, and debit semantics. The parser resolves this through a configurable alias system that maps headers onto canonical fields. Once normalized, transactions are enriched through asynchronous categorization, so the user can access their data immediately while categories are filled in the background.
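A minimal version of that alias system might look like the following; the alias table is illustrative and would be extended per institution:

```python
# Illustrative alias table; real deployments extend it per bank.
HEADER_ALIASES = {
    "amount": {"amount", "txn amount", "transaction amount", "amt"},
    "date": {"date", "txn date", "value date", "transaction date"},
    "description": {"description", "narration", "particulars", "details"},
    "debit": {"debit", "withdrawal", "dr"},
    "credit": {"credit", "deposit", "cr"},
}

def canonicalize_headers(headers):
    """Map a bank export's raw column names onto canonical fields."""
    mapping = {}
    for raw in headers:
        key = raw.strip().lower()
        for canonical, aliases in HEADER_ALIASES.items():
            if key in aliases:
                mapping[raw] = canonical
                break
    return mapping
```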
def infer_transaction_categories(descriptions):
    prompt = f"""
    You are a financial transaction categorizer.
    Rules for Indian transactions:
    - Swiggy = food
    - IRCTC = travel
    - Zepto = food
    Transaction descriptions:
    {json.dumps(descriptions)}
    """
Reasoning and Semantic Retrieval
Once transactions are normalized, the system shifts to reasoning. To support context beyond arithmetic, curated financial writing is indexed and embedded for retrieval. Source material includes books and commentary from established investors, chunked and embedded into a persistent vector store.
User transactions are embedded into a separate collection, allowing for semantic queries like "impulse purchases." This retrieves relevant transactions even without explicit tags, replacing rigid labeling with natural analysis.
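Under the hood this reduces to nearest-neighbour search over the embeddings. A dependency-free sketch, assuming some embedding model has already produced the vectors:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def semantic_search(query_vec, txn_vectors, top_k=3):
    """Rank transactions by similarity to the query embedding.
    txn_vectors: list of (transaction_id, vector) pairs produced
    by whatever embedding model backs the vector store."""
    scored = sorted(
        txn_vectors,
        key=lambda pair: cosine(query_vec, pair[1]),
        reverse=True,
    )
    return [txn_id for txn_id, _ in scored[:top_k]]
```

A production vector store performs the same ranking with an approximate index rather than a linear scan, but the retrieval contract is identical.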
Tool Execution and Safety
Language models cannot compute balances reliably. Deterministic tools bridge this gap, each encapsulating a single responsibility with a strict interface. Database connections and user identifiers are injected during tool initialization, not passed through prompts.
tools = [
    AggregateSpendByCategoryTool(db=db, user_id=user_id),
    SQLQueryTool(db=db, user_id=user_id)
]
Constrained SQL Tooling
For flexible querying, the system uses a constrained SQL tool. The database role is read-only, and the tool enforces user scoping and SELECT-only execution.
class SQLQueryTool(BaseTool):
    name = "sql_query"
    description = "Execute a read-only SELECT query on user transactions."

    async def _arun(self, input_dict):
        sql = input_dict.get("sql", "").strip()
        # Enforce SELECT-only execution before anything else runs.
        if not sql.upper().startswith("SELECT"):
            return "error: only SELECT statements are allowed"
        # Inject user scoping so the model can never query other users.
        if "WHERE" in sql.upper():
            sql = sql.replace(
                "WHERE",
                f"WHERE user_id = {self.user_id} AND ",
                1
            )
        else:
            sql = sql.replace(
                "FROM transactions",
                f"FROM transactions WHERE user_id = {self.user_id}",
                1
            )
        result = await self.db.execute(text(sql))
        return f"query results:\n{result.fetchall()}"
Multi-Step Reasoning with LangGraph
Multi-step reasoning is handled through a state machine. Using LangGraph, the agent alternates between decision and execution until sufficient information is gathered, allowing for complex aggregation and contextual interpretation.
graph = StateGraph(AgentState)
graph.add_node("model", call_model)
graph.add_node("tools", call_tools)

def should_continue(state):
    last_message = state["messages"][-1]
    if hasattr(last_message, "tool_calls") and last_message.tool_calls:
        return "tools"
    return END

graph.add_conditional_edges("model", should_continue)
graph.add_edge("tools", "model")
graph.set_entry_point("model")
app = graph.compile()
Data Architecture
The data layer preserves provenance by storing all transactions in a single table with metadata describing their origin.
class Transaction(Base):
    __tablename__ = "transactions"
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, index=True)
    amount = Column(Float)
    date = Column(DateTime)
    category = Column(String)
    external_id = Column(String, index=True)  # id in the source system
    external_source = Column(String)          # "receipt", "bank_csv", ...
Goals are modeled independently from transactions. A contribution table links the two, allowing a single transaction to fund multiple goals without distorting balances. Relational (PostgreSQL) and vector storage operate in parallel to ensure both arithmetic correctness and semantic retrieval.
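The invariant that contributions must sum exactly back to the funding transaction's amount can be sketched with a hypothetical allocator:

```python
def allocate_contributions(amount, goal_weights):
    """Split one transaction's amount across goals by weight.
    The returned contributions always sum back to the original
    amount, so a transaction funding multiple goals never
    distorts balances. Assumes a non-empty goal_weights dict."""
    total = sum(goal_weights.values())
    alloc = {g: round(amount * w / total, 2)
             for g, w in goal_weights.items()}
    # Assign any rounding remainder to the first goal so totals reconcile.
    drift = round(amount - sum(alloc.values()), 2)
    first = next(iter(alloc))
    alloc[first] = round(alloc[first] + drift, 2)
    return alloc
```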
Conclusion
The system was built around a single constraint: financial advice must be explainable and reversible. Every number can be traced to its source, and every decision can be recomputed. By relying on structure, validation, and explicit boundaries rather than just prompts, the system meets the minimum standard required for software that interacts with money.