Overview

This project is a Retrieval-Augmented Generation (RAG) agent that retrieves context from company reports, peer disclosures, and the Greenhouse Gas Protocol, then uses Gemini 2.5 Flash to generate sustainability insights grounded in that context.

Technical Solution

The AI Agent is made up of three major components:

Agent Workflow


Architecture decisions

Key observations about the data

Some key observations based on the datasets provided:

PDF and CSV integration into the LLM context

The PDFs and CSVs were split into chunks, converted into embeddings, and stored in Pinecone. At query time, the agent retrieves only the top_k most relevant chunks and passes them to the LLM as context. This approach has three advantages: (1) it handles multiple documents, (2) it reduces latency, and (3) it lowers token cost.
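
The top_k retrieval step can be illustrated with a minimal in-memory sketch. This is not the project's code and does not call the Pinecone client; it assumes cosine similarity as the ranking metric and uses a plain Python list as a stand-in for the vector index. The function names (`cosine_similarity`, `retrieve_top_k`), the three-dimensional vectors, and the chunk labels are all hypothetical placeholders chosen for illustration.

```python
import math
from typing import List, Sequence, Tuple

# Each index entry pairs a chunk label with its embedding vector.
# Real embeddings have hundreds of dimensions; these toy 3-d vectors
# are invented purely for illustration.
IndexEntry = Tuple[str, Sequence[float]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def retrieve_top_k(
    query_vec: Sequence[float],
    index: List[IndexEntry],
    top_k: int = 3,
) -> List[Tuple[float, str]]:
    """Score every chunk against the query and keep the top_k best."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical index: one placeholder chunk per source type.
toy_index: List[IndexEntry] = [
    ("chunk from a company report (PDF)", [0.9, 0.1, 0.0]),
    ("chunk from a peer disclosure (PDF)", [0.2, 0.9, 0.1]),
    ("chunk from the Greenhouse Gas Protocol (PDF)", [0.1, 0.2, 0.9]),
    ("row group from a dataset (CSV)", [0.3, 0.8, 0.2]),
]

# Hypothetical query embedding; only the two closest chunks become context.
toy_query = [0.1, 0.9, 0.1]
context_chunks = [text for _, text in retrieve_top_k(toy_query, toy_index, top_k=2)]
print(context_chunks)
```

Because only `top_k` chunks reach the prompt, the context size stays bounded no matter how many documents are indexed, which is where the latency and token-cost savings come from. A hosted vector store such as Pinecone performs the same ranking server-side with an approximate nearest-neighbor index rather than this exhaustive scan.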