This document outlines the design of the vector database for the "MyOneTrueAllyPrototype" project.
The core purpose is to enable advanced semantic search for the project's key features, allowing the AI to act as a "true ally." By vectorizing the user's personalized data stored in the `Memory` table and in list items (`EntryItem`), we will provide the foundation for Gemini 2.5 Flash to generate highly personalized and context-aware responses.
By integrating a vector database with our relational database, we will build a robust system that enables the AI to accurately understand past user context, preferences, and specific data points (e.g., locations, categories).
The core configuration of the vector store is as follows:

- Embedding model: `text-embedding-004`
- Index name: `my-one-true-ally`
- Similarity metric: `cosine`
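For reference, creating an index with this configuration via the Pinecone Python client could look like the minimal sketch below. The 768 dimension matches the default output size of `text-embedding-004`; the API key and the serverless cloud/region values are placeholder assumptions, not decisions made in this document.

```python
from pinecone import Pinecone, ServerlessSpec

# Placeholder API key; the real key would come from the project's secret store.
pc = Pinecone(api_key="PINECONE_API_KEY")

# text-embedding-004 returns 768-dimensional vectors by default,
# and this design specifies cosine similarity.
pc.create_index(
    name="my-one-true-ally",
    dimension=768,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),  # assumed serverless deployment
)
```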
The chunking strategy for each data source is as follows:

- `EntryItem`: No chunking.
- `Memory`: This table stores AI-generated summaries and user-provided, manually registered data. `RecursiveCharacterTextSplitter` will be used to chunk excessively long summaries.

We will use the `text-embedding-004` model to vectorize the user's input.
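To make this pipeline concrete, here is a minimal ingestion sketch assuming LangChain's `RecursiveCharacterTextSplitter`, the `google-generativeai` SDK, and the Pinecone Python client. The chunk sizes, the `upsert_memory` helper, and the `memory_id` metadata field are illustrative assumptions rather than finalized design decisions.

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter
import google.generativeai as genai
from pinecone import Pinecone

genai.configure(api_key="GOOGLE_API_KEY")                    # placeholder credentials
index = Pinecone(api_key="PINECONE_API_KEY").Index("my-one-true-ally")

# Chunk sizes are illustrative; they would be tuned against real Memory summaries.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)

def upsert_memory(memory_id: str, summary: str) -> None:
    """Split an overly long Memory summary, embed each chunk, and upsert it."""
    chunks = splitter.split_text(summary)
    vectors = []
    for i, chunk in enumerate(chunks):
        result = genai.embed_content(
            model="models/text-embedding-004",
            content=chunk,
            task_type="retrieval_document",
        )
        vectors.append({
            "id": f"memory-{memory_id}-{i}",
            "values": result["embedding"],
            # Keep the relational primary key so results can be joined back
            # to the Memory table.
            "metadata": {"memory_id": memory_id, "chunk": i, "source": "Memory"},
        })
    index.upsert(vectors=vectors)
```

Storing the relational primary key in the vector metadata is what allows search hits to be joined back to the corresponding `Memory` or `EntryItem` rows.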
In a vector search system, response speed and cost are critical factors. We will proceed with the design while keeping the following points in mind:
- Index Growth: As the data (`Memory` and `EntryItem`) grows, the index size will increase. We will manage Pinecone's index settings appropriately to prevent a decline in search speed.
- `top-k` Tuning: We will adjust the number of results to retrieve (`top-k`) to strike a balance between response quality and speed. By narrowing down the most relevant results, we can reduce the number of input tokens sent to the LLM, optimizing both response speed and cost (see the query sketch after this list).
- Embedding Cost: The `text-embedding-004` model is priced based on the number of tokens. We need to be mindful of costs, especially when vectorizing long `Memory` texts.
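A minimal retrieval sketch follows, assuming the same `google-generativeai` and Pinecone clients as in the ingestion sketch; the default `top_k` of 5 and the `search_memories` helper name are illustrative starting points for the tuning described above.

```python
import google.generativeai as genai
from pinecone import Pinecone

genai.configure(api_key="GOOGLE_API_KEY")                    # placeholder credentials
index = Pinecone(api_key="PINECONE_API_KEY").Index("my-one-true-ally")

def search_memories(query: str, top_k: int = 5):
    """Embed the user's input and retrieve the top-k most similar chunks."""
    embedded = genai.embed_content(
        model="models/text-embedding-004",
        content=query,
        task_type="retrieval_query",   # query-side task type
    )
    # include_metadata=True returns the relational references needed to join
    # results back to the Memory and EntryItem tables.
    return index.query(
        vector=embedded["embedding"],
        top_k=top_k,
        include_metadata=True,
    )
```

Only the retrieved chunks and their metadata are passed on to Gemini 2.5 Flash, which is what keeps the LLM's input token count bounded.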
The vector database system's correctness and speed are crucial. We will perform tests based on the following strategy:
- We will verify that calls to the `text-embedding-004` model are successful and return vectors of the expected dimensions. We will also validate that data is written to Pinecone correctly.
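As an illustration of this strategy, a first pass at these checks might look like the following sketch. The use of pytest-style test functions is an assumption, not a project decision, and the 768-dimension assertion reflects the default output size of `text-embedding-004`.

```python
import google.generativeai as genai
from pinecone import Pinecone

genai.configure(api_key="GOOGLE_API_KEY")                    # placeholder credentials
index = Pinecone(api_key="PINECONE_API_KEY").Index("my-one-true-ally")

def test_embedding_returns_expected_dimension():
    # text-embedding-004 is expected to return 768-dimensional vectors by default.
    result = genai.embed_content(
        model="models/text-embedding-004",
        content="smoke-test input",
    )
    assert len(result["embedding"]) == 768

def test_upsert_is_acknowledged():
    # Confirm that a write to the index is accepted; a non-zero vector is used
    # because cosine similarity is undefined for all-zero vectors.
    dummy = {"id": "test-vector", "values": [0.1] * 768, "metadata": {"source": "test"}}
    response = index.upsert(vectors=[dummy])
    assert response.upserted_count == 1
```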