This document outlines the design for the backend API of the "MyOneTrueAllyPrototype" project.
The API will define the communication between the frontend (Next.js) and the backend (Next.js API Routes). It will handle user input, retrieve data from Vercel Postgres and Pinecone, and integrate with Gemini 2.5 Flash to generate responses.
This API will serve as the critical foundation for the project's core RAG (Retrieval-Augmented Generation) workflow, enabling the delivery of personalized and context-aware responses to users.
To ensure efficient utilization and stable performance of the API, please consider the following points.
For endpoints that retrieve lists of data, such as a user's memories or histories (/users/me/memories, /users/me/ad_histories, etc.), pagination is applied to prevent large data downloads and reduce server load. Use the page and limit query parameters to control the range of data retrieved.
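As a minimal sketch of how a client might build a paginated request, the helper below appends page and limit to a list endpoint path. The function name buildPagedUrl is hypothetical, and the sketch assumes that page is 1-based and that both values must be positive integers; this document does not specify either.

```typescript
// Hypothetical helper (not part of the API): builds a paginated URL.
// Assumption: `page` is 1-based and `limit` is a positive integer.
function buildPagedUrl(path: string, page: number, limit: number): string {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError("page must be a positive integer");
  }
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  // Both values are validated integers, so no extra URL encoding is needed.
  return `${path}?page=${page}&limit=${limit}`;
}

// Example: request the second page of memories, 20 items per page.
const memoriesUrl = buildPagedUrl("/users/me/memories", 2, 20);
```

Validating on the client avoids sending requests that the server would presumably reject.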
To minimize data transfer size, list retrieval endpoints (GET /entry_lists/{listId}/items) support the fields query parameter. This allows you to explicitly specify which fields to include in the response. By omitting data-heavy fields like content, you can significantly improve performance.
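The following sketch shows one way a client could request only selected fields. It assumes the fields parameter takes a comma-separated list, which this document does not confirm; the function name buildFieldsUrl and the field names id and title are hypothetical.

```typescript
// Hypothetical helper (not part of the API): builds a URL that limits
// the response to the given fields.
// Assumption: `fields` is sent as a comma-separated list.
function buildFieldsUrl(listId: string, fields: readonly string[]): string {
  const base = `/entry_lists/${encodeURIComponent(listId)}/items`;
  if (fields.length === 0) {
    return base; // No selection: the server returns its default fields.
  }
  const value = fields.map((f) => encodeURIComponent(f)).join(",");
  return `${base}?fields=${value}`;
}

// Example: fetch lightweight rows and leave out the heavy `content` field.
const itemsUrl = buildFieldsUrl("abc", ["id", "title"]);
```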
The server may include cache-control headers (Cache-Control, ETag, etc.) in responses to improve performance. Clients can leverage these headers to reduce unnecessary requests and enhance application responsiveness.
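One common way to use ETag is a conditional request: the client sends the stored value back in the standard If-None-Match header and reuses its cached body when the server answers 304 Not Modified. The sketch below illustrates that bookkeeping under the assumption that the server honors If-None-Match, which this document does not state. The EtagCache class and its method names are hypothetical, and the actual HTTP call is left to the caller.

```typescript
interface CachedEntry<T> {
  etag: string;
  body: T;
}

// Hypothetical client-side cache keyed by URL (not part of the API).
class EtagCache<T> {
  private entries = new Map<string, CachedEntry<T>>();

  // Headers to attach to the next GET for this URL.
  conditionalHeaders(url: string): Record<string, string> {
    const entry = this.entries.get(url);
    return entry ? { "If-None-Match": entry.etag } : {};
  }

  // Reconcile a response with the cache and return the body to use.
  resolve(url: string, status: number, etag: string | null, body: T | null): T {
    if (status === 304) {
      const entry = this.entries.get(url);
      if (!entry) throw new Error("304 received without a cached entry");
      return entry.body; // Not modified: reuse the cached body.
    }
    if (body === null) {
      throw new Error(`unexpected empty body for status ${status}`);
    }
    if (etag !== null) {
      this.entries.set(url, { etag, body }); // Remember the new version.
    }
    return body;
  }
}
```

A caller would pass conditionalHeaders(url) to its HTTP client, then hand the response status, ETag header, and parsed body to resolve.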
To maintain server stability and ensure fair usage, rate limiting may be applied to some endpoints. If the limit is exceeded, a 429 Too Many Requests response will be returned. Clients should check the Retry-After header to determine the appropriate time to wait before making the next request.
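The sketch below shows how a client might turn a Retry-After value into a wait time. In HTTP the header may carry either a number of seconds or an HTTP date, so both forms are handled; which form this API sends is not specified here. The function name retryDelayMs and the 1000 ms fallback are assumptions made for illustration.

```typescript
// Hypothetical helper (not part of the API): converts a Retry-After
// header value into a delay in milliseconds.
// Assumption: fall back to `fallbackMs` when the header is missing or invalid.
function retryDelayMs(
  retryAfter: string | null,
  nowMs: number,
  fallbackMs: number = 1000
): number {
  if (retryAfter === null) return fallbackMs;
  const trimmed = retryAfter.trim();
  // Form 1: delay in whole seconds, e.g. "30".
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  // Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return fallbackMs;
  return Math.max(0, dateMs - nowMs); // Never return a negative delay.
}
```

On a 429 response, a caller could wait for retryDelayMs(headerValue, Date.now()) milliseconds before retrying, ideally with a cap on the number of attempts.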
Conduct end-to-end tests that cover the entire user operation flow to verify that multiple API endpoints work together correctly.