Date | 30/04/2025, 17.30 - 17.45 |
---|---|
Participants | • Michael Suthirta<br>• Reynard Adrian<br>• Nathanael Mahardika Susilo<br>• Gilmore<br>• Muhammad Rafi Isnaen |
Tasks | Descriptions |
---|---|
Research speech-to-text (STT), text-to-speech (TTS), and LLAMA APIs (decide which third-party STT, TTS, and LLAMA APIs to use and how to integrate them). | • Selected the services for text-to-speech, speech-to-text, and the AI cloud API: Android TTS API, Vosk, and Hugging Face Inference API.<br>• Look into how these services are integrated into the project.<br>Vosk captures the user's voice and transcribes it; the text is forwarded to the Hugging Face Inference API, which generates a reply based on the user's command. The reply is then read aloud by the built-in TTS engine, triggered through the Android TTS API.<br>Brief flow: Vosk → Hugging Face Inference API → Android TTS API. |
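
The brief flow above (Vosk → Hugging Face Inference API → Android TTS API) could be wired together roughly as in the sketch below. This is only an illustration of the three-stage hand-off, not the team's implementation: the interface and class names (`SpeechToText`, `ReplyGenerator`, `SpeechOutput`, `VoicePipeline`) are hypothetical, and the demo uses stand-in lambdas where the real Vosk, Hugging Face, and Android TTS calls would go.

```java
import java.util.ArrayList;
import java.util.List;

// Demo with stand-in stages; a real app would plug in Vosk, an HTTP client
// for the Hugging Face Inference API, and Android's TextToSpeech here.
final class VoicePipelineDemo {
    public static void main(String[] args) {
        List<String> spoken = new ArrayList<>();
        VoicePipeline pipeline = new VoicePipeline(
                audio -> "what time is it",      // fake transcription
                text -> "You asked: " + text,    // fake model reply
                spoken::add);                    // records what would be spoken
        String reply = pipeline.handleUtterance(new byte[0]);
        System.out.println(reply);
        System.out.println(spoken);
    }
}

// Hypothetical abstraction over the speech-to-text stage (Vosk in the notes).
interface SpeechToText {
    String transcribe(byte[] pcmAudio);
}

// Hypothetical abstraction over the model stage (Hugging Face Inference API).
interface ReplyGenerator {
    String reply(String userText);
}

// Hypothetical abstraction over the text-to-speech stage (Android TTS API).
interface SpeechOutput {
    void speak(String text);
}

final class VoicePipeline {
    private final SpeechToText stt;
    private final ReplyGenerator llm;
    private final SpeechOutput tts;

    VoicePipeline(SpeechToText stt, ReplyGenerator llm, SpeechOutput tts) {
        this.stt = stt;
        this.llm = llm;
        this.tts = tts;
    }

    // Runs one turn: audio -> text -> reply -> spoken output.
    // Returns the reply text, or an empty string if nothing was recognized.
    String handleUtterance(byte[] pcmAudio) {
        String userText = stt.transcribe(pcmAudio);
        if (userText == null || userText.isBlank()) {
            return ""; // nothing recognized, so skip the model and TTS stages
        }
        String reply = llm.reply(userText.trim());
        tts.speak(reply);
        return reply;
    }
}
```

Keeping each stage behind its own small interface means any of the three services could be swapped or mocked for testing without touching the pipeline itself.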
Tasks | Descriptions |
---|---|
Voice input receiving system | Build a system that captures the user's voice.<br>• Small MVP app |
Create UI to display the text | • Create the UI for the small MVP app |
Speech-to-text API integration | Connect a speech-to-text API to the app |
Speech-to-text input receiving and conversion | Create a system to convert user voice into text using the API |