One-liner: Grammarly for your voice. A private, real-time mirror for how you speak, helping you develop mindfulness, confidence, and mastery in your communication.
- An AI that listens to all your voice input on desktop (for now; mobile in the future). Whether it's voice messages or speech-to-text, it listens to everything and gives real-time feedback to help you become more aware of your flaws
- Poised is the most popular, but it’s only for meetings
- Speeko does something similar, but you have to manually start and stop each recording
- soundcredible.com only focuses on filler words that you tell it to track (it’s basically a filler word tracker)
- I use wisprflow.ai to speak the text I want to type (it’s better speech-to-text than the built-in Mac dictation). My idea is that whenever we activate Wispr Flow, we also activate our AI. But it's not only that; it also covers meetings and voice messages. The user never has to manually activate our AI feedback. It activates only when another app is already using the mic. So now the question is:
- is that useful?
- Are people willing to pay for something like that? At least enough to cover the API costs for now
- The current plan is to make it open source and let people plug in their own OpenAI API key (0.20 CAD/hr)
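To make the pricing question concrete, here is a rough back-of-the-envelope sketch of what a user would pay per month at the 0.20 CAD/hr figure above. The usage levels and the 30-day month are assumptions for illustration, not measured data, and `monthly_cost_cad` is a hypothetical helper name.

```python
# Assumption: 0.20 CAD per hour of analysed mic time (the figure from these notes).
COST_PER_HOUR_CAD = 0.20


def monthly_cost_cad(mic_hours_per_day: float, days_per_month: int = 30) -> float:
    """Estimate one user's monthly API spend in CAD."""
    if mic_hours_per_day < 0 or days_per_month < 0:
        raise ValueError("hours and days must be non-negative")
    return round(mic_hours_per_day * days_per_month * COST_PER_HOUR_CAD, 2)


if __name__ == "__main__":
    # Hypothetical usage levels: light, moderate, heavy mic use.
    for hours in (0.5, 2, 4):
        print(f"{hours} h/day -> {monthly_cost_cad(hours):.2f} CAD/month")
```

Even heavy use stays in the range of a typical app subscription under these assumptions, which is worth keeping in mind for the willingness-to-pay question.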
- Why would this be good to test?
- I have a seven-day Pro subscription for all these communication apps, but I can’t be bothered to use them to improve my communication skills. Mostly because my skills are already good enough, and the work left is being more mindful, practicing, being intentional about what to improve, and somehow being reminded of that as I go through my day
- Also, no matter how much we train, it is very hard to replicate the experience of in-person, real-time interactions in day-to-day life. Our bodies behave very differently when speaking in public or in social situations vs. sitting on the couch talking to ourselves. This is why I really can’t see myself using any of these platforms for long. I just want something that can analyze these real-life situations for me and give me feedback on both the content of what I’m saying and the way I’m saying it.
- Maybe that’s all that we need. At least for people who are good, but not amazing
- I remember watching a video from YC about how a lot of companies bent the rules on what is permissible. They exploited the gray area. Maybe that’s what we should do
- We could build a local model that captures only the user's voice, so nothing else gets transmitted or shared to the cloud. This is for the watch and phone integration.
- If we want a real-time feature as well, we could make the phone or watch vibrate when the AI detects a flaw in the user's speech. This could be too distracting for some; not sure
- The way I see this being useful is as someone who always has your back and is never afraid to give you feedback on what you said and how it could be improved.
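The "only active while another app is using the mic" behaviour described earlier can be sketched as a small state tracker. This is a minimal sketch, not a design decision: `MicGate` is a hypothetical name, and the boolean samples stand in for an OS-specific check of whether the microphone is in use, which is not shown here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MicGate:
    """Opens an analysis session when the mic becomes busy, closes it when it frees up."""

    active: bool = False
    events: List[str] = field(default_factory=list)

    def update(self, mic_in_use: bool) -> None:
        # Only react to state changes, so repeated samples do not restart a session.
        if mic_in_use and not self.active:
            self.active = True
            self.events.append("start")
        elif not mic_in_use and self.active:
            self.active = False
            self.events.append("stop")


if __name__ == "__main__":
    gate = MicGate()
    # Hypothetical polling samples: idle, dictation starts, continues, ends, idle, meeting starts.
    for sample in [False, True, True, False, False, True]:
        gate.update(sample)
    print(gate.events)  # ['start', 'stop', 'start']
```

The user never triggers anything themselves; whatever app grabs the mic (dictation, a meeting, a voice message) drives the transitions.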
The Science behind why this works
- Disconnection between speech and thought
- Sometimes our mouth gets disconnected from our brain under overwhelming external stimuli. This matches what psycholinguists call speech production disfluency under cognitive load
- In noisy or high-arousal environments, working memory is overloaded, and the prefrontal cortex loses temporary control over the speech motor plan. This causes filler words, hesitations, or word retrieval pauses. Studies: Levelt (1989), Piai & Roelofs (2013).
- Humans are blind to their micro-behaviors. Real-time reflection during natural speech aligns with research on metacognitive feedback loops (Schooler, Ericsson). Existing apps fail because they rely on contrived practice sessions.
- Mindfulness & self-awareness
- Mindfulness reduces amygdala reactivity and improves metacognitive awareness (Kiken et al., 2015). In controlled trials, mindful breathing before public speaking lowers heart rate and subjective anxiety (Keng et al., 2011).
- However, mindfulness during speech can backfire if it turns into self-monitoring anxiety (the “observer effect” seen in social anxiety research, e.g., Wells & Papageorgiou, 2001). You risk splitting attention: part on the task, part on watching yourself. Optimal state is “meta-awareness without interference.”
- the goal: practice mindfulness before and after speech; during speech, aim for presence, not analysis
- self-analysis after speaking
- Immediate self-reflection consolidates procedural learning (Ericsson & Pool, 2016). Writing down what went wrong and why creates explicit memory links that improve the next performance.
- We can frame the product as a “verbal mindfulness trainer.”
- Pre-speech breathing reminders
- We have context on their calendar. We would ping them to do a 2 min breathing/speaking exercise before the meeting begins
- Could be very powerful if paired with an Apple Watch and its data, like heart rate or HRV (if heart rate is too high, it prompts you to take deep breaths)
- Post-speech reflection: it lists everything that could be worked on and helps you reflect on what is happening and why
- Weekly summary showing:
- reduction in autopilot speech moments
- your scores in different aspects of speech like clarity, confidence, etc…
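One possible way to back the weekly summary with a number is a filler rate per 100 words, computed from transcripts. This is a minimal sketch under assumptions: the `FILLERS` list and the `filler_rate` helper are hypothetical, and naive matching counts every "like", including legitimate uses, so a real version would need context.

```python
import re
from typing import Iterable

# Hypothetical filler list; a real one would be tuned per user.
FILLERS = ("um", "uh", "like", "you know")


def filler_rate(transcript: str, fillers: Iterable[str] = FILLERS) -> float:
    """Return filler occurrences per 100 words (0.0 for an empty transcript)."""
    text = transcript.lower()
    words = re.findall(r"[a-z']+", text)
    if not words:
        return 0.0
    # Whole-word matching so "um" does not match inside "umbrella".
    hits = sum(len(re.findall(rf"\b{re.escape(f)}\b", text)) for f in fillers)
    return round(100 * hits / len(words), 1)


if __name__ == "__main__":
    print(filler_rate("Um, I think, like, we should uh ship it"))  # 33.3
```

Comparing this rate week over week gives one rough proxy for "autopilot speech moments"; clarity and confidence scores would need separate signals.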
- While mindfulness is effective, we are still influenced by the blind-spot bias (Pronin, 2007). People miss many of their own vocal patterns: pacing, pitch, filler distribution, turn-taking. Objective feedback (AI or human coach) complements self-awareness. In other words, hybrid is the best approach
- allowing flow during speech (no inner commentary). How can we achieve/enable that? How can we quiet the voice that judges us as we’re speaking? The solution seems to be:
- Mindfulness before
- presence during
- reflection after
How should we provide feedback
- Research on attentional load suggests that feedback is more effective when it is slightly delayed rather than delivered mid-sentence.
- Best option: provide feedback once the user has been silent for about 3 seconds, in other words after they have finished speaking
- For the desktop app, feedback appears in a silent popup
- Haptic nudges only for chronic filler use or rising vocal tension, e.g. when the user has completely lost control of their speech, we would show a message to help ground them (technically, this would be hard to detect and pinpoint)
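The "wait for about 3 seconds of silence" rule can be sketched as follows. This is a minimal sketch assuming speech has already been split into timed segments by some voice-activity detector, which is not shown; `feedback_times` and its parameters are hypothetical names.

```python
from typing import Iterable, List, Tuple

PAUSE_SECONDS = 3.0  # the pause length proposed in these notes


def feedback_times(
    segments: Iterable[Tuple[float, float]],
    session_end: float,
    pause: float = PAUSE_SECONDS,
) -> List[float]:
    """Return the moments (in seconds) at which feedback may be shown.

    `segments` are (start, end) speech intervals sorted by time. Feedback is
    allowed `pause` seconds after a segment ends, but only if the user has not
    started speaking again before then.
    """
    segs = list(segments)
    times: List[float] = []
    for i, (_, end) in enumerate(segs):
        next_start = segs[i + 1][0] if i + 1 < len(segs) else session_end
        if next_start - end >= pause:
            times.append(end + pause)
    return times


if __name__ == "__main__":
    # Hypothetical session: a 1 s breath between the first two segments, then longer pauses.
    print(feedback_times([(0.0, 4.0), (5.0, 9.0), (13.0, 15.0)], session_end=20.0))
    # [12.0, 18.0]
```

Short breaths between sentences do not trigger the popup; only a real stop does, which keeps the feedback out of the way while the user is still talking.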
Our differentiator
- We blend 3 things:
- journaling & reflection (Daylio, Stoic)
- speech analytics (Poised, Yoodli)
- mindfulness (Calm, Headspace)
- It’s open-source & fully private