What about Apple?

written by
Sahaj Garg
CTO, Wispr Flow
Date
Sep 10, 2025
Read time
3 mins

One of the most frequent questions we hear from customers, prospective team members, and investors is: what about Apple? Why won’t they just build Wispr Flow? After all, my laptop does come with a button to speak to it, and Siri seems like an important part of Apple’s strategy!

We hardly worry about this question here at Wispr for the following reasons:

  1. No upper bound on quality and engagement. For voice dictation to stick, it has to figure out what you wanted to write on the first try, every time. Wispr Flow has been moving in that direction, but is far from perfect out of the box (see more about our open problems and challenges with ASR). The better a system understands a user out of the box, the less likely they are to switch back to a product that kinda sorta gets them… some of the time. Think about how many of your friends truly get you — we want a voice interface that gets you at that level.
  2. Structural business reasons. The accuracy requirements outlined above mean that we actually have to run reasonably large models in the cloud, and we haven’t saturated the gains from deploying larger models at lower and lower latency. This means material cost for running voice dictation — low enough that it still makes for a good business, but high enough that Apple couldn’t offer the product for free at scale. This disincentivizes Apple (and other players) from innovating on voice dictation and voice interfaces that meet the quality bar for high engagement, because shipping them would require major business upheaval.
  3. Horizontal product. ChatGPT’s voice mode is great in so many ways, but it doesn’t work natively in every application that I use. As a user, I want a voice interaction that I can bring with me anywhere I’m interacting with my device - not just in a single app. If I had a new keyboard layout for every app I worked with, I’d be so annoyed!
  4. From voice dictation to voice interface. Building a sticky voice interaction is something that almost nobody has solved. That’s because almost all voice interfaces have focused on cool features over habit-forming, sticky interactions. Teaching users how to use an invisible interface is another challenge of its own. We see several opportunities in building out a voice interface that you can read about here.
  5. Problem scope. We’re working on upleveling human communication & interfacing with devices. If Wispr started and ended with voice dictation, then Apple might be able to enter the market late and dominate. But so long as we’re working in a problem area that’s never been “solved” before (and probably never will be), there will always be impactful things to work on.
  6. Different niche markets. Flow works horizontally, but many different (large) markets experience pain points with voice, from doctors to lawyers to users with accessibility needs to customer support teams to business API customers. Nobody I know today is happy with the current offerings of voice products, whether it be $20Bn giants like Nuance Communications that serve medical dictation markets or ASR providers like OpenAI or Deepgram. There’s so much room for specialized solutions in each of these areas — each an individually massive business — even if Apple manages to innovate.
  7. Execution speed. With AI products in the modern-day software ecosystem, what matters most is execution velocity: obsessing over customers, building a sticky product, and solving problems. The launch of the revamped Apple Intelligence / Siri has been delayed year after year. If we don’t build a good voice interface, who will?

If you’re excited to shape human behavior and solve incredibly challenging technical problems along the way, reach out! The problems in voice AI are far from solved: jobs.wisprflow.ai.
