It’s an exciting time for web APIs, and one to watch out for is the Web Speech API. It enables websites and web apps not only to speak to you, but to listen, too. It’s still early days, but this functionality is set to open a whole array of use cases.

Speech Synthesis

If your website has textual content (body copy, form inputs, alt text and so on), a few function calls are enough to have the device speak those words to the user. You can stop, start and pause the queue, as well as set the language, rate and voice for each utterance.
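As a rough illustration of setting the language, rate and voice on an utterance, here is a minimal sketch. The `SpeakOptions` shape and the `speak` and `clampRate` helpers are hypothetical names made up for this example; only `speechSynthesis`, `SpeechSynthesisUtterance` and `getVoices()` come from the Web Speech API itself. The globals are reached through `globalThis` so the function simply returns `false` where the API is missing.

```typescript
// Hypothetical options bag for this sketch; not part of the Web Speech API.
interface SpeakOptions {
  lang?: string;      // language tag such as "en-GB"
  rate?: number;      // speaking rate; 1 is the default
  voiceName?: string; // name of an installed voice, if one is wanted
}

// The spec allows rates between 0.1 and 10, so clamp anything outside that.
function clampRate(rate: number): number {
  return Math.min(10, Math.max(0.1, rate));
}

// Queues `text` for speaking. Returns false when the API is unavailable.
function speak(text: string, options: SpeakOptions = {}): boolean {
  const g = globalThis as any;
  if (!g.speechSynthesis || !g.SpeechSynthesisUtterance) {
    return false;
  }

  const utterance = new g.SpeechSynthesisUtterance(text);
  if (options.lang) {
    utterance.lang = options.lang;
  }
  if (options.rate !== undefined) {
    utterance.rate = clampRate(options.rate);
  }
  if (options.voiceName) {
    // getVoices() can return an empty list until the browser has loaded its voices.
    const voice = g.speechSynthesis
      .getVoices()
      .find((v: any) => v.name === options.voiceName);
    if (voice) {
      utterance.voice = voice;
    }
  }

  g.speechSynthesis.speak(utterance); // adds the utterance to the speech queue
  return true;
}
```

In a browser, `speak("Hello there", { lang: "en-GB", rate: 1.2 })` would queue the sentence, and the queue itself can then be controlled with `speechSynthesis.pause()`, `speechSynthesis.resume()` and `speechSynthesis.cancel()`.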

Speech Recognition

Speech recognition enables the user to speak into the device’s microphone and have their speech recognized by the website or web app.
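To give a feel for the recognition side, here is a minimal sketch that listens for a single phrase and hands the transcript to a callback. The helper names `getRecognitionCtor` and `listenOnce` are hypothetical; the sketch assumes the constructor is exposed either as `SpeechRecognition` or under Chrome's prefixed name `webkitSpeechRecognition`, and it returns `false` where neither exists.

```typescript
// Looks up the recognition constructor, allowing for the prefixed Chrome name.
function getRecognitionCtor(scope: any = globalThis): any {
  return scope.SpeechRecognition ?? scope.webkitSpeechRecognition ?? null;
}

// Starts one recognition session and passes the recognized text to onText.
// Returns false when speech recognition is not available.
function listenOnce(
  onText: (transcript: string) => void,
  lang: string = "en-US"
): boolean {
  const Ctor = getRecognitionCtor();
  if (!Ctor) {
    return false;
  }

  const recognition = new Ctor();
  recognition.lang = lang;
  recognition.interimResults = false; // only report the final result

  recognition.onresult = (event: any) => {
    // results[0][0] is the most likely alternative of the first result.
    onText(event.results[0][0].transcript);
  };

  recognition.start(); // the browser will typically ask for microphone permission
  return true;
}
```

A call such as `listenOnce((text) => console.log(text))` would log whatever the user says next, provided the browser supports recognition and the user grants microphone access.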


Dictation – the user speaks into the mic and the device transcribes the speech into text.

Voice Control – Dictation could easily be turned into voice control, which could be modified to allow for navigation around a website.

Translation – Translation would look very different when done in real time. Someone could converse in one language, and another person’s device would speak out what is being said in their own language.
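The voice control idea above can be sketched as a lookup from a recognized phrase to a page on the site. Everything here is hypothetical and purely for illustration: the `routes` table, its phrases and paths, and the `matchCommand` helper are invented, and a real site would likely need fuzzier matching than an exact phrase lookup.

```typescript
// Hypothetical command table: spoken phrase -> path on the site.
const routes = new Map<string, string>([
  ["go home", "/"],
  ["open contact", "/contact"],
  ["show blog", "/blog"],
]);

// Normalizes a transcript and returns the matching path, or null if none matches.
function matchCommand(
  transcript: string,
  table: Map<string, string>
): string | null {
  const normalized = transcript
    .trim()
    .toLowerCase()
    .replace(/[.,!?]/g, ""); // recognizers often add punctuation
  return table.get(normalized) ?? null;
}
```

In a browser, the returned path could then be assigned to `window.location.href` whenever `matchCommand` finds a match for the dictated text.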


Offline capability needs more consideration. As it stands, Chrome sends the recorded audio to its servers and pings back the result. Thus, an Internet connection is needed for it to work — not ideal.


Nevertheless, it is still exciting, and the technology is opening up. I look forward to the day when looking for the remote is a thing of the past, and I can just tell the TV to stream the latest Sin City movie. Would we actually use the web for this? Why not? It’s already universal, and you can take the web and its speech wherever you go. Some people can’t see a need for it on the web, and others would feel uncomfortable talking to their device. Both are valid views.

Hopefully, though, this has inspired you to at least give it a go and to think about it the next time you are building something. Start welcoming speech: it might be just what you’re listening for.