Until I tried it myself, it puzzled me why the browser's text-to-speech API is so underused. Now I understand: it's just not up to a high standard.
It's really easy to start using it. Here's the bare minimum:
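if (window.speechSynthesis) {
  const utterance = new SpeechSynthesisUtterance("Hello, I love you. Won't you tell me your name?");
  // getVoices() can return an empty list until the voices have loaded;
  // with no voice set, the browser falls back to its default voice.
  const voices = window.speechSynthesis.getVoices();
  if (voices.length > 0) {
    utterance.voice = voices[0];
  }
  window.speechSynthesis.speak(utterance);
}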
You can find all the available docs here.
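One thing the bare minimum glosses over: in some browsers getVoices() returns an empty list on the first call, and the voices only show up later, after a voiceschanged event. Here's a rough sketch of how I'd wait for them and pick one by language. The helper names pickVoice and loadVoices and the timeoutMs parameter are my own inventions, not part of the API, and the one-second timeout is an arbitrary guess.

```javascript
// Hypothetical helper: pick the first voice whose language starts with
// langPrefix, else the voice flagged as default, else the first one, else null.
function pickVoice(voices, langPrefix) {
  const prefix = langPrefix.toLowerCase();
  return (
    voices.find((v) => v.lang.toLowerCase().startsWith(prefix)) ||
    voices.find((v) => v.default) ||
    voices[0] ||
    null
  );
}

// Hypothetical helper: resolve with the voice list once it is non-empty.
// `synth` is expected to behave like window.speechSynthesis.
function loadVoices(synth, timeoutMs = 1000) {
  return new Promise((resolve) => {
    const ready = synth.getVoices();
    if (ready.length > 0) {
      resolve(ready);
      return;
    }
    // Stop waiting after timeoutMs and resolve with whatever is there.
    const timer = setTimeout(() => resolve(synth.getVoices()), timeoutMs);
    // Assumption: the browser fires voiceschanged; if it cannot be
    // listened for, the timeout above is the only fallback.
    if (typeof synth.addEventListener === "function") {
      synth.addEventListener(
        "voiceschanged",
        () => {
          clearTimeout(timer);
          resolve(synth.getVoices());
        },
        { once: true }
      );
    }
  });
}

// Browser-only usage; skipped where there is no window object.
if (typeof window !== "undefined" && window.speechSynthesis) {
  loadVoices(window.speechSynthesis).then((voices) => {
    const utterance = new SpeechSynthesisUtterance("Hello, I love you.");
    utterance.voice = pickVoice(voices, "en");
    window.speechSynthesis.speak(utterance);
  });
}
```

Keeping pickVoice separate from the waiting logic means the selection rule can be tried out against a plain array of objects, without a browser.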
I added it to my blog, because sometimes I'd rather listen to an article than read it. But the end result is not as great as I imagined. The voices sound very robotic. The implementation varies between platforms, but there's always a very distinct metallic flavour. It's fun. But the selection of actually bearable voices is very small.
Another inconvenience is that the API is limited in practice. For example, according to the docs there should be pause() and resume() methods and a paused property on the speechSynthesis object. But for me they were just not there. So you can only start and stop the speaker, without the ability to pause and resume. In real-life scenarios, that's a huge UX flaw. Simply unusable.
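If you want to check what your own browser does, here's a minimal sketch of a pause/resume toggle that feature-detects the methods before calling them. The spec defines pause(), resume(), speaking and paused on speechSynthesis; the togglePause helper and its return values are hypothetical names of mine. Note that it only tells you the methods exist, not that the browser actually honours them.

```javascript
// Hypothetical helper: toggle between speaking and paused.
// `synth` is expected to behave like window.speechSynthesis.
// Returns what was done: "paused", "resumed", "idle" or "unsupported".
function togglePause(synth) {
  if (typeof synth.pause !== "function" || typeof synth.resume !== "function") {
    return "unsupported";
  }
  // Check paused first: per the spec, speaking stays true while paused.
  if (synth.paused) {
    synth.resume();
    return "resumed";
  }
  if (synth.speaking) {
    synth.pause();
    return "paused";
  }
  // Nothing is being spoken, so there is nothing to pause or resume.
  return "idle";
}

// Browser-only usage; skipped where there is no window object.
if (typeof window !== "undefined" && window.speechSynthesis) {
  console.log(togglePause(window.speechSynthesis));
}
```

Wired to a button, "unsupported" would be the cue to hide the pause control and fall back to plain start and stop.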
But at least I finally tried it, and now I can recommend, with great confidence, that nobody use it.
