1 Introduction to voice first

This chapter covers

  • Understanding voice first
  • Designing for voice
  • Picturing how computers listen to speech
  • Understanding how computers speak

The best technologies come into our lives in two ways: gradually, and then suddenly. This is how it went with voice. For decades we dreamed of computers we could speak with. We gradually got some of that, though it was narrowly domain-specific and not something most people interacted with daily. Then, all of a sudden, Amazon released a product that has millions of people wondering how they ever got by without it. The Echo brought voice-first computing into our daily lives.

If you’re reading this book, you’re probably already aware of the draw of voice-first platforms. Perhaps you even have a voice-first device, like an Amazon Echo or Google Home, in your living room. Because of this, it can be difficult to remember just how out-there the idea of voice first was when Amazon announced the Echo a few years back. If you were to look at articles commenting on the announcement and count the number of paragraphs before you saw a variation on the phrase “no one saw this coming,” you wouldn’t be counting for long.

1.1 What is voice first?

1.2 Designing for voice UIs

1.3 Anatomy of a voice command

1.3.1 Waking the voice-first device

1.3.2 Introducing natural language processing

1.3.3 How speech becomes text

1.3.4 Intents are the functions of a skill

1.3.5 Training the NLU with sample utterances

1.3.6 Plucking pertinent information from spoken text

1.4 The fulfillment code that ties it all together

1.5 Telling the device what to say