Detecting speech inputs

GetInput’s automatic speech recognition (ASR) feature accepts both unstructured and structured speech input from users. Structured inputs, in the form of keywords and commands, suit use cases that offer users a finite set of distinct operations to choose from, such as interactive voice response (IVR). Adding speech detection to DTMF-driven IVR menus can improve conversions by giving users an easier way to navigate menus, as in this first example.

Example

<Response>
    <GetInput inputType="dtmf speech" action="<action url>">
        <Speak>Press 1 or say New Appointment to schedule an appointment. Press 2 or say Cancel Appointment to cancel an existing appointment.</Speak>
    </GetInput>
</Response>
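
The menu above sends the caller's input to the action URL, where your application decides which operation was chosen. The following is a minimal sketch of that routing step in Python, not Plivo's own implementation. It assumes the action URL receives the keypress in a parameter named Digits and the transcript in a parameter named Speech; confirm the actual names in the GetInput reference. The MENU table and the resolve_choice and build_reply helpers are hypothetical names introduced for illustration.

```python
import xml.etree.ElementTree as ET

# Maps each menu choice to the DTMF digit and spoken phrase that select it.
MENU = {
    "schedule": {"digit": "1", "phrase": "new appointment"},
    "cancel": {"digit": "2", "phrase": "cancel appointment"},
}


def resolve_choice(params):
    """Return the menu key selected by keypress or speech, or None."""
    # "Digits" and "Speech" are assumed parameter names, not confirmed here.
    digits = (params.get("Digits") or "").strip()
    speech = (params.get("Speech") or "").strip().lower()
    for choice, option in MENU.items():
        if digits == option["digit"] or option["phrase"] in speech:
            return choice
    return None


def build_reply(params):
    """Build the XML document to return from the action URL."""
    choice = resolve_choice(params)
    response = ET.Element("Response")
    speak = ET.SubElement(response, "Speak")
    if choice == "schedule":
        speak.text = "Okay, let's schedule a new appointment."
    elif choice == "cancel":
        speak.text = "Okay, let's cancel your appointment."
    else:
        speak.text = "Sorry, I didn't catch that."
    return ET.tostring(response, encoding="unicode")
```

Matching the phrase as a substring of the lowercased transcript lets "Cancel Appointment, please" select the same option as pressing 2, so both input types share one code path.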

Real-time transcription of free-form inputs such as complete sentences, on the other hand, helps you build conversational, AI-driven experiences.

Example

<Response>
    <GetInput inputType="speech" action="<action url>">
        <Speak>Welcome to Mary’s Hair Salon. How can I help you today?</Speak>
    </GetInput>
</Response>

An easy way to build conversational AI interfaces is to pass the transcribed speech received through the GetInput XML element to an AI chatbot platform such as Google Dialogflow for NLP-based intent extraction. To make your bot’s responses sound natural, also read about the Speech Synthesis Markup Language (SSML) engine in the Plivo Speak XML element.
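
One conversational turn can be sketched as follows: take the transcript, extract an intent, and answer with another GetInput so the caller can keep talking. This is an illustrative sketch under stated assumptions, not a definitive integration. The Speech parameter name is assumed, keyword_intent is a toy stand-in for a call to an NLP platform such as Dialogflow, and the REPLIES table, next_turn function, and example URL are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical replies keyed by intent name.
REPLIES = {
    "book_appointment": "Sure. What day works for you?",
    "cancel_appointment": "Okay. Which appointment should I cancel?",
    "fallback": "Sorry, could you rephrase that?",
}


def keyword_intent(text):
    """Toy stand-in for an NLP service: map an utterance to an intent name."""
    text = text.lower()
    if "cancel" in text:
        return "cancel_appointment"
    if "book" in text or "appointment" in text:
        return "book_appointment"
    return "fallback"


def next_turn(params, action_url, detect_intent=keyword_intent):
    """Return XML that speaks a reply and listens for the next utterance."""
    # "Speech" is an assumed parameter name for the transcript.
    transcript = (params.get("Speech") or "").strip()
    intent = detect_intent(transcript) if transcript else "fallback"
    response = ET.Element("Response")
    get_input = ET.SubElement(
        response, "GetInput", {"inputType": "speech", "action": action_url}
    )
    speak = ET.SubElement(get_input, "Speak")
    speak.text = REPLIES.get(intent, REPLIES["fallback"])
    return ET.tostring(response, encoding="unicode")


xml_reply = next_turn(
    {"Speech": "I'd like to book a haircut"}, "https://example.com/next"
)
```

Passing the intent detector in as an argument keeps the XML-building logic independent of any one NLP platform, so the keyword stub can be swapped for a real client without changing next_turn.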
