Build an AI Voice Agent by Integrating OpenAI's Real-time Speech API with Plivo

Plivo's Audio Streaming API connects a live phone call to OpenAI's Real-time Speech-to-Speech (S2S) API. With the two combined, you can build AI voice assistants that hold natural conversations, handle interruptions, and answer user queries in real time.

Get started with Plivo

Before beginning your AI voice assistant development, sign up for Plivo or sign in to your existing account. You’ll need to purchase a voice-enabled number through the Voice API or Plivo console.

Prerequisites

Ensure you have the following before starting:

Node.js version 22.6.0 or later, or Python version 3.10.5 or later, depending on which version of the sample you run
A Plivo account with a voice-enabled number
An OpenAI account (sign up here)
- Valid API key
- Access to OpenAI’s Real-time API
ngrok installed for local development testing

Clone the Plivo audio stream integration guides repository

If you are using Python:

git clone https://github.com/plivo/AI-Voice-Agents.git
cd AI-Voice-Agents/Openai-realtime-api/Python

If you are using Node.js:

git clone https://github.com/plivo/AI-Voice-Agents.git
cd AI-Voice-Agents/Openai-realtime-api/NodeJS

Set Up Your Local Environment

Create a Tunnel with ngrok

For local development, you'll need a public URL to receive webhooks. Open a terminal and run:

ngrok http 5000

Copy the Forwarding URL (format: https://[your-ngrok-subdomain].ngrok.app). You'll need it for the Plivo Answer XML.

Note: Port 5000 is this application's default. If you change PORT in index.js (Node.js) or server.py (Python), update the ngrok command to match. Each new ngrok session creates a new URL, so update your configuration whenever you restart ngrok.

Install Required Packages

If you are using Python:

pip install -r requirements.txt

If you are using Node.js:

npm install

Configure Environment Variables

Create a .env file in your project root and set up the following:

Add Plivo Credentials

PLIVO_AUTH_ID=<YOUR_PLIVO_AUTH_ID>
PLIVO_AUTH_TOKEN=<YOUR_PLIVO_AUTH_TOKEN>
PLIVO_FROM_NUMBER=<YOUR_PLIVO_NUMBER>
PLIVO_TO_NUMBER=<CALLER_PHONE_NUMBER>

Add OpenAI API Key

OPENAI_API_KEY=<YOUR_OPEN_AI_API_KEY>
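A quick check before launching can catch a missing or placeholder value early. The following is a minimal sketch, assuming the .env values have already been loaded into the process environment (for example by a dotenv loader, which the sample may or may not use); missing_vars is a hypothetical helper, not part of the repository.

```python
import os

# The variable names listed in the .env template above.
REQUIRED_VARS = (
    "PLIVO_AUTH_ID",
    "PLIVO_AUTH_TOKEN",
    "PLIVO_FROM_NUMBER",
    "PLIVO_TO_NUMBER",
    "OPENAI_API_KEY",
)


def missing_vars(env) -> list[str]:
    """Return the names that are unset, empty, or still a <PLACEHOLDER>."""
    missing = []
    for name in REQUIRED_VARS:
        value = env.get(name, "")
        if not value or value.startswith("<"):
            missing.append(name)
    return missing
```

Calling missing_vars(os.environ) at startup returns an empty list when every value is present.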

Configure Answer XML

Use this template for your Plivo application’s Answer XML:

<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <Speak>Connected to AI Assistant. You may begin speaking.</Speak>
    <Stream keepCallAlive="true" audioTrack="both">
        wss://[your-ngrok-subdomain].ngrok.app/stream
    </Stream>
</Response>

Update the PLIVO_ANSWER_XML variable in your .env file with your Answer URL.
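Because the ngrok URL changes with every session, it can help to generate the Answer XML from the Forwarding URL rather than edit it by hand. This sketch rebuilds the template above with the standard library; the /stream path and the attribute values are copied from the template, and the function names are illustrative.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit


def stream_url(forwarding_url: str, path: str = "/stream") -> str:
    """Turn https://host into wss://host/stream."""
    host = urlsplit(forwarding_url).netloc
    if not host:
        raise ValueError(f"not a full URL: {forwarding_url!r}")
    return f"wss://{host}{path}"


def answer_xml(forwarding_url: str) -> str:
    """Build the Answer XML shown above for the given ngrok Forwarding URL."""
    response = ET.Element("Response")
    speak = ET.SubElement(response, "Speak")
    speak.text = "Connected to AI Assistant. You may begin speaking."
    stream = ET.SubElement(
        response, "Stream", keepCallAlive="true", audioTrack="both"
    )
    stream.text = stream_url(forwarding_url)
    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

For example, answer_xml("https://abc.ngrok.app") produces a document whose Stream element points at wss://abc.ngrok.app/stream.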

Launch Your Application

Ensure ngrok is running and you’ve noted the Forwarding URL
Verify all environment variables are properly configured
Start the application:

If you are using Python:

python server.py

If you are using Node.js:

node index.js

The application will automatically initiate a call to the number specified in PLIVO_TO_NUMBER. Once the call is answered, you can begin interacting with your AI assistant.
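For orientation, the outbound call can be sketched with the Plivo Python SDK roughly as follows. This is not the repository's code: it assumes PLIVO_ANSWER_XML holds an Answer URL that Plivo fetches with GET, and build_call_params and place_call are hypothetical helpers.

```python
def build_call_params(env) -> dict:
    """Collect the arguments for the outbound call from the .env values."""
    return {
        "from_": env["PLIVO_FROM_NUMBER"],
        "to_": env["PLIVO_TO_NUMBER"],
        "answer_url": env["PLIVO_ANSWER_XML"],  # assumption: an Answer URL
        "answer_method": "GET",                 # assumption: fetched via GET
    }


def place_call(env):
    """Start the call; requires the plivo package (pip install plivo)."""
    import plivo  # imported lazily so the helper above works without the SDK

    client = plivo.RestClient(
        auth_id=env["PLIVO_AUTH_ID"], auth_token=env["PLIVO_AUTH_TOKEN"]
    )
    return client.calls.create(**build_call_params(env))
```

Calling place_call(os.environ) after importing os places a real call and incurs charges, so run it only once the variables above are verified.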

Key Features

Your AI voice assistant includes:

Real-time audio streaming through Plivo’s WebSocket
Natural voice communication using OpenAI’s Real-time model
Intelligent interruption handling for natural conversation flow
Function calling support for enhanced capabilities
Bi-directional audio streaming for seamless interaction

Troubleshooting Guide

If you encounter issues:

Check WebSocket Connection:

Verify ngrok is running
Confirm the WebSocket URL in your Answer XML matches your ngrok URL
Check for WebSocket connection errors in your logs
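The URL comparison can be automated. This is a small sketch, assuming you have the Answer XML as a string and the current ngrok Forwarding URL; stream_host_matches is an illustrative helper, not part of the sample.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit


def stream_host_matches(answer_xml: str, forwarding_url: str) -> bool:
    """True if the <Stream> URL is wss:// and uses the ngrok host."""
    root = ET.fromstring(answer_xml)
    stream = root.find("Stream")
    if stream is None or not stream.text:
        return False
    target = urlsplit(stream.text.strip())
    expected_host = urlsplit(forwarding_url).netloc
    return target.scheme == "wss" and target.netloc == expected_host
```

A False result after restarting ngrok usually means the Answer XML still points at the previous session's URL.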

Verify Environment Setup:

Confirm all environment variables are correctly set
Ensure OpenAI API key is valid
Verify Plivo credentials are correct

Audio Issues:

Check audio stream configuration in Answer XML
Verify audio format compatibility
Monitor WebSocket data transfer logs

Next Steps

Consider these enhancements for your AI assistant:

Implement custom conversation flows
Add specific business logic through function calling
Create detailed conversation logs
Add support for multiple languages
Implement analytics and monitoring

For additional support:

Visit Plivo Documentation
Check OpenAI API Documentation
Contact Plivo Support for technical assistance
