This project is pretty much dead. OpenAI's ChatGPT voice mode is just better. That said, there is still something worth exploring in actively listening while responding.
This Express app exposes a WebSocket endpoint at `/media`. This is the endpoint that Twilio streams call audio to. The audio is then sent to the NLP Cloud service for transcription. Transcription is fast enough to run in real time, so the app can act on the transcript while the call is still in progress.
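To make the flow above concrete, here is a minimal sketch of how messages arriving on the `/media` socket could be handled. It is not taken from this repo: `createCallState`, `handleMessage`, and `collectedAudio` are hypothetical names, and the sketch assumes the standard Twilio Media Streams message shape (JSON text frames with an `event` field, and base64-encoded audio in `media.payload`). Forwarding the audio to NLP Cloud is deliberately left out.

```javascript
// Sketch only: hypothetical helpers, not this repo's actual code.
// Assumes Twilio Media Streams JSON frames: connected, start, media, stop.

function createCallState() {
  // Per-call state: the stream id, the audio chunks seen so far, and a done flag.
  return { streamSid: null, chunks: [], done: false };
}

function handleMessage(state, raw) {
  const msg = JSON.parse(raw);
  switch (msg.event) {
    case "connected":
      // First frame on the socket; nothing to record yet.
      break;
    case "start":
      // Stream metadata arrives once per call.
      state.streamSid = msg.streamSid;
      break;
    case "media":
      // Audio arrives as base64 text; decode it into raw bytes.
      state.chunks.push(Buffer.from(msg.media.payload, "base64"));
      break;
    case "stop":
      // The call (or the stream) has ended.
      state.done = true;
      break;
    default:
      // Ignore event types this sketch does not handle.
      break;
  }
  return state;
}

function collectedAudio(state) {
  // All audio bytes received so far, in arrival order.
  return Buffer.concat(state.chunks);
}
```

In a real server, `handleMessage` would be called from the WebSocket `message` handler, and the bytes from `collectedAudio` (or each chunk as it arrives) would be passed on to the transcription service.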
You will need a Twilio account to use this app.
You will also need to set up a Twilio phone number to be able to make a call.
Finally, you will need to install ngrok to expose your local server to the internet.
- Clone this repo
- Run `npm install`
- Create a `.env` file in the root of the project and add the following: `NLP_CLOUD_TOKEN=your_nlp_cloud_token`
- Run `npm start`
- Run `ngrok http 8080` to expose your local server to the internet (you can use any port you want, but you will need to change the port in the ngrok command and in the TwiML app).
- Create a TwiML app and set the Voice URL to the ngrok URL.
- Set the TwiML app to run when a call comes in.
- Call your Twilio phone number.
You should receive a text message with the transcription of the call.
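A common failure during setup is a missing or misspelled token in `.env`. The sketch below shows one way the configuration from the steps above could be parsed and checked at startup. It is illustrative only: `parseEnv` and `loadConfig` are hypothetical helpers, and the `PORT` variable is an assumption (the steps above only mention port 8080 in the ngrok command).

```javascript
// Sketch only: hypothetical helpers, not this repo's actual code.

function parseEnv(text) {
  // Parse simple KEY=value lines; skip blank lines and # comments.
  const env = {};
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === "" || trimmed.startsWith("#")) continue;
    const eq = trimmed.indexOf("=");
    if (eq === -1) continue;
    env[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim();
  }
  return env;
}

function loadConfig(env) {
  // Fail fast with a clear message if the token is absent.
  if (!env.NLP_CLOUD_TOKEN) {
    throw new Error("NLP_CLOUD_TOKEN is missing; add it to .env");
  }
  // PORT is an assumed, optional override; 8080 matches the ngrok step.
  const port = Number(env.PORT || 8080);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error("PORT must be an integer between 1 and 65535");
  }
  return { nlpCloudToken: env.NLP_CLOUD_TOKEN, port };
}
```

If the port is changed here, the same port has to be used in the ngrok command.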
To tell Twilio about this endpoint, create a TwiML bin and set the Voice URL to this endpoint.
Example of a TwiML app:
<?xml version="1.0" encoding="UTF-8"?>
<Response>
<Start>
<Stream url="wss://****-**-**-**-****.ngrok.io/media" />
</Start>
<Say>
Connected to Socket, about to wait for a bit.
</Say>
<Pause length="20" />
<Say>
All done waiting. Good bye.
</Say>
</Response>
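Because the ngrok hostname usually changes between sessions, it can be handy to generate this TwiML instead of editing it by hand. The following is a small sketch under that assumption: `buildTwiml` and `escapeXml` are hypothetical helpers, and the hostname in the usage note is a placeholder, not a real tunnel.

```javascript
// Sketch only: hypothetical helpers, not this repo's actual code.

function escapeXml(value) {
  // Escape characters that are unsafe inside an XML attribute value.
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function buildTwiml(host, pauseSeconds = 20) {
  // The stream URL points at the /media WebSocket endpoint of this app.
  const url = `wss://${host}/media`;
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    "<Start>",
    `<Stream url="${escapeXml(url)}" />`,
    "</Start>",
    "<Say>Connected to Socket, about to wait for a bit.</Say>",
    `<Pause length="${Number(pauseSeconds)}" />`,
    "<Say>All done waiting. Good bye.</Say>",
    "</Response>",
  ].join("\n");
}
```

For example, `buildTwiml("example.ngrok.io")` returns a document whose `<Stream>` URL is `wss://example.ngrok.io/media`; replace the placeholder host with the one ngrok prints.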
After you have created the TwiML app, you can assign it to a Twilio phone number at https://www.twilio.com/console/phone-numbers/incoming
Set the TwiML app to run when a call comes in.
MIT - This is free software. Feel free to modify and redistribute it. If you find it useful, please consider giving it a star on GitHub.
If you would like to contribute to this project, please fork the repo and submit a pull request.