Exploring OpenAI's New Features: Text to Speech
I'm having such a fun time using the new features added to the OpenAI API. It's November 7th, this was all added yesterday, and in this Bubble tutorial video I'm going to demonstrate to you how we can add text to speech. And honestly, this is some of the best text to speech that I've heard. I was saying to my family last week, in fact, that we've now got AI text generation that is often better than 9 out of 10 people in the room at writing English well. And we've got image APIs that are now getting so close to being able to provide photo-realistic images that came from nothing, that came purely from the model. But what we have been missing is the ability for AI to speak to us in a way that is convincingly human. And I think that OpenAI have got really close with this one.
About Planet No Code: Your Bubble Education Resource
But before I launch into it, we are Planet No Code and we are a Bubble education resource. If you want to build a SaaS online, if you want to launch a business online and you are not a coder, then look no further because we've got hundreds of videos using this platform here. It's called Bubble. It allows you to build software without using any code, or maybe just a little bit of code. But when you do need code, I'm here to explain it. And this is what this app does.
Demonstrating OpenAI's Text to Speech Integration in Bubble
So I will click speak and it submits the text. "The quick brown fox jumps over the lazy dog." And hopefully that came through. But you see, I've got an embedded audio player here. Now let me show you exactly what's going on. We're using this guide here from OpenAI. And so I'm in their documentation, I'm in their text to speech section. And we basically need to take this example request and plug it into our Bubble app.
Setting Up OpenAI API in Bubble
So let me show you how I've done that. If I go into plugins, I've got my OpenAI API setup here. And this is all in the Bubble API Connector plugin. Now there are loads of plugins available for Bubble, but often if they're integrating with a third-party service, in this case OpenAI, they're not actually adding anything that you can't do yourself by looking at that service's API documentation and building the call up yourself.
Configuring OpenAI API Authentication
So we've got the label I've given it, OpenAI. We need to authenticate the call, and we authenticate it with the private key in the header, using the header name Authorization with the word Bearer preceding our API key. How do I know that? Well, I go back into the documentation and I see that in the header of the call, I have Authorization: Bearer followed by the API key. I also need the Content-Type of application/json and my endpoint here, /v1/audio/speech. And so you will see that in my Bubble app, I've got my endpoint, and I've got it set as an action because I want to be able to trigger this in a workflow. I'm going to show you that in a moment. I say that the returning data is a file because OpenAI just responds with an MP3 file, ready to play the audio.
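If it helps to see the same call outside of Bubble, here is a minimal sketch in Python of what the API Connector is being set up to send. The function name `build_speech_request` and the placeholder key are made up for illustration, and the request is only built here, not sent.

```python
import json
import urllib.request

# Full URL for the v1 audio speech endpoint mentioned in the video.
OPENAI_SPEECH_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request with the two headers from the docs."""
    return urllib.request.Request(
        OPENAI_SPEECH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # The private key goes in the header, preceded by the word Bearer.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "sk-your-key-here" is a placeholder, not a real key.
request = build_speech_request("sk-your-key-here", {"input": "Hello"})
print(request.get_method(), request.full_url)
```

Sending that request with a real key and a complete body would return the raw MP3 bytes, which is why the call is set to return a file in Bubble.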
Setting Up the Body Section
And then in the body section here, I look to the documentation and all I did was copy everything here, not including the outside quote marks, but everything within the curly brackets, including the brackets themselves. And I also changed the voice because, having listened to all of these yesterday, I think the best one is Onyx. But leave a comment below if you think that a different one of the voices is more convincing. I'd love to see and read your thoughts there. And then I've just added in, using the angle brackets, basically a merge tag or a variable or a dynamic value. And I've got my text in here, and then I clicked initialize and that told me I didn't have any issues or errors. And that then means that OpenAI text to speech becomes an action that I can add into my workflow.
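As a rough sketch of what that body looks like once the dynamic value is filled in, here it is built in Python. The `model`, `input` and `voice` keys and the `tts-1` model name follow the OpenAI documentation example as I recall it, so treat them as assumptions and check them against the current docs. The helper name `build_speech_body` is hypothetical.

```python
import json


def build_speech_body(text: str, voice: str = "onyx") -> str:
    """Return the JSON body for the speech call with the text filled in."""
    body = {
        "model": "tts-1",  # assumed from the docs example
        "input": text,     # the dynamic value, <text> in the Bubble body
        "voice": voice,    # the video swaps the docs default for Onyx
    }
    return json.dumps(body)


print(build_speech_body("The quick brown fox jumps over the lazy dog."))
```

In Bubble the same structure is typed directly into the body field, with the angle-bracket placeholder sitting where `text` is passed in here.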
Adding OpenAI Text to Speech Action into Bubble Workflow
So I've got my text box here and I've got my button. If I were building this from scratch, I would add a workflow to the button, go to plugins, choose text to speech and add it in. And it would look just like this. I then say here is the text that I'm sending over. And remember, it replies back with a file, and I knew that would happen because the output is just an MP3 file. And so I have to do something with that in order for Bubble to be able to work with it. And I'm just using a custom state.
Understanding Custom States in Bubble
Now custom states are a way of temporarily storing data. Surely the file is saved somewhere, and it is. What I mean by temporarily storing data is that I'm not creating an entry in my database for the file. Okay, that's a little bit confusing, but the file itself does have to be stored somewhere because I need to be able to refer to it, and you can see here that the files are being saved to my Bubble app's storage. But I need a way of retrieving the file, so I need to know the location of the saved file. And that's why I've got a custom state.
Setting Up HTML5 Audio Player
And so I've got a custom state on the page of type file and I've labeled it file, and my page is called TTS for text to speech. And so I say set state of element TTS, the custom state labeled file, and the value is just the result of my first call. Now, how do I get it to autoplay and how have I got this audio player here? Well, it's just an HTML5 audio player. And so I've got my audio element here and I say show the controls and also autoplay. And that just means that as soon as I put a file in there it's going to play it. And then the source is my custom state's URL. And I just copied this, I think, from a website that gave me the code. And so there's that little disclaimer in there about the browser not supporting the audio element, which nowadays would only really apply to very old browsers.
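The exact markup isn't read out in the video, so the following is only a sketch of the kind of HTML5 audio snippet being described, generated here from Python. The helper name `audio_player_html`, the `<source>` element with an `audio/mpeg` type, and the example URL are all assumptions.

```python
from html import escape


def audio_player_html(file_url: str) -> str:
    """Build an HTML5 audio player that shows controls and autoplays the file."""
    src = escape(file_url, quote=True)  # keep the URL safe inside the attribute
    return (
        "<audio controls autoplay>\n"
        f'  <source src="{src}" type="audio/mpeg">\n'
        "  Your browser does not support the audio element.\n"
        "</audio>"
    )


# Hypothetical file URL standing in for the custom state's URL.
print(audio_player_html("https://example.com/speech.mp3"))
```

In the Bubble HTML element the markup is pasted in directly, with the dynamic expression for the custom state's URL in place of the `src` value.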
Testing OpenAI's Text to Speech in Bubble
Right, let's give it another test. So if I refresh the page, you'll see that there is not a file. There is nothing to play. But if I say speak: "The quick brown fox jumps over the lazy dog." There we go. And let's take a bit more of a weighty piece of text and try that instead. Now, I reckon I'll get a syntax error here. "To be or not to be, that is the question. Whether it is nobler in the mind to suffer the slings and arrows of outrageous fortune." OK, I was wondering if I'd get a JSON syntax error because of the use of the colon here, where I would need to make the text JSON safe. But it seems like it actually works just fine.
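To show why that test passed, and what kind of text would not, here is a small sketch in Python that mimics dropping raw text into a JSON body template. The template and helper names are hypothetical. A colon inside a JSON string is harmless, but an unescaped double quote or line break would break the body, which is the case that making the text JSON safe covers.

```python
import json

# Stand-in for the body typed into the API Connector, with a <text> placeholder.
TEMPLATE = '{"model": "tts-1", "input": "<text>", "voice": "onyx"}'


def fill_naively(text: str) -> str:
    """Drop the raw text straight into the template, with no escaping."""
    return TEMPLATE.replace("<text>", text)


def fill_safely(text: str) -> str:
    """Escape the text first; json.dumps adds outer quotes, so strip them."""
    return TEMPLATE.replace("<text>", json.dumps(text)[1:-1])


def is_valid_json(candidate: str) -> bool:
    try:
        json.loads(candidate)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json(fill_naively("To be or not to be: that is the question")))
print(is_valid_json(fill_naively('She said "hello"')))
print(is_valid_json(fill_safely('She said "hello"')))
```

So text with colons and commas goes through untouched, while anything a user might type with quote marks or line breaks is worth escaping before it reaches the body.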
Final Thoughts on OpenAI's Text to Speech
So there you go. That is how we can use text to speech in a Bubble app. And I think that OpenAI's text to speech is the best in the business. It literally only came out yesterday. If there are any other text to speech models that you think are better, please leave a comment down below. If you've got any questions, leave a comment down below because we read every single one and they inspire us to make even better videos in the future.