
How to record audio & convert to text - OpenAI Whisper API

In this Bubble tutorial we demonstrate how to use the Audio Recorder & Visualizer plugin to record someone's speech and then convert it into text (a transcript) with the OpenAI Whisper API. Get started with the Bubble API Connector and Whisper API here.

Thank you to everyone who left a comment on our last OpenAI Whisper API video. The recurring theme in the comment section was: can you show how to record audio in Bubble, send it over to the OpenAI Whisper API, get an AI-generated transcript back, and save that into your Bubble app?

How to record and save audio to Bubble

Well, that is exactly what this app that I'm going to show you does. So let's dive right into the page. We're using Bubble's own Audio Recorder & Visualizer. There are other plugins available in the plugin store to record audio, but this one is free. I would point out that it saves the audio in WAV format, which might mean that you end up generating slightly larger audio files than with an audio recorder that saves as MP3. But anyway, it works fine for this demonstration.

So I've got the audio recorder element on the page, and I've got two buttons below it. If we go into my start/stop workflow, we'll see that there is an action of "Start/stop Audio Recorder A". I then need a second action. I've got a save button, and its workflow uses an action that the plugin gives you: "Upload content of Audio Recorder".

This is referring to saving what has been recorded to your Bubble storage, which is part of AWS S3. Anyway, it means basically it's part of your app's storage.

I then want to be able to retrieve that at some point. So I have a data type called audio recording. It has a field called file, of type File, and I insert into that the result of step one. So that is my way of saving the file: the first action saves it into my Bubble app, and the second gives my database a way to retrieve that file. I then have below a repeating group that shows all of the entries for the audio recording data type. I print the audio recording's file URL. If I hop back to it here, you can see this is where the file is actually hosted, but do notice that it doesn't start with https:. We're going to need to add that in. And then lastly, I've got a button here that runs a workflow I've labeled get transcript, and that saves the text from the OpenAI Whisper API response into the audio recording listed in that row of the repeating group.
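The URL fix mentioned above can be sketched in a few lines of Python. This is only an illustration of the idea, not something you write inside Bubble, and it assumes the stored file URL is protocol-relative (it starts with //). The function name ensure_https and the example URL are made up for the sketch.

```python
def ensure_https(file_url: str) -> str:
    """Return a full https URL for a protocol-relative file URL."""
    # Assumption: the stored URL looks like //host/path, with no scheme.
    if file_url.startswith("//"):
        return "https:" + file_url
    # Anything that already has a scheme is returned unchanged.
    return file_url


# Hypothetical URL, for illustration only.
print(ensure_https("//example.com/recording.wav"))
# prints https://example.com/recording.wav
```

In Bubble the same thing is done by typing https: in front of the dynamic file URL expression.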

If you want help getting to this point, do check out our previous video. I'll put a link in the comments section, and that will show you everything you need to do in order to get this into the Bubble API Connector. But anyway, the end result of that is a workflow action where you put a dynamic link for the file. Notice that I've got https:, then the audio recording's file's URL. Believe me, I've just spent 15 to 20 minutes making sure that I get all the right formats in place. Effectively, for OpenAI Whisper, you need to provide them with a publicly accessible audio file or video file in one of these formats here.
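For readers who want to see the shape of that call outside Bubble, here is a rough Python sketch that only describes the request and does not send it. The endpoint and the whisper-1 model name are OpenAI's; the function name, the dictionary layout and the example values are assumptions made for illustration. It also assumes Bubble fetches the file behind the URL and attaches it; a direct HTTP client would upload the file contents in the file field instead.

```python
def build_transcription_request(file_url: str, api_key: str) -> dict:
    """Describe the Whisper transcription call without sending it."""
    # The file link must be a full, publicly accessible https URL.
    if not file_url.startswith("https://"):
        raise ValueError("file URL must start with https://")
    return {
        "method": "POST",
        "url": "https://api.openai.com/v1/audio/transcriptions",
        "headers": {"Authorization": f"Bearer {api_key}"},
        # Sent as multipart form data. In this sketch "file" holds the link;
        # the API itself expects the file contents in this field.
        "form_fields": {"model": "whisper-1", "file": file_url},
    }


# Hypothetical values, for illustration only.
request = build_transcription_request(
    "https://example.com/recording.wav", "YOUR_API_KEY"
)
print(request["url"])
```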

So that's what we're doing with this app. We take a recording, we save it. We create a database entry as well. And then I've added the button here to generate the transcript. So let's give it a test. So if I click Start, I can say I am testing the Open AI Whisper API, and then I can click Stop, and I can click Save, and Bubble is now saving it.

And it does take a little bit of time on this, I did notice. I think it's this last one here, but so I can be absolutely sure, I'm going to sort the repeating group by date created, descending. That way, I know it will be my top one.

Demo: Whisper API generate transcript

Now I click generate transcript, and this is the call to the Whisper API. There we go. So if I click Start, I can say I'm testing the OpenAI Whisper API. I think that's basically a perfect match for what I said.

Troubleshooting: Easy mistakes to make

So quick recap, the ways that you're... Well, I'm making an assumption there. The ways that I made a mistake setting this up in order to demonstrate it to you start with getting the file formatted correctly in here. Because it's a File type field in Bubble, I can say File URL. In fact, let me show you the data structure there. So I just have my audio recording, with file, of type File, and then transcript, of type Text. And then I save the response. From the OpenAI Whisper response I get two choices, and I choose Text. Then where else?
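As a rough picture of that data structure and the save step, here is a short Python sketch. It assumes the Whisper response body is JSON with a text field; the class name, the function name and the sample values are invented for illustration and are not part of Bubble.

```python
import json
from dataclasses import dataclass


@dataclass
class AudioRecording:
    """Mirrors the data type above: a file (URL) and a transcript (text)."""
    file: str
    transcript: str = ""


def save_transcript(recording: AudioRecording, response_body: str) -> AudioRecording:
    """Copy the text field of the Whisper response onto the recording."""
    # Assumption: the response body is JSON with a "text" key.
    recording.transcript = json.loads(response_body)["text"]
    return recording


# Hypothetical values, for illustration only.
recording = AudioRecording(file="https://example.com/recording.wav")
save_transcript(recording, '{"text": "I am testing the OpenAI Whisper API."}')
print(recording.transcript)
```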

Also, I initially tried to set this up all in one workflow, so basically, I would click Start, then Stop, then save it to my database, then send it straight off to the Whisper API. But I found that I kept getting an error back saying that the file I was providing to Whisper wasn't accessible or wasn't in the right format. So that's why I broke it out into this repeating group table. If I was to go on a hunch, I would say that sometimes my workflow was submitting the file to Whisper before the file was actually properly accessible. We're talking about fractions of a second here, but I think that Bubble might have been passing on the file URL just a little bit before Whisper could actually access it. And that was what was causing the error in Whisper. So by breaking it up into a save command and then a separate Generate Transcript command, I was able to work around that and get my transcripts back.
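If the hunch about timing is right, another way to think about the fix is "check that the file is reachable, and retry a few times, before sending it on". The sketch below shows that idea in Python. It is not what the video does (the video splits the work into two buttons), and the function name, parameters and defaults are assumptions. The check itself is passed in as a function so the sketch stays independent of any particular HTTP library.

```python
import time
from typing import Callable


def wait_until_accessible(
    is_accessible: Callable[[], bool],
    attempts: int = 5,
    delay_seconds: float = 0.5,
) -> bool:
    """Poll is_accessible until it returns True or attempts run out."""
    for attempt in range(attempts):
        if is_accessible():
            return True
        # Pause between tries, but not after the final one.
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False


# Simulated check: the file is not reachable on the first two tries.
results = iter([False, False, True])
ready = wait_until_accessible(lambda: next(results), delay_seconds=0)
print(ready)  # prints True
```

Only once this returns True would the file URL be sent on to Whisper; if it returns False, the workflow can show an error instead of making the call.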

If you have any questions about this process or anything to do with OpenAI or Bubble, please do leave a comment below. If you're really stuck, we provide Bubble coaching.

 
