Bubble with Speech to Text using AssemblyAI - Part 3


In the final part of this Bubble tutorial series using the AssemblyAI Speech to Text API we create a simple front end form for users to upload an audio file and receive a transcript.

Welcome to part 3 of our Bubble tutorial series looking at AssemblyAI and speech recognition speech to text using the AssemblyAI API.

So far in this series (check out the earlier two videos), we've looked at how to set up an API call to AssemblyAI and how to use webhooks to be notified when the transcript is ready and retrieve it from our Bubble app.

In this last tutorial, we're putting it all together by building a graphical user interface so that a user can upload a file and then get a transcript back.

Building a page and form

To begin with, I'm going to create a new page and call it "transcript". I'll add a file uploader, a button, and a text field, and I'm going to put the text field into a group because I'll be passing data into it. The group is going to be of type Message — a data type I set up for a previous video, which I'm reusing as a place to store the transcript when it arrives back. The group's data source is: Do a Search for Messages, ordered by Created Date descending, taking the first one.

That way I get the most recent result. Then my text field displays the parent group's Message's content. Again, these fields are left over from a previous demonstration of the ChatGPT GPT-4 API — the field names and structure don't really matter here; I just need somewhere in my database to save the text response that comes back. Finally, I'll label the button "Generate Transcript".

Workflows

So what's it going to do? Let's create a workflow. So first of all, I go to plugins and I have these options here. I have these options because in the previous video, I've set up the AssemblyAI API, and I've got two calls. I've got the transcribe audio file, which sends the audio file. I then wait for a webhook, and when I get that notification, I then go and retrieve the transcript from the AssemblyAI servers. And that's the second call here.

So my first call is to transcribe the audio file. I need to send the file that I upload — the File Uploader's value, and specifically its URL. I also need my webhook URL, and that webhook is set up in my backend workflows, here.
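Outside of Bubble's API Connector, this first call is just a POST to AssemblyAI's `/v2/transcript` endpoint with the file's URL and the webhook address. A minimal sketch in Python (the API key is a placeholder; Bubble fills in the two URLs for us):

```python
import json
from urllib import request

API_KEY = "YOUR_ASSEMBLYAI_API_KEY"  # placeholder — use your own key

def build_transcribe_request(audio_url, webhook_url):
    """Build the request the API Connector sends: the uploaded
    file's URL plus the webhook AssemblyAI should notify."""
    body = {"audio_url": audio_url, "webhook_url": webhook_url}
    headers = {"authorization": API_KEY, "content-type": "application/json"}
    return "https://api.assemblyai.com/v2/transcript", headers, body

def transcribe(audio_url, webhook_url):
    """Send the request. AssemblyAI responds immediately with a
    transcript id; the finished text arrives later via the webhook."""
    url, headers, body = build_transcribe_request(audio_url, webhook_url)
    req = request.Request(url, data=json.dumps(body).encode(), headers=headers)
    with request.urlopen(req) as resp:
        return json.load(resp)
```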

Webhooks and Bubble app versions

But I want to account for the fact that I've got a development and a live version of my app, and I want the webhook to work in both. So I'm going to replace the hard-coded address — in fact, first I'll copy it to the clipboard in case Bubble overrides it — then remove everything there and use This website's home URL instead. Let me quickly check in preview whether the home URL includes a slash at the end — yes, it does. So the home URL takes into account which version of the Bubble app is running, and the webhook address adapts automatically, whether it's the live version or the development version of my app.
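The same home-URL logic can be sketched as a tiny helper. Bubble backend workflow endpoints live under `api/1.1/wf/` on the app's domain, with the development version under `/version-test`; the endpoint name `assembly_webhook` here is just an example:

```python
def webhook_url(home_url, endpoint="api/1.1/wf/assembly_webhook"):
    """Join Bubble's home URL (which may or may not already end
    in "/") with the backend-workflow endpoint. The same code
    works for the live URL and the /version-test development URL,
    because the home URL already reflects the running version."""
    if not home_url.endswith("/"):
        home_url += "/"
    return home_url + endpoint
```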

Now, from experience, there's one other thing to do here: when Bubble expresses a file's URL, it doesn't include the protocol — the URL starts with a double slash rather than "https:". So I'm adding "https:" in front of the file's URL. With that done, I think we're at a good place to test. Let's go to preview, and I'm going to upload my audio file.
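That protocol fix is a one-liner if you ever need it outside Bubble — prepend "https:" when the URL is protocol-relative (the S3 path below is just an illustrative example of Bubble's file-storage URLs):

```python
def absolute_file_url(bubble_url):
    """Bubble expresses uploaded files as protocol-relative URLs
    (e.g. "//s3.amazonaws.com/..."); prepend "https:" so an
    external service like AssemblyAI can actually fetch them."""
    if bubble_url.startswith("//"):
        return "https:" + bubble_url
    return bubble_url
```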

It's the audio file I've been using to test this whole process, so before I do, I'm going to delete all the existing Messages from earlier tests. There we go. Now I'll upload the file again and click Generate Transcript.

So Bubble has now sent the audio file to AssemblyAI and is waiting to receive the webhook. When the webhook fires, it should create a new Message with the transcript as its content. We'll give it a few moments to see how that's going — and if it hadn't worked, we would debug it, but it has worked. There we go. Notice that there is a waiting period, but nothing is loading on the front end: it's all being taken care of by backend workflow processes. So you'll probably want to display some indication of progress — a spinner, a loading message, an estimate of how long it can take — because otherwise your users are likely to spam the button, and you'll end up with an oversized API bill because an impatient user has submitted their audio file ten times.

So I would suggest a workflow here which hides the button and shows the user a message about how long it's going to take, combined with some provision like mine — Do a Search for the latest Message — so the transcript appears as soon as it's actually created.

As a quick reminder, here's how that works. The notification is sent into this backend workflow from AssemblyAI, via the webhook URL that I sent out with the initial audio file. When that data arrives, I make another request to AssemblyAI to get the transcript using the transcript ID, and then I create the Message using the result of step one.
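The backend workflow's two steps — parse the webhook notification, then fetch the finished transcript by ID — can be sketched like this. The payload fields follow AssemblyAI's webhook body (a transcript id plus a status), and the GET goes to the real `/v2/transcript/{id}` endpoint; the API key is a placeholder:

```python
import json
from urllib import request

API_KEY = "YOUR_ASSEMBLYAI_API_KEY"  # placeholder — use your own key

def parse_webhook(payload):
    """Step one: AssemblyAI's webhook body carries the transcript
    id and status; pull both out of the incoming notification."""
    data = json.loads(payload) if isinstance(payload, str) else payload
    return data["transcript_id"], data["status"]

def fetch_transcript(transcript_id):
    """Step two: GET the finished transcript by its id. The "text"
    field is what gets saved as the Message's content."""
    req = request.Request(
        "https://api.assemblyai.com/v2/transcript/" + transcript_id,
        headers={"authorization": API_KEY},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["text"]
```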

There's no waiting around here because the transcript is already ready. This is one of the key differences between Whisper and AssemblyAI: AssemblyAI has this webhook feature. So although you might actually have to wait longer, you're not dependent on a connection being maintained or timing out — you simply retrieve the transcript once it is ready.
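For comparison, without a webhook you'd have to poll the transcript endpoint yourself until the status flips. A simplified sketch of that loop (the `get` function is injectable purely so the loop can be exercised without a network call; real use would hit AssemblyAI's `/v2/transcript/{id}` endpoint as above):

```python
import time

def poll_transcript(transcript_id, get, interval=0.0, max_tries=10):
    """Repeatedly fetch the transcript until its status is
    "completed" — the busy-waiting that webhooks let you avoid.
    `get` takes a transcript id and returns the response dict."""
    for _ in range(max_tries):
        data = get(transcript_id)
        if data["status"] == "completed":
            return data["text"]
        time.sleep(interval)  # back off between polls
    raise TimeoutError("transcript not ready after %d tries" % max_tries)
```

The webhook approach inverts this: instead of your app repeatedly asking "is it done yet?", AssemblyAI tells your backend workflow exactly once, when it is.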

So there we have it. If you have any questions, if you'd like to see other demonstrations of other services, other APIs, please do leave a comment.

We read every single one of them. We try and reply with a detailed explanation where we can. And yeah, see you in the next video.