How to add live web search data to the OpenAI API

Headshot of Matt the Planet No Code Bubble Coach

Need 1 to 1 help?

Your no-code consultant & Bubble tutor.

Learn how to harness the power of Bubble.io and the no-code movement to combine multiple APIs, including OpenAI, web scraping, and web search, to create a Chat GPT clone that can fetch live data from the web and provide accurate and up-to-date information on things to do in Paris in February 2024.

Introducing the Combined Power of Three APIs



I've got something really cool to show you because I've combined three different APIs. I've got the
, and I've got a web scraping API all working together, which means that my Chat GPT clone here, and that's a little bit rough, but it works like Chat GPT can fetch live data from the web and feed that into the conversation.


Improving AI-generated Responses with Live Web Data



And here's a really good example. I'm asking the question things to do in Paris in February 2024. Now, when you ask OpenAI that question through their API, you get back sort of a generalized set of responses. For example, general things that happen in February in Paris, not specifically what happened in Paris in February 2024. And that's because the AI's training data is not necessarily going to have the most up to date information with things to do about events.


Using Brave Search and Web Scraping to Fetch Live Data



So what I do with this when I ask the question is I feed my first message into the
. I then get a list of the first three websites, I pass those over to the page to API web scraper API. I extract the main body content of those pages and then I ask OpenAI to summarize that content. So therefore I'm getting the top three search results for things to do in Paris in February 2024. And I'm passing that data into OpenAI, and I'm going to give you a tour of how to do that.


Taking a Close Look at the Workflow



Staring at screen let me show you how I've got this working. So quick reminder, what happens when I click send? So I trigger the search API when I've got no messages in the conversation. I want to run the search only the first time, only with the first message that's sent, things to do in Paris in February 2024. For example, I then create a new message and this message is saying, write a summary of these web pages based on this statement.

And then I pass in the user's query again things to do in Paris in February 2024. And then I just list out the content of those pages. I'm going to show you how to do that in a moment. I also mark that as visible to user. No, because I'm effectively combining the two messages.

I've got the message that I send to OpenAI and I want OpenAI to be aware of because it's got all of this historical data, all of the important search result data in it. And then I've got create a new message and the next one is effectively the one that I'm displaying back to the user. Visible to user. Yes, because that's the one that they see. Let's just give an example of how this works really well because like the wall of love, I'm going to just assume that that is something that is event based.


Fetching Event-specific Data from Web Scraping



It's going to be there for a set amount of time. And so I know that my web scraping content that is part of this conversation has got a lot more data about the wall of love. So I can say tell me more about the wall of love.


Combined Power of APIs in Action



And so now I'm waiting on OpenAI to respond. Okay, and here we go. I get a response back and I'm just going to assume that this is all correct. I've not been to Paris, but this is based on the
that is in the conversation but it's hidden to the user. I then do the usual thing of sending through all the messages in the correct JSON formatting.


Explore Our Bubble and OpenAI Tutorials



If you're jumping into how to use bubble and OpenAI at this video, do go and check out our other ones because we do cover all of this effectively. What I'm going to be showing in this video is the combination of the three APIs. I then save the response, make it visible to user and assistant, and I reset the input fields.


Using Brave Search and Page to API



So let me just unpack what is going on. The search API is running this
that looks like this.


Web Scraping with Page to API



So I've got my API key in the header, I've got the right key name. I send over my query. This is the one I use to initialize the call, but it's dynamic. And as an example, this is what I get back. And I get back, let's go back to raw data. I get back all of this data about the website.


Fetching Website Content with Page to API



In the previous video where I've been working with the
, I just used the description field, but now I'm digging into the actual content of each page because I'm taking the URL that's returned from the top three results and I'm passing that into the page to API. One important thing to note with the page to API API is that the API key doesn't go in the header, it goes in the body.

So make sure you tick it as private because you don't want your users to access your API key. But what I'm doing is using their API and we go back to documentation. Yeah, it's all explained here and we've got previous examples of how to use the page to API if you've got any questions about that. Where am I going? Here we go.


Summarizing Web Pages with OpenAI



So I'm passing in my API key, my URL. I'm saying use a real browser that tends to get better results. And then I pass and I add in a set of instructions here. And this is not really a secret source, it's not perfect, but it does a good job when you combine it with how well the OpenAI API responds to rough bits of data. Because what I'm saying is create a response that contains something that I've labeled paragraphs.


Fetching Paragraphs with Page to API



And I'm saying go to the page, find all of the paragraphs and return them as text. So that's going to clean up any HTML around the paragraphs. I'm targeting paragraphs because paragraphs is one. It's not perfect, but it's one rough way of aiming at the important bits on the page. Some listicle article, ten things to do in Paris.

The actual content is going to be all in paragraphs. So that's what I'm doing and I'm getting a response back. And I'll give you an example because here's one of the search results so I can initialize the call.


Returning Scraped Paragraphs to OpenAI



Does take a moment because it's having to actually visit the page. And so I get a list of paragraphs in this JSON array, a list of all of the paragraphs on the page. So going back to my
, I could have done this, some clever iteration, backend workflow perhaps, but I'm just running three API calls to the page to API. And so I'm saying scrape the brave search results web item one's URL item two, item three. I'm then using the new customer vent return value. So I've just got site content one, site content two, site content three, and I'm passing the results into each one.


Utilizing OpenAI's GPT-3.516k Model



So this is just going to be a lot of text. There's no HTML mess in it, it's cleaned up, but it's just paragraph comma, paragraph comma, paragraph comma, and I'm returning that in. So then maybe this makes a bit more sense now which is to say, write a summary of these web pages based on the statement things to do in Paris in February 2024. And then I list all of the paragraphs from all three web pages, and that goes into OpenAI.


Expanding Conversational Capabilities with OpenAI API



I will point out that I had to switch into the GPT 3.516k model because the content of all the web pages combined with my prompt, it was about 8000 tokens. But yeah, then it goes into our pretty standard OpenAI API web API call. And yeah, I think we get a pretty cool result because we get that summary data and then we can ask specific questions about it.


Testing AI's Knowledge with Specific Questions



Let's try another one. Here we go. This one might not have come up if we just used the large language model data that is used to train the OpenAI API because it might only be very recent data that is informing us that there is a sports event on February the 6th, the half marathon judo tournament. So let's say, tell us more about the judo tournament. Okay, so this is showing the power of all of these APIs combined because it's so vague, but because it's inserted into this conversation with all of the previous messages, it's going to know I'm talking about the judo tournament in Paris. And really, I'm hoping it's going to know that I'm talking in particular about this one. So let's send.


Final Thoughts on Combined API Usage



Okay, so we are getting back some data. We can't tell whether this is informed by just OpenAI alone or whether it's informed by the content. I'm going to dare ask a question. Maybe this might undermine a slight bit everything that I've shown you so far in terms of how awesome it is, but let's just say what date is gets, again, something that I wouldn't expect the OpenAI API to know on its own. It hasn't got that data.

Okay, I'm going to still say this is pretty cool, and there's obviously some improvements that can be made along the way. But I've just wanted to show in a very rough sense how we can combine three really powerful APIs. We've got the OpenAI text generation with AI API, we've got