How SearchGPT could change the internet economy

Sahil Sinha, Mon Jul 29 2024 • searchgpt ads

Google started with a simple idea - let website builders focus on being as useful as they can, and let Google focus on evaluating how useful each website is. That led to the PageRank algorithm in the late 90s. In exchange for crawling through your website's data, Google directed invaluable traffic to your website. Websites could then monetize that traffic via ads, brand deals or purchases.

https://upload.wikimedia.org/wikipedia/commons/f/fb/PageRanks-Example.svg

SearchGPT, Perplexity and Gemini could totally upend this agreement

LLM-based search (in theory) breaks this agreement. Users get the answer to their questions without ever having to visit a new website.

Companies still get to index all of the open web’s data, empowering their models to answer any question their users might have. Except now, the websites no longer get traffic in exchange for their data. The websites themselves have become secondary to the answer itself: “relevant sources” listed underneath a concise summary answering the user’s question. They don’t get the same traffic they used to from traditional, index-based search.

Example SearchGPT query from OpenAI's announcement

In this new world, what incentives do publishers have to grant access to their data? Previously, they were at least getting discoverability and traffic, which they could then translate into revenue. However, if users are no longer visiting your website after reading your data, concisely summarized by your aggregator’s LLM of choice, why should publishers make that data accessible?

In short - deals and partnerships. Aggregators are going to need to start buying or trading for access to data. The Reddit OpenAI partnership deal is an interesting first try at this. OpenAI gets access to “enhanced Reddit content” to train its models, and OpenAI becomes a Reddit “advertising partner”. This data is then used to train a general purpose model.

(This leaves an open question regarding how Reddit users should feel about generating all this value effectively for free 🤔)

This means the quality of each GPT-powered search tool will no longer be only a function of who can most effectively index data on the open web, but also of who can source the best deals with valuable publishers.

What does all this mean for search? Still to be seen:

Do results get better now that websites can’t SEO their way out of poor information?
How do websites change now that they no longer have to cater to Google Search’s tyrannical SEO guidelines? Do we see more diversity in webpages, similar to web 1.0? (Personally I’d be in favor of a resurgence in Angelfire websites)
Do we see search going the way of online media, forgoing free, ad-supported products for subscriptions? Will consumers have to purchase multiple “search subscriptions”, the way we need to stack streaming services?

Shout out to OpenAds, a company building something in this new space!

👨‍💻

📣 If you are building with AI/LLMs, please check out our project lytix. It's an observability + evaluation platform for all the things that happen in your LLM.

📖 If you liked this, sign up for our newsletter below. You can check out our other posts here.