The Age of PageRank is Over

09 Nov, 2019

When Sergey Brin and Larry Page came up with the concept of PageRank in their seminal paper The Anatomy of a Large-Scale Hypertextual Web Search Engine (Sergey Brin and Lawrence Page, Stanford University, 1998) they profoundly changed the way we utilize the web. For the next 25 years, humanity counted on their algorithm to deliver relevant results for its searches.

PageRank rested on the idea that links from one website to another are most valuable as a signal when they are made on merit rather than out of commercial motivation. The web was still young, conceived to be a force for good, sharing, personal expression, and unifying the world. The algorithm was a huge success. Inspired by how citations were used to “rank” academic papers, pages with links from a greater number of other pages got a better “page rank,” which led to a fast and efficient way to produce the most relevant results for any query.
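For readers who want to see the citation-style idea in concrete terms, here is a minimal sketch of a PageRank-like computation in Python. It is an illustration under stated assumptions, not the production algorithm of any search engine: the tiny link graph, the function name pagerank, and the fixed iteration count are all made up for this example, and the damping value of 0.85 is the one suggested in the 1998 paper rather than anything discussed in this post.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by repeated redistribution of rank along links.

    links: dict mapping each page to the list of pages it links to
           (a hypothetical toy graph, not real web data).
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with rank spread evenly across all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Assumed toy graph: three pages link to "c", and "c" links to "a".
toy_links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph, "c" ends up ranked highest because the most pages link to it, and "a" comes second because its single incoming link is from the highly ranked "c". A link counts for more when the page making it is itself well regarded, which is why the scheme only works while links reflect merit.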

That was a fantastic breakthrough, but something started happening over the years. As some websites became more prominent thanks to their page rank (which was well deserved!), their publishers also realized they could monetize the traffic they started receiving. At the same time, search engines also discovered that ads are very lucrative.

This quickly led to ads becoming the dominant business model of the web. And the proliferation of ads brought another thing with it – a conflict of interest. Whether it is an ad-supported search engine or an ad-supported website, their users and customers suddenly have two different interests. Their user usually just wants to browse or search the web, while their customers try to sell things to that user.

Over the years, the web deteriorated to the state it is in now - a highly destructive force. Much of the damage is driven by the monetization of users and every aspect of their lives. Enterprises capture our preferences, our friends, our families, the information we consume, and the information we create. They manage and maximize for their benefit our preferences, our opinions, our purchases, and our relationships. The web can poison individual opinions, freedoms, and political and social institutions. It steals from us, addicts us, and harms us in many ways.

The websites driven by this business model became advertising and tracking-infested giants that will do whatever it takes to “engage” and monetize unsuspecting visitors. This includes algorithmic feeds, low-quality clickbait articles (which also contributed to the deterioration of journalism globally), stuffing the pages with as many ads and affiliate links as possible (to the detriment of the user experience and their own credibility), playing ads in videos every 45 seconds (to the detriment of generations of kids growing up watching these) and mining as much user data as possible.

Ads became a global “tax” on using the web, paid mostly by the most vulnerable, non-tech-savvy users.

And naturally, the quality of web search started to deteriorate.

The sad truth is that it was all predictable. In the same 1998 white paper, Mr. Brin and Mr. Page sharply criticized the ad-supported business model that other search engines used at the time (Appendix A: Advertising and Mixed Motives; emphasis mine):

“Currently, the predominant business model for commercial search engines is advertising. The goals of the advertising business model do not always correspond to providing quality search to users. For example, in our prototype search engine one of the top results for cellular phone is “The Effect of Cellular Phone Use Upon Driver Attention”, a study which explains in great detail the distractions and risk associated with conversing on a cell phone while driving. This search result came up first because of its high importance as judged by the PageRank algorithm, an approximation of citation importance on the web [Page, 98].

It is clear that a search engine which was taking money for showing cellular phone ads would have difficulty justifying the page that our system returned to its paying advertisers. For this type of reason and historical experience with other media [Bagdikian, 83], we expect that advertising funded search engines will be inherently biased towards the advertisers and away from the needs of the consumers. … Furthermore, advertising income often provides an incentive to provide poor quality search results. … In general, it could be argued from the consumer point of view that the better the search engine is, the fewer advertisements will be needed for the consumer to find what they want. This of course erodes the advertising supported business model of the existing search engines”

Yet, despite being acutely aware of the dangers of ad-supported search, selling ads was adopted as the primary business model of the new search venture just a few years later.

And the consequence: the potential of the greatest search technology the world ever saw, and of some of the most brilliant people in the world, became limited by a business model with an inherent conflict of interest built into it. The web changed, driven by the same relentless ad-supported monetization. The very algorithm - PageRank - broke because nobody links to or curates content anymore. If they do, it is mainly for commercial benefit, not based on merit, which was the essence of both the original web and the algorithm.

This led to the concentration of power, with the same 100 or so largest websites showing nowadays in almost all searches by mainstream search engines. It further exacerbates the problem as smaller sites and amateur blogs do not surface in search results for people to discover and link to. The primary purpose of the web today is now “engagement” - or to translate from product management speak - “How many ads we can push down users’ throats.”

Author and political scientist Ian Bremmer remarked, “The idea that we get our information as citizens through algorithms determined by the world’s largest advertising company is my definition of dystopia.”

The age of PageRank as the model for finding the best pages on the web is over, with the algorithm ending up being polluted and entirely dominated by ads.

Nowadays, when a user turns to an ad-supported search engine, they are bound to encounter noise and wrong or misleading websites in the search results, insulting their intelligence and wasting their brain cycles. The algorithms themselves are constantly fighting an internal battle between optimizing for ad revenue and optimizing for what the user wants. In most cases the former wins. Users are given results that keep them returning and searching for more instead of letting them go about their business as soon as possible.

This process produces self-reinforcing monopolies in almost every sphere of online life - search, news, entertainment, social media… All these monopolies have two things in common:

They are a product of advertising-based business models;
They are unhealthy for our digital society and an antithesis to what the internet was supposed to be - fun, quirky, and exciting. Instead, they attempt to control almost every aspect of our online life and culture.

And this is why we built Kagi. We felt a strong need to stop this madness and reverse the direction the web is heading in. The main reason Kagi exists is to offer a radically different view of the web, one close to its original intention and one in which the users and their needs are at the center of the universe.

The future of search is user-centric

Not only are we living increasingly busier lives that require access to timely and high-quality information, but as civilization gets more sophisticated, we are starting to realize that we should be careful about what information we let into our brains, just as we are careful about the food we put in our bodies.

In a world like this, there is very little room for ads and noise. Yet this is how the world has functioned for the last 25 years.

With the inevitable advancement of our civilization, it is reasonable to predict that most of humanity is in for a rude awakening from a world in which harmful agendas driven by misaligned incentives dominate our lives. The shock of realizing how important information really is, and how we are currently being treated, may feel similar to waking from a coma, like the one that controlled humanity in the movie The Matrix. We’ll look at the current situation in hindsight and wonder, “How did this all happen?”

In the future, it is likely that if the current mainstream search engines want to survive, they will have to go back to their roots, dismiss ads as their primary business model (as described by Mr. Page and Mr. Brin in their 1998 white paper), and start optimizing for what the user wants. This seismic shift is not a matter of if but when. If nothing else, it will be driven by the erosion of public trust in information served by companies using ad-supported business models.

Then, imagine a world in which companies use all their resources, technology, and human potential to create entirely user-centric products. This will drive innovation as yet unseen.

We will have search products with different capabilities. There will still probably be some “free” ones, ad-supported, which will not return very high-quality information and will optimize for ad revenue instead. They may even have the “for entertainment” label, as found on some “news” sites today.

But there will also be search companions with different abilities offered at different price points. Depending on your budget and tolerance, you will be able to buy beginner, intermediate, or expert search “companions”. They’ll come with character traits like tact and wit or certain pedigrees, interests, and even adjustable bias. You could customize your “helper” to be conservative or liberal, sweet or sassy!

In the future, instead of everyone sharing the same search engine, you’ll have your completely individual, personalized Mike or Julia or Jarvis - your own personal assistant. Instead of being scared to share information with it, you will choose what data you want it to have and volunteer your data only after knowing its incentives align with yours. The more you tell your assistant, the better it can help you, so when you ask it to recommend a good restaurant nearby, it’ll provide options based on what you like to eat and how far you want to drive. Ask it for a good coffee maker, and it’ll recommend choices within your budget from your favorite brands with only your best interests in mind. The search will be personal and contextual, and excitingly so!

The most sophisticated ones will be able to answer questions requiring them to digest pages of documents, even entire books or videos, to come up with a 200-word summary.

And yes, the non-zero price point will mean you have to budget it with your other costs. But faster access to higher quality information will make you much more competitive globally, so you can decide if the investment will be worth it, like any other purchase you make. This will in turn incentivize these products to be even better, a positive feedback loop driven by entirely aligned incentives.

This is a vision of the future that will finally allow the internet to reach its full potential as the amazing tool it could be rather than the exploitative one it is now.

I hope you join us on this journey.

[Update: We published the first version of this blog post in 2019, long before large language models were a thing. We do not believe that the current generation of LLMs will take us to this vision, due to their extreme limitations. What we are talking about here is perhaps decades away and is more like “Star Trek” or Apple’s 1987 “Knowledge Navigator” concept. The main point is that user-centric information retrieval, whatever it looks like in the future, is only possible with alignment of incentives achieved through a paid business model. Ads do not belong in search.]

Vladimir Prelovac
CEO, Kagi Inc.
