The Long Tail’s Impact on Search Relevance

Guest Author
Melek Pulatkonak
President - Hakia
Search for Meaning
Long tail discussions in search mainly revolve around advertising, in particular around ways of using tail keywords to bolster ad campaign performance. I would like to pose a different question today: What is the impact of the long tail on search relevance?
For starters, I would like to refresh your memory regarding long tail statistics in search. The long tail is estimated to account for over 95% of search volume. As far back as 2001 (see the Excite query distribution chart below), the area under the long tail comprised 97% of search query volume.
The long tail phenomenon persists, as recent statements from large search players confirm. A Google spokesperson stated that “20 to 25% of the queries we see today, we have never seen before” and Ask.com’s CEO, Jim Lanzone, was quoted as saying “On any given day, 60% of the search requests we get, we have never seen before” (“New search engine relies on power to find the best search results,” Associated Press, May 30, 2007). Jim gave a great presentation at the Web 2.0 conference in 2005, and I will share two data points from it with you:
o Long tail searches make up 95% of the queries
o Head query = 1.57 keywords; tail query = 5.01 keywords
So far, we have established three facts: 1) the area under the long tail comprises more than 95% of the search volume; 2) long tail queries are longer, unique, and complex; 3) the majority of queries are as unique and complex as we are: complicated creatures with different information needs and different uses of language.
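For readers who want to check figures like these against their own query logs, here is a minimal Python sketch. It is not how Excite, Google, or Ask.com computed their numbers; the function name, the choice of "head" as the top N most frequent queries, and the volume-weighted word count are all assumptions made for illustration.

```python
from collections import Counter


def head_tail_stats(query_log, head_size):
    """Split a query log into head and tail and summarize both.

    Assumption: the "head" is simply the `head_size` most frequent
    distinct queries, and everything else is the "tail".
    """
    counts = Counter(query_log)
    ranked = counts.most_common()  # (query, count) pairs, most frequent first
    head, tail = ranked[:head_size], ranked[head_size:]
    total = sum(counts.values())

    def volume(group):
        # Total number of searches issued for the queries in this group.
        return sum(n for _, n in group)

    def mean_words(group):
        # Average keywords per search, weighted by how often each query ran.
        vol = volume(group)
        if vol == 0:
            return 0.0
        return sum(len(q.split()) * n for q, n in group) / vol

    return {
        "tail_share": volume(tail) / total,
        "head_mean_words": mean_words(head),
        "tail_mean_words": mean_words(tail),
    }


if __name__ == "__main__":
    # Toy log, invented for illustration only.
    toy_log = (
        ["weather"] * 3
        + ["news"] * 2
        + [
            "cheap flights boston to istanbul in march",
            "why does my cat sleep all day",
        ]
    )
    print(head_tail_stats(toy_log, head_size=2))
```

On a real log the interesting part is how the tail share changes as you move the head cutoff; the cutoff itself is a judgment call, not something the statistics above define.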
Now, let’s move on to the relevancy discussion.
Today’s search retrieval technology relies on popularity systems in one form or another. The number of possible queries of three or more words that can be posed to a search engine is huge compared to the available statistical material (the number of link referrals). Thus, popularity-based engines can improve results for only a tiny fraction of long queries, and then only the most popular ones. Any long query on even a slightly unpopular topic will never attract enough votes to improve search relevance.
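A rough back-of-the-envelope sketch in Python shows the scale of this mismatch. Every number here is a hypothetical placeholder (the vocabulary size and the count of link referrals are not figures from any search engine), and counting every ordered word combination overstates the number of meaningful queries, so treat the result as an illustration of the trend rather than a measurement.

```python
def possible_queries(vocabulary_size, num_words):
    """Upper bound on distinct ordered queries of exactly `num_words` words."""
    return vocabulary_size ** num_words


def coverage_ratio(num_signals, vocabulary_size, num_words):
    """Popularity signals available per possible query of the given length."""
    return num_signals / possible_queries(vocabulary_size, num_words)


if __name__ == "__main__":
    VOCABULARY = 100_000       # assumed number of distinct query words
    LINK_REFERRALS = 10 ** 12  # assumed number of link-based "votes"

    for length in range(1, 6):
        ratio = coverage_ratio(LINK_REFERRALS, VOCABULARY, length)
        print(f"{length}-word queries: {ratio:.0e} signals per possible query")
```

Under these assumptions the votes per possible query drop by a factor of the vocabulary size with every added word, which is why the long queries run out of popularity data so quickly.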
It sounds like a conflict, doesn’t it? The long tail is a defining phenomenon in search: long tail queries make up almost the full universe of search volume. Yet today’s popularity-based search systems cannot improve relevance in the long tail due to a design-imposed constraint. This is exactly what we observe today: current search engines satisfy most people for their most common, and often shortest, queries most of the time. 50% of the searches go unanswered.
The answer to the question, “What is the impact of the long tail on search relevance?” is simple: To improve relevancy, try a new approach that moves away from popularity systems.
It is exciting to see so many young companies taking up the challenge to build for the future.

December 20th, 2007 at 3:53 pm
Melek:
Great post! I agree - major search engines have long trained users to provide tiny scraps of data (keyword-ese!) to find the information they are looking for. However, since the search engines cannot yet read minds, the best they can do for results is revert to the mean, which means that Long Tail results get swamped by the most popular results.
Having said that, I’m not sure what you are proposing in terms of a solution to this problem. You say: “To improve relevancy, try a new approach that moves away from popularity systems.” Could you be more specific? For example, how would Hakia help to improve results relevance for long-tail queries?
December 20th, 2007 at 9:38 pm
Hi Nitin,
Thank you for the comment & question.
Semantic search can help solve the problem by understanding the query and Web text. We will not be able to replicate the human mind but understanding natural language will be a first step in that direction.
hakia’s goal is to bring more relevant results in both the head and the long tail. As we progress further in development, the performance difference will become more visible. Keep checking in with us at hakia.com.
December 23rd, 2007 at 3:08 am
[...] Original article: The Long Tail’s Impact on Search Relevance [...]