Creating the Unofficial Homepage for every Topic

Our Guest Author today is Anand Rajaraman – Co-Founder, Kosmix
One of the most significant developments on the Web over the past three years has been the rise of Wikipedia. Wikipedia has come from nowhere to become a Top 10 site on the Web. Its monthly page views are measured in the billions. A significant proportion of people who search on Google or other search engines end up on Wikipedia.
Why is Wikipedia so popular? Because it is a great place to get an overview of a topic.
When I’m exploring a topic, the first stop is usually Wikipedia, to get an overview of the facts. But the Web today has way more than facts. It has opinions, community, video, audio, and images. It has reviews, ratings, products, forums, widgets, and gadgets. Each of these information types complements the factual information on a topic and figures heavily in our everyday decision-making.
The Kosmix Topic Homepage is the natural analogue of the Wikipedia page in this new world. The homepage provides a bird’s-eye view of a topic: it presents relevant Web pages (organized into useful buckets), videos, images, community, and widgets in a layout that makes it easy for anyone to explore a topic.
Let’s look at the Kosmix topic page for Wrist Pain. At the top is a Wrist Pain Guide – authoritative factual medical data, licensed from A.D.A.M. The page also includes videos, images, community (Yahoo Answers, RightHealth Forums), Web pages organized into useful buckets (Trusted Sources, Advanced Reading), and an Explore section that leads to semantically related topics – symptoms, diseases, alternative therapies, drugs, and medical procedures. There’s even a nifty Body Location Disease Search widget from VisualDxHealth.
In other words, the Topic Homepage is a giant smart mashup of content from lots of different sources, presented in a graphical, intuitive manner. The technical challenge in creating a topic homepage is very different from Web search. In search, the problem is an abundance of data on any keyword, so the real challenge is in ranking. By contrast, our real challenges are in bucketing information into useful “bins,” as well as surfacing modules, such as widgets and other tools, where we don’t have a lot of text to work with.
For example, let’s look at the Topic Page for Atrial Fibrillation. This page includes two widgets: one for Target Heart Rate and one for BMI. How do we determine that these widgets are relevant to Atrial Fibrillation? The secret sauce is our categorization algorithms. We match topics with modules in the “category space” so we can surface semantically related information. Atrial Fibrillation and Target Heart Rate are related because both are related to Heart Disease. Since our categorization algorithms are completely automated and don’t require human intervention, they enable us to scale as Web content expands.
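To make the idea of matching in “category space” concrete, here is a minimal sketch in Python. It is not Kosmix’s actual algorithm, which isn’t described in detail here. It assumes that every topic and every module has already been assigned weighted categories, and it uses cosine similarity as a stand-in for whatever relatedness measure is really used. The category names, weights, threshold, and function names are all made up for illustration.

```python
from math import sqrt

# Hypothetical category weights. In a real system these would come from an
# automated categorizer; the numbers here are invented for illustration.
TOPIC_CATEGORIES = {
    "Atrial Fibrillation": {"Heart Disease": 0.9, "Arrhythmia": 0.8},
}

MODULE_CATEGORIES = {
    "Target Heart Rate widget": {"Heart Disease": 0.7, "Exercise": 0.6},
    "BMI widget": {"Heart Disease": 0.4, "Obesity": 0.8},
    "Currency Converter widget": {"Travel": 0.9, "Finance": 0.5},
}


def cosine(a, b):
    """Cosine similarity between two sparse category-weight vectors."""
    shared = set(a) & set(b)
    dot = sum(a[c] * b[c] for c in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def relevant_modules(topic_vec, modules, threshold=0.2):
    """Return (module, score) pairs at or above the threshold, best first.

    The threshold is an arbitrary assumption, not a tuned value.
    """
    scored = [(name, cosine(topic_vec, vec)) for name, vec in modules.items()]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    topic = TOPIC_CATEGORIES["Atrial Fibrillation"]
    for name, score in relevant_modules(topic, MODULE_CATEGORIES):
        print(f"{name}: {score:.2f}")
```

With these made-up weights, the two health widgets are surfaced because they share the Heart Disease category with the topic, while the travel widget scores zero and is dropped. Note that the widgets themselves contain almost no text; the match happens entirely through their categories.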
In order to prove the value of the Topic Homepage idea, as well as validate our algorithms, we launched our flagship site, RightHealth, which is a collection of homepages for health topics. RightHealth has been a huge success, attracting over 2.5 million unique visitors and servicing close to 10 million search queries a month. We are following up with the beta launch of RightTrips and RightAutos, focusing on the travel and automotive categories.
Where do we go next? We are working furiously on taking the technology behind topic homepages and making it completely horizontal, so that we can create unofficial homepages for every topic, from Ahmadinejad to Zanzibar. The challenge is to create a page that puts a lot of information at people’s fingertips without being overwhelming. Not an easy problem, but one we believe is solvable and worth solving.











October 4th, 2007 at 5:45 pm
[...] Creating the Unofficial Homepage for every Topic » This Summary is from an article posted at Alt Search Engines on Thursday, October 04, 2007 [...]