If Microsoft is ever to climb out of third place in Web search, behind Google and Yahoo, it is going to have to do something different. One of the cards it plans to play is investment in improved search technologies, and while some of those were on display when Windows Live Search launched last week, it looks like there’s another one up its sleeve: faceted search. Paula J. Hane at Information Today:
Last week, Microsoft announced a major upgrade to the new search engine it has been testing since March. It has moved its Windows Live Search and Live.com out of beta status and said that Live Search will power the search capability on MSN, the company’s news and entertainment portal. A new feature is the Related Search function, which is designed to help users refine a query by simply clicking on a list of related terms. The unusually low-key and minimalist press announcement generated little excitement. After some poking around, Information Today, Inc. learned from search expert Stephen E. Arnold that Microsoft has even more potent technology ready to deploy.
Unlike the upgrade to Live.com, which, according to a Microsoft spokesperson, just uses algorithms that mine previously submitted queries to the engine, the new and unannounced search system brings faceted search to a Microsoft application. Try it yourself at http://rwsm.directtaps.net. The Microsoft project, called Search Results Clustering (SRC), currently offers a search beta and downloadable toolbar.
What Microsoft is doing is called text mining. This is jargon for discovering people, places, things, and other facts from text. These facts are then organized so a user can point and click on a category and see the related information. The approach is the secret sauce for such companies as Exalead in Paris and Endeca in Boston.
Arnold, who is the author of Enterprise Search Report, 3rd edition, and the forthcoming Text Mining Report, said: “If Microsoft makes this function part of SharePoint, it will pose a serious threat to companies offering SharePoint-specific search enhancements and be a strong competitive challenge to Google and its Appliance and OneBox API. If Microsoft puts this technology in Live.com, that service will almost certainly see an increase in traffic. Microsoft had to do something, and this Vivisimo-like clustering may be one of Microsoft’s most significant advances yet.”
There’s much more if you follow the link. The project comes from Microsoft Research Asia’s Search Technology Center, which was established in October 2005 and is apparently yet another Microsoft organization working on search technology.
As for faceted search itself, the basic idea is to provide not only search results for the term the user entered, but also various “facets” of that term. For example, if the user searched for “boots,” the facets might be “fashion,” “western,” or “men’s,” each of which further segments the search space. The user could then click on the facet of interest, which would in turn offer more facets.
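To make the drill-down idea concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the tiny `RESULTS` list, its hand-assigned facet labels, and the helper names `facet_counts` and `drill_down` are assumptions, not anything taken from Microsoft's system.

```python
from collections import Counter

# Hypothetical results for the query "boots"; facet labels are assigned by hand.
RESULTS = [
    {"title": "Classic cowboy boots", "facets": {"western", "men's"}},
    {"title": "Suede ankle boots", "facets": {"fashion"}},
    {"title": "Roper boots for men", "facets": {"western", "men's"}},
    {"title": "Knee-high runway boots", "facets": {"fashion"}},
    {"title": "Leather dress boots", "facets": {"fashion", "men's"}},
]


def facet_counts(results, selected=frozenset()):
    """Count the facets a user could still click, skipping ones already chosen."""
    counts = Counter()
    for result in results:
        counts.update(result["facets"] - set(selected))
    return counts


def drill_down(results, facet):
    """Keep only the results that carry the clicked facet."""
    return [result for result in results if facet in result["facets"]]


if __name__ == "__main__":
    print(facet_counts(RESULTS))                   # facets offered for "boots"
    western = drill_down(RESULTS, "western")       # user clicks "western"
    print([r["title"] for r in western])
    print(facet_counts(western, {"western"}))      # facets left to refine further
```

Each click narrows the result list, and the facet counts are recomputed over whatever remains, which is what lets the user keep segmenting the search space.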
The real trick, of course, is to discover meaningful facets for arbitrary search terms, and the Microsoft project does this via on-the-fly cluster analysis of the results returned for the original search term. Hit the link in the quote above and kick the tires for yourself.
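For a rough feel of what deriving facets on the fly from the results themselves can look like, here is a toy sketch in Python. It is not Microsoft's algorithm, which the article does not describe. It swaps in a crude frequent-term grouping: terms that show up in several result snippets, but not all of them, become facet labels. The function name `discover_facets`, the stopword list, and the sample snippets are all assumptions made for the example.

```python
import re
from collections import Counter

# A deliberately tiny stopword list, just enough for the sample snippets.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def discover_facets(query, snippets, max_facets=3):
    """Group snippets under terms that recur across some, but not all, results."""
    query_terms = set(query.lower().split())
    # Reduce each snippet to its set of distinct, non-trivial terms.
    doc_terms = [
        set(re.findall(r"[a-z]+", snippet.lower())) - STOPWORDS - query_terms
        for snippet in snippets
    ]
    # Document frequency: in how many snippets does each term occur?
    df = Counter(term for terms in doc_terms for term in terms)
    total = len(snippets)
    candidates = [(term, n) for term, n in df.items() if 2 <= n < total]
    # Most widely shared terms first; alphabetical order breaks ties.
    candidates.sort(key=lambda pair: (-pair[1], pair[0]))
    labels = [term for term, _ in candidates[:max_facets]]
    return {
        label: [s for s, terms in zip(snippets, doc_terms) if label in terms]
        for label in labels
    }


if __name__ == "__main__":
    snippets = [
        "Western boots for riding",
        "Western boots in leather",
        "Fashion boots for winter",
        "Fashion boots with leather trim",
        "Hiking boots",
    ]
    for facet, members in discover_facets("boots", snippets).items():
        print(facet, "->", members)
```

A snippet can land under more than one label, and one that shares no recurring term with the others lands under none. A real clustering engine would need much better phrase extraction and ranking than this, but the shape of the problem is the same: the facets come out of the result set rather than from a predefined taxonomy.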