Powerset, Natural Language Search Engine

"I think there's a ton of challenges, because in my view, search is in its infancy, and we're just getting started. I think the most pressing, immediate need as far as the search interface is to break paradigm of the expectation of "You give us a keyword, and we give you 10 URL's". I think we need to get into richer, more diverse ways you're able to express their query, be it though natural language, or voice, or even contextually. I'm always intrigued by what the Google desktop sidebar is doing, by looking at your context, or what Gmail does, where by looking at your context, it actually produces relevant webpages, ads and things like that. So essentially, a context based search."
(Marissa Mayer, VP at Google)

The New York Times reports that Powerset, a start-up focused on search, has licensed natural language technology from the famous Palo Alto Research Center (PARC). Its purpose: to "build a search engine that could some day rival Google".

Unlike keyword-based search engines like Google, Powerset wants to let users type questions in natural language. To do that, it is developing a system that recognizes and represents both the implicit and the explicit meaning in a text.

The problem is that even if Powerset has great algorithms for understanding the meaning of a query (and there are no fool-proof algorithms for that), building a search engine requires huge infrastructure and processing power. Fernando Pereira, an expert in natural language from the University of Pennsylvania, even questions whether PARC's NLP technology is a good approach for search: "The question of whether this technology is adequate to any application, whether search or anything else, is an empirical question that has to be tested".

Besides, Google's own attempts at delivering answers show that it's hard to give a single relevant answer for most queries, which are ambiguous by default. Google prefers to apply statistical algorithms to its huge corpus rather than rely on grammar rules. Peter Norvig, director of research at Google, says: "I have always believed (well, at least for the past 15 years) that the way to get better understanding of text is through statistics rather than through hand-crafted grammars and lexicons. The statistical approach is cheaper, faster, more robust, easier to internationalize, and so far more effective." Google uses statistics for machine translation, question answering, spell checking and more.
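To make the statistical idea concrete, here is a minimal sketch of spell checking driven by word counts instead of grammar rules or a hand-built lexicon. It is not Google's implementation: the tiny `CORPUS` string stands in for a web-scale collection, and the names `WORD_COUNTS`, `edits1` and `correct` are made up for this illustration.

```python
from collections import Counter
import re

# Toy corpus standing in for a web-scale text collection (assumption).
CORPUS = """
search engines rank pages for a query . a search engine answers a query
with pages . natural language search needs a large index of pages .
"""

# The only "knowledge" the corrector has: how often each word was seen.
WORD_COUNTS = Counter(re.findall(r"[a-z]+", CORPUS.lower()))
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """Return all strings one delete, swap, replace or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    swaps = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in ALPHABET]
    inserts = [a + c + b for a, b in splits for c in ALPHABET]
    return set(deletes + swaps + replaces + inserts)


def correct(word):
    """Return the most frequent known word within one edit, else the input."""
    word = word.lower()
    if word in WORD_COUNTS:
        return word
    candidates = [w for w in edits1(word) if w in WORD_COUNTS]
    if not candidates:
        return word
    # Break count ties alphabetically so the result is deterministic.
    return max(candidates, key=lambda w: (WORD_COUNTS[w], w))


if __name__ == "__main__":
    for typo in ["serch", "qeury", "pagez", "xyzzy"]:
        print(typo, "->", correct(typo))
```

Nothing in the sketch knows English spelling rules. Swapping in a larger or different-language corpus changes its behavior without any code changes, which is the kind of robustness and easy internationalization Norvig describes.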

People tend to be lazy and type queries that average 2-3 words. Such short queries wouldn't help a natural language search engine much, so it would have to ask more in-depth questions about your query. For a lot of queries (e.g., navigational queries like "snap"), you'd spend more time refining the ambiguous query. Google, by contrast, tries to balance the top results and puts the most important pages first.

Powerset might be launched at the end of the year. Hakia, another search engine that uses NLP, is already available, but its results don't look promising.
