Building Semantic Search One Acquisition at a Time

I have been mulling over some thoughts today while reading about Google’s huge investment in cloud infrastructure. What are they building? I have my theories, so allow me to expand on them here. Word came out today that ‘The Goog’ may have had ulterior motives in its recent purchase of ITA. Hidden deep within the bowels of ITA was a small database outfit out of Boston called Needlebase. Their expertise, to quote the site, is “a revolutionary platform for acquiring, integrating, cleansing, analyzing and publishing data on the web”. Stripped of the marketing doublespeak, they are in the data organizing business: they provide the tools to make ordered data out of chaos. The GigaOM story by Liz Gannes describes it as a way of giving structure to disorganized and constantly changing data sets on just about any topic; think real-time web indexing.

Semantic search is the holy grail of machine learning when it comes to information processing and categorization. For those not too familiar with semantic data on the web, think of it as metadata that gives discrete pieces of information meaning, and even relationships to one another. Having this makes information-processing agents more intelligent in how they handle the task of parsing data and traversing the links within it. Search engines are much smarter now than they have ever been, but implementing a semantic layer has proven difficult. And just as the likes of Yahoo, Google and Microsoft started figuring it out, the web evolved into a real-time social mess that was increasingly hard to crawl. Add to this the walled garden of Google’s biggest competitor, Facebook (and similar data silos), and you get a sense that Mountain View sees a growing threat to its search hegemony.

This explains its purchase of Metaweb, whose business is semantic tagging and organizing data. It also doesn’t hurt that they have quite a huge database of approximately 12 million topics, or entities. Expect this to grow rapidly as Metaweb and Needlebase join forces to tackle this problem with Google’s engine. And whatever happened to Aardvark, the Quora-like natural language Q&A engine that was acquired in February? Wouldn’t this be part of their natural language search strategy? You have to read these guys’ approach to social search. It is mind-blowing what they propose. The proposal was inspired by the Google founders’ revolutionary paper that marked the beginning of the link economy. They wish to do the same thing with people instead of web spiders.

In all the hoopla about what Google must do to compete in social, we sometimes forget (and I do too) that they are a search company. The fact that they grace us with all these cool tools (I am typing this in Google Docs) is a direct result of search and the advertising model it enables. These are all good moves by Google. They need to get better at the core competency of the company: search. Fusing all these things together (the vertical search play with ITA, Metaweb, Aardvark and increased investment in cloud infrastructure) will make for a compelling evolution of their search product.
