
Yahoo BOSS vs Lucene

When I worked at Yahoo!, we got to use BOSS for the search functionality on the property I worked on. BOSS stands for “Build your Own Search Service.” So far it seems to be used mainly by SEO people and a few companies interested in leveraging the data in Yahoo’s/Microsoft’s database, for example as a discovery method for existing content to which they then apply their own search algorithms.

I was charged with the task of leveraging an existing web search service for which we could control the index. Basically, our own search engine. When I was at Yahoo!, that’s exactly what BOSS allowed us to do. We had a special interface through which we could add indexes and control weights/metrics on different fields to prioritize search criteria.
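To make the idea of per-field weights concrete, here is a minimal sketch in plain Python. It is not the BOSS interface and not Lucene's API; the field names, weights, and helper functions are all made up for illustration, and the scoring is a bare term count rather than a real relevance formula.

```python
from collections import Counter

# Hypothetical per-field weights; names and numbers are illustrative only.
FIELD_WEIGHTS = {"title": 3.0, "tags": 2.0, "body": 1.0}


def tokenize(text):
    """Lowercase and split on whitespace; a real analyzer does much more."""
    return text.lower().split()


def score(doc, query, weights=FIELD_WEIGHTS):
    """Sum query-term counts per field, scaled by that field's weight."""
    terms = tokenize(query)
    total = 0.0
    for field, weight in weights.items():
        counts = Counter(tokenize(doc.get(field, "")))
        total += weight * sum(counts[t] for t in terms)
    return total


def search(docs, query, weights=FIELD_WEIGHTS):
    """Return (score, doc id) pairs for matching docs, best first."""
    scored = [(score(d, query, weights), d["id"]) for d in docs]
    return sorted((s for s in scored if s[0] > 0), reverse=True)


# Toy documents, invented for this example.
docs = [
    {"id": "a", "title": "Lucene basics", "tags": "search",
     "body": "Indexing text with an open source library."},
    {"id": "b", "title": "Cooking notes", "tags": "food",
     "body": "Nothing about lucene here except this one mention of lucene twice."},
]

results = search(docs, "lucene")
flat = search(docs, "lucene", {"title": 1.0, "body": 1.0})
```

With the default weights, document "a" ranks first because its single title match outweighs the two body matches in "b". With the flat weights passed in the last line, the order flips. That is the kind of control I mean by weighting fields to prioritize search criteria.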

Unfortunately I don’t have that luxury anymore, as the public offering of BOSS seems to cover mostly Web, News, Images, Ads, and Spelling.

So I am setting up a demo using Lucene, yet another wonderful open source contribution from the amazing team over at the Apache Software Foundation. I just downloaded it, and I am about to start playing with it.

Hopefully I will be fine with Yahoo BOSS or Lucene/Solr. I am asking the Yahoo people for access to that interface. I don’t think they will provide it, but Lucene should be just as good. We will just need to maintain some infrastructure for it.

There is also the Google Search Appliance, but you have to get on the phone just to try it, and I really don’t want to call Google just to try it out.