2012-11-09 Full-text searching

In a previous project, I solved the full text search problem as follows.

Management overview: it's very easy to do full text searching but very hard to assign a weight to the results. MySQL supports full-text search but only in a degraded type of database (MyISAM instead of the full-featured InnoDB). The problem can be fixed by creating an "archive" table in this degraded MyISAM format, where you store a business ID, plus a summary of the data you're looking for (i.e. a concatenation of business name, address, reviews, locations etc). The MySQL full-text search facility then becomes available, and you get results as such:
(score, businessID, summary)
(0.99, 12, "Clear Cafe A beautiful restaurant that serves raw, vegan and seafood cuisine www.clearcafeubud.com Restaurants Jl. Hanoman, Ubud, Bali 80571, Indonesia 03618894437 Ubud Cheap and good food Bart I've been there with a 10+ people group and they were happy to accomodate apparently they also blah blah")

As you can see, the summary is just a concatenation of name, address, category, reviews, phone number, everything. There is lots of duplicate information in there, but that doesn't matter for the results.

Tech details: we were doing scientific measurements and these were stored in a bunch of InnoDB tables. You can mix table types InnoDB and MyISAM in a database schema. I created a separate table called 'fulltexts' in the database, with basically two fields: "measurement ID", and "summary". Any human-readable fields from any of the tables were concatenated into a summary. I created a database row-level trigger for a couple of tables, so that each insert or update would concatenate a buch of fields and do an insert into 'fulltexts' (or a replace, if the summary needed updating). Then, this table can be searched by the MySQL facility "full text search", and give you back a result with a relevancy. This is quite nifty and fast.