
The hidden cost of fuzzy matching

January 27, 2010 | General Info

It has frustrated me for a while now that the data matching debate hasn’t moved on from fuzzy matching, which is essentially guesswork.

The problem as I see it is that fuzzy matching is, by its very nature, an imperfect science: the more sophisticated (and complex) it gets, the more of its search results have to be checked manually. That checking is very time-consuming and adds a lot of unseen expense to the cost of matching and cleaning data.

Academics around the world are looking for a technical solution to a problem that the human brain can manage easily. The answer for me is simple: human logic. Human logic is not programmed, it’s not invented and it’s not technical. It’s a simple transfer of knowledge.

I would rather see the industry as a whole move towards an intelligent approach and focus on intelligence-led matching.

What do I mean?

Well, let’s take company names as an example. For my money, fuzzy matching adds little value here.

What’s more important is being able to determine that GSK is an acronym for GlaxoSmithKline, that BBC stands for British Broadcasting Corporation and that UPS is United Parcel Service. These are just a few examples, but they are prevalent throughout the business world.
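To make the point concrete, here is a minimal sketch in Python. The alias table, the function names and the use of `difflib` as a stand-in for "fuzzy matching" are all my own assumptions for illustration; this is not how any particular matching product works. It shows a plain string-similarity score finding little in common between "GSK" and "GlaxoSmithKline", while a small table of known facts links them directly.

```python
from difflib import SequenceMatcher

# Hypothetical alias table: in practice this knowledge would come from
# curated reference data, not from three hard-coded entries.
COMPANY_ALIASES = {
    "gsk": "GlaxoSmithKline",
    "bbc": "British Broadcasting Corporation",
    "ups": "United Parcel Service",
}


def canonical_company(name: str) -> str:
    """Return the full company name if the input is a known acronym."""
    cleaned = name.strip()
    return COMPANY_ALIASES.get(cleaned.lower(), cleaned)


def same_company(a: str, b: str) -> bool:
    """Compare two names after expanding any known acronyms."""
    return canonical_company(a).lower() == canonical_company(b).lower()


def string_similarity(a: str, b: str) -> float:
    """Generic character-based similarity (a stand-in for fuzzy matching)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


if __name__ == "__main__":
    # The similarity score is low because the strings share few characters.
    print(string_similarity("GSK", "GlaxoSmithKline"))
    # The knowledge-based comparison links the two names.
    print(same_company("GSK", "GlaxoSmithKline"))
```

The lookup only knows what it has been told, so the quality of the match depends entirely on the quality of the reference data behind it.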

It’s not just company names; what about people’s names? The same problem exists: Dick is short for Richard, Larry is a nickname for Lawrence and Bob is short for Robert. Bill often stands for William, Liz for Elisabeth, Millie for Amelia. I could go on…
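The same idea can be sketched for first names. The table below holds only the examples from this post, and the names `NICKNAMES`, `name_variants` and `could_be_same_first_name` are hypothetical. Each nickname maps to a set of formal names rather than a single one, because a nickname "often", not always, stands for a given name.

```python
# Hypothetical nickname table built from the examples in this post.
NICKNAMES = {
    "dick": {"richard"},
    "larry": {"lawrence"},
    "bob": {"robert"},
    "bill": {"william"},
    "liz": {"elisabeth"},
    "millie": {"amelia"},
}


def name_variants(first_name: str) -> set:
    """Return the name itself plus any formal names it may stand for."""
    key = first_name.strip().lower()
    return NICKNAMES.get(key, set()) | {key}


def could_be_same_first_name(a: str, b: str) -> bool:
    """True if the two first names share at least one possible variant."""
    return bool(name_variants(a) & name_variants(b))


if __name__ == "__main__":
    print(could_be_same_first_name("Dick", "Richard"))
    print(could_be_same_first_name("Bob", "Bill"))
```

A result of `True` here means "worth treating as a candidate match", not proof that two records describe the same person.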

It’s amazing that one of the cornerstones of fuzzy matching is phonetics, most commonly Soundex and various variations on the theme. Believe it or not, Soundex was actually invented back in the early 1900s (http://en.wikipedia.org/wiki/Soundex). We are now well into the 21st century! Let’s start leveraging the technology of today, not re-inventing the past.
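For readers who have not seen it, here is a short sketch of the commonly described American Soundex rules, written from the published description rather than taken from any particular product. It groups names that sound alike, such as Robert and Rupert, but it has no way of knowing that Dick and Richard are the same name, because that is knowledge, not phonetics.

```python
# Consonant groups and their Soundex digits. Letters not listed here
# (vowels, y, h, w) carry no digit of their own.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """Return the four-character Soundex code for a name."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
        if len(result) == 4:
            break
    return result.ljust(4, "0")


if __name__ == "__main__":
    for name in ("Robert", "Rupert", "Dick", "Richard"):
        print(name, soundex(name))
```

Robert and Rupert both come out as R163, while Dick and Richard receive different codes and would not be linked by Soundex alone.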

I will post more on this in the future, but feel free to ping me with your thoughts.

 


copyright ©2008-2012 Match2Lists Ltd