Posts tagged collect
Analyzing Match Scores in ‘fuzzy searches’ – explaining the shades of gray in a non-binary world
Sep 20th
In a previous article – Not all has to be black and white in SQL Queries – , I discuss how not all searches are about perfect matches. The search for your lifelong partner, or your next job are fine examples of searches where a perfect match is hard to obtain. No single person or no single job will score one 100% on all your criteria. Unfortunately, such binary searches is exactly what SQL is good at. WHERE ALL CRITERIA ARE MET is the filter SQL applies to the records inspected. Of course you can use OR operators to leave some room for non-perfect matches and instead of using the equality (=) you can go for between or LIKE. But a structured approach to ‘fuzzy’ searches where the shades of gray most searches inevitably end up in is not readily available.
The prequel to this articles takes a few first steps in illustrating how scores against different criteria can be collected, how different weight factors for search criteria can be taken into account and how the first two search-match-patterns (number range, discrete values) started to emerge. One major question we ended that article with: how can we explain when the search is performed and the match scores for each record is known, how the score was arrived at. What criteria did the record meet – to some extent- and which ones did it fluke completely.
This article demonstrates a slightly different approach to the fuzzy search challenge and shows a way to not only calculate the scores but also construct a detailed analysis of the composition of the score.
Read the rest of this entry »


