Writing a Word Search puzzle solver in SQL

Brace yourself for yet another pretty useless demonstration of the unexpected wonders of SQL as well as my apparent lack of meaningful things to do with my life.

Undoubtedly, you are familiar with Word Search puzzles, the type of puzzle that present a matrix with a mash of letters in which specific words have been hidden. For example:

Writing a Word Search puzzle solver in SQL wordSearch0

This Word Search contains the names of several countries (such as italy, holland, poland, spain, france, japan, togo, peru). If you look closely, you will find them:

Writing a Word Search puzzle solver in SQL wordSearch0Solution

In this article, I will show you – and I know this is something you have been waiting for – how a SQL Query can solve these puzzles for you:

Writing a Word Search puzzle solver in SQL wordSearchSQL0

....

The first in solving our Word Search puzzles is expressing them in a format we can handle in a SQL query. Of course we can store them in database tables, but we can take a short-cut, using a Collection Type and the Table Operator.

We define the type WORD_LIST

create type word_list is table of varchar2(500)/

and use it to instantiate in situ in our query a collection of strings, each representing one row in the Word Search Matrix:

with matrix as( select rownum seq  ,      lower(column_value)  txt  from   table ( word_list( 'FYYHNRD'                          , 'RLJCINU'                          , 'AAWAAHR'                          , 'NTKLPNE'                          , 'CILFSAP'                          , 'EOGOTPN'                          , 'HPOLAND'                          )               ))

The above in-line view is the initial section our Word Search Solving Query. The result of this query is:

Writing a Word Search puzzle solver in SQL

The basic algoritm of the Word Search Solver is like this:

  • iterate over all cells in the matrix (all letters in the diagram)
  • from each cell, start a search in all eight directions: up and down, left and right and four diagonal combinations of up & down and left & right
  • for each all eight searches, check if one of the looked for words is found; if so, report the word, the cell where it starts and the direction in which it was found

The key to implementing this algoritm is an hierarchical query – as we shall see in a moment.

For the matrix in line view demonstrated above, our next step is an in-line view to visit all individual cells (letters) in the matrix. This is relatively simple:

, cells as  ( select distinct         seq, col  ,      substr(txt, col, 1) cell  from   matrix         cross join         ( select rownum col           from   matrix         )  )

Going forward, we have to explore eight directions. Directions are defined by two parameters that specify the movement along the horizontal and the vertical axis. The parameters can be -1, 0 or 1 – meaning [left, no horizontal movement, right] and [up, no vertical movement, down] respectively. Note that they cannot both be 0, as that would mean no exploration at all. These directions are generated in another in-line view in a somewhat contrived way; here is definitely room for improving the query.

, direction as  ( select distinct           sign(rownum - 2) dir    from   ( select count(*)             from   dual             group             by cube (1,2)           )    ), directions as   ( select h.dir hor    ,      v.dir ver    ,      case h.dir           when 1 then 'left'           when -1 then 'right'           end horizontal    ,      case v.dir           when -1           then 'down'           when 1           then 'up'            end vertical    from   direction h           inner join           direction v    on     ( not (h.dir=0 and v.dir=0) )  )

However, it does the job. Note that the directions in-line view returns all eight routes, both expressed in -1/0/1 notation as well as human-readable format.

The final in-line view we use is the one that provides the list of words to search for. In the example, we are looking for the names of countries. We inject these names in the following way into our query:

, words as   ( select column_value word    from   table( word_list('italy', 'holland' , 'poland' ,'france'                           ,'canada', 'japan','india','china','togo','peru'                           )                 )   )

Here of course we use the same ‘trick’ as before of generating rows on the fly by dynamically, inside the query, instantiating a collection and accessing it through the TABLE operator.

Now that all preparations have been made – the inline views that prepared all data used in the query (and note that we do not query any data at all from any database table or view) – we can start looking for answers.

The real heart of the query is this piece:

from   ( select c1.*         ,      hor         ,      ver         ,      horizontal         ,      vertical         from   cells c1                cross join                directions        )        cross join         wordsstart with cell = substr(words.word,1,1)connect by prior col-hor  =  col            and            prior seq-ver = seq            and            prior hor = hor            and            prior ver = ver           and           level < length(words.word)+1           and           cell = substr(words.word, level,1)

Here we join cells with directions – giving us one row to process for each cell for each of the eight directions. Then we join with words, to start an exploration from each cell in each of the eight directions for all of the search words. Alternatively, we could perhaps find all paths from all cells and try to join those paths to the search words. I leave that as an exercise to the reader.. (as lazy authors tend to write in their books).

The heart of the matter is next. We only start exploring from a cell if it contains the first letter of the current search word, why else bother. Then we start an hierarchical query, going from one cell to the next depending on the current direction, as specified by hor and ver (for horizontal and vertical movement). Suppose we are at cell row 4, column 3 and hor and ver are 1 (to the left) and -1 (going down). The hierarchical condition prior col-hor = col states that for the next cell in the hierarchical path its col value should be equal to the prior col value (which is 3) minus hor (which is 1) – so the next cell’s col value should be 2. Following similar reasoning, the next cell’s seq (or row) value should be 4 – (-1) = 5. So the next cell in the hierarchical path is row 5, column 2 – indeed going down and to the left.

The next two conditions ensure that we continue in the same direction from each subsequent cell in the hierarchical path. The last two conditions ensure that we do not keep exploring an obviously fruitless path: if the we have traversed more levels than the search word has characters we can stop the search and if the current cell does not contain the character it should for the current search word, we too can quit.

Adding a thin additional layer on top of this query, in order to filter all rows that did not find a search word, the final Word Search query becomes the following:

with matrix as( select rownum seq  ,      lower(column_value)  txt  from   table ( word_list( 'FYYHNRD'                          , 'RLJCINU'                          , 'AAWAAHR'                          , 'NTKLPNE'                          , 'CILFSAP'                          , 'EOGOTPN'                          , 'HPOLANDI'                          )               )), cells as  ( select distinct         seq, col  ,      substr(txt, col, 1) cell  from   matrix         cross join         ( select rownum col           from   matrix         )  ), direction as  ( select distinct           sign(rownum - 2) dir    from   ( select count(*)             from   dual             group             by cube (1,2)           )    ), directions as   ( select h.dir hor    ,      v.dir ver    ,      case h.dir           when 1 then 'left'           when -1 then 'right'           end horizontal    ,      case v.dir           when -1           then 'down'           when 1           then 'up'            end vertical    from   direction h           inner join           direction v    on     ( not (h.dir=0 and v.dir=0) )  ), words as   ( select column_value word    from   table( word_list('italy', 'holland' , 'poland' ,'france'                           ,'canada', 'japan','india','china','togo','peru'                           )                 )   )select seq + ver * (length(word)-1)  "Starting at row",      col + hor * (length(word)-1)  " and Column ",      horizontal  "Going",      vertical    " ",      cast(word as varchar2(25))        "you will find the word"from   ( select distinct                cell         ,      seq         ,      col         ,      hor         ,      ver         ,      sys_connect_by_path(cell,'*') p         ,      word                  ,      horizontal                  ,      vertical         from   ( select c1.*                  ,      hor                  ,      ver                  ,      horizontal                  ,      vertical                  from   cells c1                         cross join                         directions                )                cross join                 words         start with cell = substr(words.word,1,1)         connect by prior col-hor  =  col                     and                     prior seq-ver = seq                     and                     prior hor = hor                     and                     prior ver = ver                    and                    level < length(words.word)+1                    and                    cell = substr(words.word, level,1)        )where  translate(p,'Q*','Q') = word/

Let’s try a somewhat more challenging puzzle as well:

Writing a Word Search puzzle solver in SQL wordSearch1

The solution – searching for terminology very much part of our daily life at AMIS – looks like this:

Writing a Word Search puzzle solver in SQL wordSearchSolution1

The query outlined above returns with:

Writing a Word Search puzzle solver in SQL wordSearchSolutionSQL1

The correct answer – after four minutes. It is clear that the search time increases rapidly with a larger set of search words and especially a larger matrix.

Resources

Download the queries demonstrated above: wordSearchQueries.txt.

Tags:,

11 Comments

  1. Patrick Barel December 20, 2006
  2. Zeger Hendrikse December 11, 2006
  3. Lucas Jellema December 9, 2006
  4. anton December 9, 2006
  5. anton December 9, 2006
  6. Lucas Jellema December 8, 2006
  7. anton December 8, 2006
  8. Marco Gralike December 8, 2006
  9. Jeff Kemp December 8, 2006
  10. Jasper December 7, 2006
  11. Patrick Sinke December 7, 2006