Solving the Klotski puzzle in Scala

https://upload.wikimedia.org/wikipedia/commons/8/84/Hakoiri3.jpg

Klotski puzzle, Wikimedia

A very good description of the Klotski puzzle is available on programming website Hackerrank. A brief description is given here for convenience’s sake.

What is Klotski

Klotski is a puzzle consisting in a rectangular board containing labelled blocks. These blocks can slide in the four directions up/down/left/right, but they can neither overlap, nor exit the board.

One of these blocks is called the target. Objective is to find the shortest sequence of moves (path) allowing to place the target block in a given location. Any equally short sequence is deemed equivalent.

In counting the moves, subsequent slides of the same block are all counted as one move.

Why is Klotski a challenging programming problem?

Solving the problem requires the ability to reduce unnecessary intermediate configurations which would otherwise lead to a fast explosion of cases to be examined. Although the problem remains intrinsically hard, it is possible to implement optimised algorithms able to solve puzzles requiring n < 200 moves. I tried to find mathematically rigorous proofs of the complexity of known Klotski solver algorithms, but I could not find one. Or at least, none was proved.

The required optimisations to tackle puzzles of the complexity explained above are a very good learning opportunity for programmers who want to learn algorithms, functional programming, recursion, hashing, and many more skills. I have chosen Scala to solve the problem because it allows to write code without side effects, which is particularly suited to teaching good programming practices.

Let’s start from the data structures

I will introduce some basic terms. The picture below represents a Klotski board with five blocks (A, B, C, D, and E). Empty cells are denoted with “.” characters.

Labels need not be single letters, they can be arbitrary strings. If labels have length L > 1, empty cells are denoted by sequences of L “.” characters. For example, the board depicted below is equivalent (isomorphic) to the board above.

We will start from the representation of a Cell.

 case class Cell(y: Int, x: Int) {
    def applyOffset(o: Offset) = Cell(y + o.dy, x + o.dx)
    override def toString = "(%d,%d)".format(y, x)
  }

Cell is a case class and this allows us to write expressions like Cell(2, 3) without having to explicitly invoke a constructor. I have overridden the “toString” method bacause I want an object like Cell(2, 3) to be rendered as “(2, 3)” instead of the default “Cell(2, 3)”. Additionally, I have defined method “applyOffset” which not surprisingly applies an offset to a Cell.

Remark

Please note that Scala is (also) a functional language and the philosophy is to have immutable objects. Therefore, applying an offset to a Cell creates a new Cell, leaving the original one unchanged.

Similarly, we define case class Offset.

  case class Offset(dy: Int, dx: Int) {
    def invert = Offset(-dy, -dx)
  }

This is a trivial class which I will not comment further. A slightly more interesting class is Board.

  case class Block(symbol: String, cells: Set[Cell]) {
    private def minmaxXY = {
      val xx = cells.map { case Cell(_, x) => x }
      val yy = cells.map { case Cell(y, _) => y }
      (xx.min, xx.max, yy.min, yy.max)
    }

    val (minX, maxX, minY, maxY) = minmaxXY
    val uL = Cell(minY, minX)
    val bR = Cell(maxY, maxX)
    val width = maxX - minX + 1
    val height = maxY - minY + 1
  }

A Block is modelled as a set of cells, labelled with a symbol.

Method “minmaxXY” extracts the maximum and minimum X and Y coordinates from the cells. Val xx is built by extracting only the x coordinate from each Cell. The same applies to val yy, mutatis mutandis. An important derived attribute, which we will use to denote the position of a block, is its upper left corner, “uL”.

Remark

It is important to notice that the “uL” cell need not be inside the block itself. For example, let us consider L-shaped block A below. Its upper left cell is Cell(0, 0) which does not belong to the block Block(“A”, Set(Cell(0, 1), Cell(1, 1), Cell(2, 0), Cell(2, 1))

  0 1 2 3 
0 . A . .
1 . A . .
2 A A . .
3 . . . .

We will now model the concept of a move: when a block slides from a certain position to another position, we will log these bits of information:

Symbol
Upper left cell of the starting position
Upper left cell of the ending position

Let us make an example. In the board below we will slide block A:

  0 1 2 3        0 1 2 3
0 . A . .      0 . . . A
1 . A . .  =>  1 . . . A
2 A A . .      2 . . A A
3 . . . .      3 . . . .

The move will be represented with Move(“A”, Cell(0, 0), Cell(0, 2).

case class Move(block: String, from: Cell, to: Cell) {
    def offset = Offset(to.y - from.y, to.x - from.x)
  }

Case class Move has method “offset” which allows to derive the offset applied to Cell “from” to get to Cell “to”.

Building a Board

A board is given in input as a text. The format is illustrated with an example:

5 4
. A B B
. A B B
A A C .
D D C .
D . . E

On the first row we have the number of rows and the number of columns (5 and 4 respectively). Every cell is represented by its symbol and cells are spaced with a blank. We are now ready to define case class Board.

case class Board(blocks: Map[String, Block], block:String, maxX: Int, maxY: Int) {
    val minX = 0
    val minY = 0
    val symbols = blocks.keys
    ...
}

A Board is a map of symbols pointing to their respective block. The special block is saved in property “block”. maxY and maxX have the obvious meaning, but remember that the count starts from 0. In order to create a Board starting from the text input we will create a builder method in the companion object. Remember that in Scala every class can have a companion object with the same name, which is a singleton. See Scala’s documentation on http://www.scala-lang.org for more details.

  object Board {
    def build(b: Vector[Vector[String]], block:String) = {
      val maxY = b.size - 1
      val maxX = b(0).size - 1
      val eb = "." * b(0)(0).length //empty block
      val flatb = for { i <- 0 to maxY
                        j <- 0 to maxX
                        if b(i)(j) != eb
      } yield (b(i)(j), Cell(i,j))

      Board(flatb.groupBy(_._1).map {
        case (y, zz) => (y, Block(y, zz.map(_._2).toSet))
      }, block, maxX, maxY)
    }
  }

“maxY” is set to the number of rows minus 1 (0-referenced). Likewise, “maxX” is set to the number of columns minus 1. Constant “eb” is built by concatenating as many times the “.” character as the length of the symbols in this particular board. The “for” loop goes through the Vector and for every Cell with a symbol different than the empty cell, yields a tuple containing the symbol and the Cell where the symbol was found. The result of this scan is saved in value “flatb”. If we take our example above (copied here for convenience’s sake), this will be the result:

. A B B
. A B B
A A C .
D D C .
D . . E

val flatb = Seq[("A", Cell(0, 1)), ("B", Cell(0, 2)), Cell("B", Cell(0, 3)),
                ("A", Cell(1, 1)), ("B", Cell(1, 2)), Cell("B", Cell(1, 3)),
                ("A", Cell(2, 0)), ("A", Cell(2, 1)), Cell("C", Cell(2, 2)),
                ("D", Cell(3, 0)), ("D", Cell(3, 1)), Cell("C", Cell(3, 2)),
                ("D", Cell(4, 0)), ("E", Cell(4, 3))]

What do we need to do to get from the above to the format required to build a block? Function “groupBy” is the answer. See how it works.

flatb.groupBy(_._1).map {
  case (y, zz) => (y, Block(y, zz.map(_._2).toSet))
}

Let’s do it step by step.

flatb.groupBy(_._1)= Map[("A", IndexedSeq[("A", Cell(0, 1)), 
                                          ("A", Cell(1, 1)), 
                                          ("A", Cell(2, 0)), 
                                          ("A", Cell(2, 1))]),
                         ("B", IndexedSeq[("B", Cell(0, 2)), 
                                          ("B", Cell(0, 3)), 
                                          ("B", Cell(1, 2)), 
                                          ("B", Cell(1, 3))]),
                         ("C", IndexedSeq[("C", Cell(2, 2)), 
                                          ("C", Cell(3, 2))]),
                         ("D", IndexedSeq[("D", Cell(3, 0)), 
                                          ("D", Cell(3, 1)), 
                                          ("D", Cell(4, 0))]),
                         ("E", IndexedSeq[("E", Cell(4, 3))])

The subsequent invocations of the “map” method do the following:

Create a Block based on each symbol and the sequence of Cells of which the block is composed
Convert sequences of Cells into sets of Cells

Performing a move

Performing a move means to slide a block in one of the four admissible directions up/down/left/right. However, some moves are invalid:

Moves taking a block out of the board
Moves making a block overlap with other blocks

We will now implement the “perform” method which will demonstrate how efficient is the chosen data structure for the definition of the required logic.

 case class Board(blocks: Map[String, Block], block:String, maxX: Int, maxY: Int) {
    val minX = 0
    val minY = 0
    val symbols = blocks.keys

    def perform(s: String, o: Offset) = {
      val b = blocks(s)
      val cells = b.cells.map(c => c.applyOffset(o))
      val outOfBound = cells.find { case Cell(y, x) => y < 0 || y > maxY || x < 0 || x > maxX }

      outOfBound match {
        case Some(_) => None
        case None => val otherBlocks = blocks - s
          val overlap = otherBlocks.values.find(_.cells.intersect(cells) != Set.empty)
          overlap match {
            case Some(_) => None
            case None => Some(Board(blocks + (s -> Block(s, cells)), block, maxX, maxY))
          }
      }
    }
  (...)
 }

Given a Board b, performing a move requires the specification of which block we want to move and in which direction. The block is denoted by its symbol. The direction of the move is denoted by the equivalent offset. The four cardinal directions are denoted by the following offsets:

One move to the left  => Offset( 0, -1)
One move to the right => Offset( 0,  1)
One move upwards      => Offset(-1,  0)
One move downwards    => Offset( 1,  0)

Lets us now consider our example again. Let “brd” be the board depicted below.

. A B B
. A B B
A A C .
D D C .
D . . E

brd.perform(“C”, Offset(0, 1))

will generate this new board:

. A B B
. A B B
A A . C
D D . C
D . . E

Line 8 of code snipped above applies the offset to every cell belonging to the moving block. Line 9 checks whether or not cells exist which after the move go out of the board. In the affirmative case, the result of the move is Option.None (if you do not know what is Option in Scala, please refer to http://www.scala-lang.org).

The second and last check is whether or not the cells of the moved block overlap with the cells of any other block. Please note that the overlap check is elegantly implemented with the Set.intersect method. If the move succeeds, the perform method returns Some(Block).

For the moment we will leave the implementation of class Board. Later in this paper I will come back to it because we will see a magic one-liner which will make the difference between failing and succeeding the demanding performance requirements tested by http://www.hackerrank.com. This single one liner has costed me sleepless nights 🙂

Resist the temptation to look ahead, we have a lot more funny stuff before then.

Laying the ground for the algorithmic approach

I will use this paper to teach you how to write recursive code in Scala, which is an integral part of the functional programming approach. Please note that although recursion has a bad reputation in imperative languages because it consumes precious resources like the stack, the Scala compiler can generate very efficient code, virtually as efficient as a while cycle, on condition that recursion is “tail recursion”, that is, the statement invoking the same method is the very last statement of the method. If you learn to think in terms of recursion, you will be able to write clean code without side effects which uses primarily unmodifiable data. One of the key data structures every Scala programmer must master is List. We will now show how to use lists as a foundational block in our functional approach to solving Klotski. But before then, we need to model the internal states of the Klotski solver. We will now consider the following example.

3 3
A . B
C . B
C D .
A
2 2

The example above has the same syntax we have already described, except for the last two rows, which specify the target: move block “A” to position Cell(2, 2) with the lowest number of moves.

For each configuration our algorithm will do the following:

For every symbol, move it around in every possible direction until no more valid moves are possible, avoiding duplicate states (if a move gets to a board which was already generated, the move is discarded)
When moving the special block, check if the board is a solution to the problem
Add every new board to the history
Add all new board generated during this step to the set of boards which have to be explored at the next iteration.

Proposition

The first board generated which is a solution to the puzzle, is the shortest path to the solution.

Since we are exploring the space of solutions stepwise (every step increases the number of moves of one unit), the proposition is trivially true.

Remark

Above we observed that at every step we must add the new boards generated to the set of boards which have to be explored at the next iteration. However, this is not enough. We have a gap in our data model in that, in order to describe the solution, we also need to save the path to each board (the sequence of moves which brings us from the initial board to the desired state in the minimum number of steps. We will call this object a Trail and we will define it as a Board and the Path which has generated it starting from the initial position. Before giving the details, I will show an excerpt of the internal states of our solver. Numbered boards are our “history”. At each iteration, for every symbol all possible moves are tried, avoiding duplicates. This will proceed recursively, as we will see.

1 Move
                                          
A . B       Move(A,(0,0),(0,1))    . A B
C . B  (0) --------------------->  C . B  (1)
C D .                              C D .

A . B       Move(A,(0,0),(1,1))    . . B
C . B  (0) --------------------->  C A B  (2)
C D .                              C D .

A . B       Move(B,(0,2),(0,1))    A B .
C . B  (0) --------------------->  C B .  (3)
C D .                              C D .

A . B       Move(B,(0,2),(1,2))    A . .
C . B  (0) --------------------->  C . B  (4)
C D .                              C D B

A . B       Move(D,(2,1),(1,1))    A . B
C . B  (0) --------------------->  C D B  (5)
C D .                              C . .

A . B       Move(D,(2,1),(0,1))    A D B
C . B  (0) --------------------->  C . B  (6)
C D .                              C . .

A . B       Move(D,(2,1),(2,2))    A . B
C . B  (0) --------------------->  C . B  (7)
C D .                              C . D

2 Moves

. A B       Move(B,(0,2),(1,2))    . A .
C . B  (1) --------------------->  C . B  (8)
C D .                              C D B

. A B       Move(C,(1,0),(0,0))    C A B
C . B  (1) --------------------->  C . B  (9)
C D .                              . D .

. A B       Move(D,(2,1),(1,1))    . A B
C . B  (1) --------------------->  C D B  (10)
C D .                              C . .

. A B       Move(B,(2,1),(2,2))    . A B
C . B  (1) --------------------->  C . B  (11)
C D .                              C . D

(..............................................)

Solution:

A . B   Move(A,(0,0),(1,1))   . . B
C . B --------------------->  C A B
C D .                         C D .

. . B   Move(C,(1,0),(0,0))   C . B
C A B --------------------->  C A B
C D .                         . D .

C . B   Move(D,(2,1),(2,0))   C . B
C A B --------------------->  C A B
. D .                         D . .

C . B   Move(A,(1,1),(2,2))   C . B
C A B --------------------->  C . B
D . .                         D . A

Let us now define the Trail and other useful types.

    case class Trail(board: Board, path: Path)
    type Trails = List[Trail]
    type Offsets = List[Offset]

The Klotski solver algorithm

Based on what we have seen above, the solver is a breadth-first visit of the solution space which explores each individual board generating all the admissible moves symbol by symbol, discarding duplicate states. We can visualize it as a tree whose nodes are Trails (a Board with its Path).

This is the tree of internal states of the solver. Every level down is one more move in the path.

A very handy representation of a breadth-first approach is a list where every subsequent Trail has path with length equal or greater than the one before. In our recursive algorithm it will be built by appending new Trails to the existing list of Boards to be explored.

This is a list representation of the breadth-first visit of the tree. Lists come handy in Functional Programming because they are recursive data structures.

Time has come to read some code.

  def solve(brd:Board, block:String, target:Cell):Path = {
    val symbols = (brd.symbols.toSet - block).toList
    val hist = scala.collection.mutable.Set[Int]()
    val offsets = List(Offset(-1, 0), Offset(1, 0), Offset(0, -1), Offset(0, 1))
    case class Trail(board: Board, path: Path)
    type Trails = List[Trail]
    type Offsets = List[Offset]
 
    (...)

    def _solve(trails: Trails): Path = trails match {
      case t :: tx => val (_trails, _path) =  if (t.path == Nil || 
        t.path.head.block != block) pursueTarget(List(t))
      else (Nil, Nil)
        if (_path != Nil) _path
        else {
          val _symbols =  if (t.path == Nil) symbols
          else (symbols.toSet - t.path.head.block).toList
          val newTrails = checkOneTrail(t, _symbols)
          _solve(tx ++ newTrails ++ _trails)
        }
      case Nil => Nil
    }


    if (isSolved(block, target, brd)) Nil
    else {
      hist.add(brd.hashCode)
      val trails = List(Trail(brd, Nil))
      if (!symbols.isEmpty) _solve(trails) else pursueTarget(trails)._2
    }
  }

The algorithm is driven by the “solve” method, which is invoked passing the Board, the special block symbol, and the target Cell. The “solve” method checks if by chance the initial Board is already a solution (border case). Otherwise it adds the “hashCode” of the Board to the “hist” Set.

The history of explored Boards

I initially implemented the “hist” Set using a more “functional” immutable Set passed as a parameter to “_solve” and all other methods. In this case, however, given the fast increasing size I then consciously decided to compromise on the purity of the implementation to optimise the performance. Since the “hist” Set is only visible inside method “solve”, the risk of undesired side-effects is well controlled. This is an example of how first principles can be consciously traded off if one knows what he/her is doing.

But this is not all. A second design decision is to historise hash codes of the Boards, instead of the Boards themselves. The rationale is again performance (www.hackerrank.com requirements in this regard are strict). It is a lot more efficient to compare integers than to compare entire boards and their blocks, symbols and cells. Of course, the probability of a clash is small when using good hashing algorithms, but its not null (for the very nature of hashes, which generate values in a constrained range from unconstrained inputs). This is the trade-off. It works in practice, but it is not a 100% mathematically sound approach. So, what would happen in case of a clash? A board would be “found” in the history, which was never before really generated and in the worst scenario this could lead to a sub-optimal solution. This never occurred in any of the 26+ http://www.hackerrank.com test cases or even in my own ones.

The hash function

This function is a very critical part of the algorithm: getting it wrong screws up everything else (I am saying it for a reason). So, what’s special about the hash function of a Board and why does not the default “hashCode” function work well enough? The default hashCode implementation in Scala is actually pretty good. But when we run the algorithm using the default implementation we have an exponential explosion of states. I will now illustrate why and how to fix it. Let us consider the following Klotski puzzle.

5 4
A B C .
A D E .
F F G G
F . G .
H I L L
A
3 3

The following Boards all have distinct hashCodes, because they are different. But, are they any different in terms of solving the puzzle? No, they are not, because they have the same exact morphology the only distinction being that same-shaped blocks have different labels, which is not a differentiator for the search of the optimal path.

A B C .   A C B .   A B C .   A B C .   A B C .
A D E .   A D E .   A E D .   A D E .   A D E .
F F G G   F F G G   F F G G   G G F F   F F G G
F . G .   F . G .   F . G .   G . F .   F . G .
H I L L   H I L L   H I L L   H I L L   I H L L

I could list many more equivalent permutations of this Board, but by now you should get the point. So, our hash function will need to generate the same code for all the equivalent Boards, so that they are not unnecessarily explored slowing down the search to unacceptable levels. Or, if you prefer, so that the space of possible solutions is limited to truly relevant configurations. This is how the correct “hashCode” method looks like.

override def hashCode = (blocks(block).hashCode :: (blocks - block).values.map (b => b.cells.hashCode).toList.sorted).hashCode

The equivalent blocks are made to generate the same hash by removing their symbol from the computation. The symbol of the special block is instead maintained because where the special block is located is a differentiating factor. This hashCode function induces an equivalence relation on the set of generated Boards. Reducing the search to the set of equivalence classes makes the complexity of the search manageable with problems of the size defined above (but remains inefficient for problems of arbitrary size).

Coming back to the algorithm, at the core of the “solve” method there is this logic:

A trail is explored to see if it is possible to move the special block to the target cell. In the affirmative case the algorithm stops and returns the path.
Otherwise, the new configurations are added to the history and are appended to the set of Boards which have to be further explored.
The trail is processed by scanning all possible moves of all symbols in order to prepare the boards to be explored at the next iteration. These new configurations are appended to the list of trails to be explored. This sequence is guaranteed to produce a list equivalent to a breadth-first tree traversal (see above).
Some further optimisations are applied. For example, if the last move in the path of a given trail involved symbol s, the algorithm will not try to move block s because all possible moves were already tried in the step before.

The below is the implementation of all the other recursive parts of the algorithm, which should be self-explanatory after all the clarifications above.

  def solve(brd:Board, block:String, target:Cell, maxmoves: Int = 250):Path = {
    val symbols = (brd.symbols.toSet - block).toList

    val offsets = List(Offset(-1, 0), Offset(1, 0), Offset(0, -1), Offset(0, 1))
    case class Trail(board: Board, path: Path)
    type Trails = List[Trail]
    type Offsets = List[Offset]


    def moveBlockAroundOnce(trail: Trail, s: String, offsets: Offsets, hist: Set[Board]=Set.empty, acc: Trails = Nil): Trails = offsets match {
      case o :: ox => trail.board.perform(s, o) match {
        case Some(newBoard) => if (hist.contains(newBoard))
          moveBlockAroundOnce(trail, s, ox, hist, acc)
        else
          moveBlockAroundOnce(trail, s, ox, hist+newBoard, Trail(newBoard,
            Move(s, trail.board.blocks(s).uL, newBoard.blocks(s).uL) :: trail.path)::acc)

        case None => moveBlockAroundOnce(trail, s, ox, hist, acc)
      }
      case Nil => acc
    }

    def moveBlockAround(trails: Trails, s: String, hist: Set[Board]=Set.empty, acc: Trails = Nil): Trails = trails match {
      case t :: tx => val _trails = moveBlockAroundOnce(t, s, offsets, hist + t.board)
        moveBlockAround(tx ++ _trails, s, hist.union(_trails.map(t=>t.board).toSet), acc ++ _trails)
      case Nil => acc
    }

    def sanitize(trails:Trails, acc: Trails = Nil): Trails = trails match {
      case t::tx => val hashcd = t.board.hashCode
        if (hist.contains(hashcd)) sanitize(tx, acc)
        else {
          hist.add(hashcd)
          sanitize(tx, t::acc)
        }
      case Nil => acc
    }

    def checkOneTrail(trail: Trail, symbols: List[String], acc: Trails = Nil): Trails = symbols match {
      case s :: sx => val _trails = moveBlockAround(List(trail), s)
        val newTrails = sanitize(_trails)
        checkOneTrail(trail, sx, acc ++ newTrails)
      case Nil => acc
    }

    def pursueTarget(trails: Trails, acc: List[Trail] = Nil): (Trails, Path) = trails match {
      case t :: tx => if (isSolved(block, target, t.board)) (trails, t.path)
      else {
        val _trails = moveBlockAroundOnce(t, block, offsets)
        val newTrails = sanitize(_trails)
        pursueTarget(tx ++ newTrails, acc ++ newTrails)
      }
      case Nil => (acc, Nil)
    }

    def _solve(trails: Trails): Path = trails match {
      case t :: tx => val (_trails, _path) =  if (t.path == Nil || t.path.head.block != block) pursueTarget(List(t))
      else (Nil, Nil)
        if (_path != Nil) _path
        else {
          val _symbols =  if (t.path == Nil) symbols
          else (symbols.toSet - t.path.head.block).toList
          val newTrails = checkOneTrail(t, _symbols)
          _solve(tx ++ newTrails ++ _trails)
        }
      case Nil => Nil
    }


    if (isSolved(block, target, brd)) Nil
    else {
      hist.add(brd.hashCode)
      val trails = List(Trail(brd, Nil))
      if (!symbols.isEmpty) _solve(trails) else pursueTarget(trails)._2
    }
  }

The “history” used in the inner methods is not optimised with hashCodes. I wanted to demonstrate an alternative approach which does not affect the overall performance because it is applied in the context of a very limited and defined part of the algorithm. Still, it is easy to get it wrong (actually, I did, initially) if the filter is applied too early inside the inner recursive methods. The reason is that while we certainly want to discard equivalent boards, we still need to let the algorithm use them to find the set of all possible moves for a given symbol because otherwise we may prematurely cut a path leading to valid boards.

There are of course some more details which you can find in my complete implementation of the Klotski solver in github. Some further optimisations and beautifications are possible, but this implementation passed all the demanding tests in http://www.hackerrank.com. By the way, this paper and the accompanying code is provided as is, with the intent to teach programming techniques. It is explicitly forbidden to misuse it, for example, by cheating in programming challenges or gain undeserved credits.

That said, I hope you enjoyed this paper and I wish you a lot of fun with Scala programming. Up Scala!

Alaraph

Insights and reflections on complex challenges

Solving the Klotski puzzle in Scala

Leave a comment Cancel reply

Share this:

Related

Leave a comment Cancel reply