Functional Solution to Range Minimum Query (RMQ) using Segment Trees in Scala


Given an array of N integers, computing the minimum over the whole array is an easy operation requiring N steps. For example, one can take the first value as a tentative minimum, and iterate through the array comparing the tentative minimum with the current value. Any time the current value is smaller than the tentative minimum, it becomes the new tentative minimum, until the end of the array.
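The linear scan just described can be sketched in Scala as a fold. This is only an illustration: the names LinearMin and minimum are hypothetical, and returning an Option for the empty case is my own choice.

```scala
object LinearMin {
  // Scan the values once, carrying the tentative minimum along.
  // Returns None for an empty sequence, where no minimum exists.
  def minimum(values: Seq[Int]): Option[Int] =
    values.headOption.map { first =>
      values.tail.foldLeft(first) { (tentative, current) =>
        // The current value replaces the tentative minimum only if smaller.
        if (current < tentative) current else tentative
      }
    }
}
```

Each element is compared exactly once, which is where the N steps come from.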
Let us now consider the case where it is not the minimum of the whole array which matters, but the minimum of an arbitrary sub-array of the original one. By sub-array I mean an array formed by M consecutive elements taken from the original array, with 1 <= M <= N.
The brute-force approach would be to scan the requested range every time a minimum is asked for, which takes up to N steps per query. The Range Minimum Query (RMQ) problem is about finding a more efficient solution. How can one do better than O(N)? One obvious way would be to pre-compute the minimum of the sub-arrays and sacrifice space for performance. But, wait a moment, how many possible sub-arrays can be defined given an array of size N?
Given an array of size N there are N sub-arrays of size 1, N-1 sub-arrays of size 2, N-2 sub-arrays of size 3 (…), 2 sub-arrays of size N-1 and 1 sub-array of size N.
The total number of sub-arrays is therefore:
$$N + (N-1) + \dots + 2 + 1 = \sum_{k=1}^{N} k = \frac{N(N+1)}{2}$$
which is bad, because the space complexity is O(N^2). Is there a smarter way to tackle the problem? The answer is, of course, yes. Actually, a number of ways are documented in the literature. This article illustrates the technique known as “Segment Trees”. The idea is that it is not necessary to pre-compute all the possible sub-arrays, but only some of them; the minimum of any other sub-array can then be computed quickly from the available partial results. Let us consider an example to clarify the idea.
Before I go on, I need to explain the choice of Functional Programming (FP) data structures. Arrays are an imperative data structure and are not well suited to FP proper. FP is about immutable values, which allow for side-effect-free programs. Arrays, instead, mimic the way computer memory is structured, i.e. as a sequence of modifiable cells. The closest functional equivalent of arrays in Scala is the Vector. From now on, I will abide by the FP rules and use Vectors.
I will now get back to our example. Given the vector:

val data = Vector(2,5,3,0,8,9,5,3,7,5,9,3)

if we knew already:

val min1 = Vector(2,5,3,0,8,9).min // minimum of data[0..5]
val min2 = Vector(5,3,7,5,9,3).min // minimum of data[6..11]

we would not need to scan the whole vector again, because we could simply compute:

val min = math.min(min1, min2)

Segment Trees are balanced binary trees composed of nodes, each consisting of a range and a value, which is the pre-computed minimum of the sub-array corresponding to that range. In order to see what this tree looks like, let us consider the following example:

val data = Vector(4,5,2,8,9,0,1,2,5,1,8,6,3) // N=13

Our Segment Tree would look like this:

[Figure: Example Segment Tree]

The first thing to notice is that there are 25 ranges (nodes or leaves) in this tree, which is far fewer than 13(13+1)/2 = 91.
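The count of 25 can be reproduced by enumerating the ranges recursively. The sketch below assumes that every range is split at its midpoint, which may differ in detail from the tree in the figure; the names RangeEnumeration and ranges are hypothetical.

```scala
object RangeEnumeration {
  // All inclusive index ranges held by a segment tree over [from, to],
  // assuming (my assumption) that each range is split at its midpoint.
  def ranges(from: Int, to: Int): Vector[(Int, Int)] =
    if (from == to) Vector((from, to)) // a leaf covers a single element
    else {
      val mid = (from + to) / 2
      (from, to) +: (ranges(from, mid) ++ ranges(mid + 1, to))
    }

  def main(args: Array[String]): Unit = {
    val n = 13
    println(s"ranges in the tree: ${ranges(0, n - 1).size}") // prints 25
    println(s"all sub-arrays:     ${n * (n + 1) / 2}")       // prints 91
  }
}
```

Any binary tree with N leaves in which every inner node has exactly two children has 2N - 1 nodes, so the count does not depend on where exactly each range is split.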
With higher values of N the difference would be even more noticeable. The number of ranges can be approximated in excess with this formula:

$$\sum_{i=0}^{\lceil \log_2 N \rceil} 2^i = 2^{\lceil \log_2 N \rceil + 1} - 1$$

In our case it would give:

2^0 + 2^1 + 2^2 + 2^3 + 2^4 = 1 + 2 + 4 + 8 + 16 = 31
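The same arithmetic can be checked with a few lines of Scala. This is a sketch under one assumption of mine: the sum of powers of two runs up to the exponent ceil(log2 N), which matches the worked figure of 31 for N=13. The names RangeCounts, allSubArrays, ceilLog2 and treeUpperBound are hypothetical.

```scala
object RangeCounts {
  // Number of contiguous sub-arrays of an array of size n: n(n+1)/2.
  def allSubArrays(n: Long): Long = n * (n + 1) / 2

  // Smallest k such that 2^k >= n, i.e. ceil(log2 n) for n >= 1.
  @annotation.tailrec
  def ceilLog2(n: Long, k: Int = 0): Int =
    if ((1L << k) >= n) k else ceilLog2(n, k + 1)

  // 2^0 + 2^1 + ... + 2^ceil(log2 n): the size of a full binary tree
  // with ceil(log2 n) + 1 levels.
  def treeUpperBound(n: Long): Long =
    (0 to ceilLog2(n)).map(1L << _).sum
}
```

For N=13, `RangeCounts.treeUpperBound(13)` gives 31 against `RangeCounts.allSubArrays(13)`, which gives 91.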

The approximation is in excess because the formula assumes that every node forks into two nodes, which is not always the case. To see how much better this idea is than the brute-force approach, let us consider a Vector with N=1000 elements. We have seen above that the number of all possible sub-ranges is:

1000*(1000+1)/2 = 500'500

However, our Segment Tree will contain no more than:

1 + 2 + 4 + 8 + 16 + 32 + 64 + 128 + 256 + 512 + 1024 = 2'047

Remarkable achievement, isn’t it?
We can now see how to query a Segment Tree in order to get the minimum value of any sub-range of the original vector. The query algorithm is very simple if expressed recursively, which is the natural way to write it in a functional language like Scala. The idea is this: if the query range is the same as the root range, take the minimum value stored there. Otherwise, query the sub-trees recursively, each with the intersection of the query range and the respective left or right sub-range, and take the smaller of the results (a sub-tree whose range does not intersect the query range is skipped).
The Scala code will speak for itself. Try it out and have fun! I will copy it here for convenience’s sake, but it is also available at this address:

https://github.com/maumorelli/alaraph/blob/master/hackerrank/src/com/alaraph/hackerrank/rmq/Solution.scala

[Code listing, two images in the original: RMQ-002, RMQ-003]
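A minimal sketch of the build and query steps described above follows. It is an illustration under stated assumptions and not necessarily identical to the code in the repository linked above: the names SegmentTreeSketch, Tree, Leaf, Node, build and query are hypothetical, ranges are taken as inclusive, and each range is assumed to be split at its midpoint.

```scala
object SegmentTreeSketch {
  // Every node carries an inclusive index range [from, to] and its minimum.
  sealed trait Tree { def from: Int; def to: Int; def min: Int }

  final case class Leaf(index: Int, min: Int) extends Tree {
    def from: Int = index
    def to: Int = index
  }

  final case class Node(left: Tree, right: Tree) extends Tree {
    val from: Int = left.from
    val to: Int = right.to
    // Pre-computed once at construction time; the tree is immutable.
    val min: Int = math.min(left.min, right.min)
  }

  // Build the tree by recursive halving (assumption: split at the midpoint).
  def build(data: Vector[Int]): Tree = {
    require(data.nonEmpty, "cannot build a tree over an empty vector")
    def go(from: Int, to: Int): Tree =
      if (from == to) Leaf(from, data(from))
      else {
        val mid = (from + to) / 2
        Node(go(from, mid), go(mid + 1, to))
      }
    go(0, data.length - 1)
  }

  // Minimum over the inclusive range [lo, hi], which must lie inside the tree.
  def query(tree: Tree, lo: Int, hi: Int): Int = {
    require(tree.from <= lo && lo <= hi && hi <= tree.to, s"invalid range [$lo, $hi]")
    tree match {
      // The query range coincides with this node: use the stored minimum.
      case t if t.from == lo && t.to == hi => t.min
      case Node(left, right) =>
        if (hi <= left.to) query(left, lo, hi)           // entirely in the left half
        else if (lo >= right.from) query(right, lo, hi)  // entirely in the right half
        else math.min(query(left, lo, left.to), query(right, right.from, hi))
      // Not reachable given the require above; kept so the match is exhaustive.
      case leaf: Leaf => leaf.min
    }
  }
}
```

With the 13-element vector used earlier, `SegmentTreeSketch.query(SegmentTreeSketch.build(data), 6, 12)` combines the stored minimum of the single-element range [6, 6] with that of the range [7, 12] and returns 1, without scanning the elements.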

The explanation of how to use the program, the input format and expected output can be found at:
https://www.hackerrank.com/challenges/range-minimum-query
Here is a brief excerpt:

Sample Input

10 5
10 20 30 40 11 22 33 44 15 5
0 5
1 2
8 9
0 9
4 6

Sample Output

10
20
5
5
11
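The sample can be cross-checked with a brute-force reference, which is also handy for testing any Segment Tree implementation against. The sketch below reads the first input line as the array size and the number of queries, and each query as an inclusive pair of indices (my reading of the excerpt); the names SampleCheck and bruteForce are hypothetical.

```scala
object SampleCheck {
  // Second line of the sample input: the values.
  val values: Vector[Int] = Vector(10, 20, 30, 40, 11, 22, 33, 44, 15, 5)

  // Remaining lines: the queries, as inclusive (left, right) index pairs.
  val queries: Vector[(Int, Int)] = Vector((0, 5), (1, 2), (8, 9), (0, 9), (4, 6))

  // Brute force: scan the inclusive range [lo, hi] for every query.
  def bruteForce(data: Vector[Int], lo: Int, hi: Int): Int =
    data.slice(lo, hi + 1).min

  val answers: Vector[Int] =
    queries.map { case (lo, hi) => bruteForce(values, lo, hi) }

  def main(args: Array[String]): Unit =
    answers.foreach(println) // prints 10, 20, 5, 5, 11 on separate lines
}
```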

We are now approaching the end of this article. I hope you had a good read. To conclude, I would like to draw your attention to the fact that the depth of the Segment Tree is only about log2(N) + 1 levels (5 levels for the N=13 example above). This implies that a query needs only a moderate number of recursive calls, growing with log N rather than with N.

References

Hackerrank, (2014), https://www.hackerrank.com/challenges/range-minimum-query, accessed 01.08.2014

Topcoder, (2014), http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=lowestCommonAncestor, accessed 01.08.2014

Wikipedia, (2014), https://en.wikipedia.org/wiki/Segment_tree, accessed 01.08.2014