Wednesday, March 21, 2012

Project Euler 67

Project Euler 67 is about finding the maximal path through a series of nodes. In this case it is the sum of integers in a triangle, from top to bottom. This is a simpler version of the problem which the Viterbi Algorithm solves. The Viterbi Algorithm is used primarily in HMMs where you want to find a minimal path through K layers of N nodes, and every node can travel to any of the nodes in the following layers. There is an added complication of there being a transition cost, each different node has a cost of going to another node. A small example, with the transition costs being either 0, 1 or 2. Each depending on how far away the previous number is.

1 3 4 5
7 9 1 2
6 3 8 0

So to solve this the realization that only the preceeding nodes matter for any given node. Therefore we can simplify the problem by running one layer at a time and updating their weights such as:


1 3 4 5
7 9 1 2
6 3 8 0


4 = 3 + max(1+0 or 7+1 or 6+2)
11 = 9 + max(1+1 or 7+0 or 6+1)
5 = 3 + max(1+2 or 7+0 or 6+0)
1 4 4 5
7 11 1 2
6 5 8 0

8 = 4 + max(4+0 or 11+1 or 5+2)
etc
1 4 8 5
7 11 6 2
6 5 13 0

etc
1 4 8 12
7 11 6 8
6 5 13 7

So we can see here that 5 is the lowest possible value, and we only had to visit each node once!

The same can be applied to the simpler problem of the triangle. Again an example is provided below, this time finding the maximal path.
3
7 4
2 4 6
8 5 9 3
3
10 7
2 4 6
8 5 9 3
3
10 7
12 14 13
8 5 9 3
3
10 7
12 14 13
20 19 23 16
So here we can see that the maximal path is 23 any we only had to visit each node once. So now we have a plan, lets implement it in python.

First Implementation:
if __name__ == "__main__":
    numbers = []
    with open("triangle.txt") as f:
        for line in f.readlines():
            numbers.append(map(lambda x: int(x),line.split()))
    for i in range(1,len(numbers)):
        for j in range(len(numbers[i])):
            if j == 0: # on the left side of the triangle
                numbers[i][j] += numbers[i-1][j]
            elif j == len(numbers[i]) - 1: # on the right side of the triangle
                numbers[i][j] += numbers[i-1][j-1]
            else:
                numbers[i][j] += max(numbers[i-1][j-1],numbers[i-1][j])
    print max(numbers[-1])
This code runs in ~ 0.05 seconds. Very fast given that the brute force solution would take about 5 billion years.

So the program essentially first creates an array in which we will store all of the numbers, then we get to this like
          numbers.append(map(lambda x: int(x),line.split()))
This may be a bit confusing for people not accustomed to lambda functions or the map function.

Basically line.split(), if line is "12 5 29" will return ["12","5","29"]. So we want to transform each element from a string to an integer. map(function,list) will run function on each element in a list and return the resulting list. lambda here creates a temporary function that takes in a variable x and applies int(x) to it. This is an example of bad code, I have become too accustomed to using lambda functions each time I use map, so lambda functions are not needed in the slightest. An equivalent line of code would be:

          numbers.append(map(int,line.split()))
Going forward hopefully the code should be self explanatory, if not a bit ugly. I don't like to have so many if statements so I while perusing the forums I found a cleaner way to solve this problem.

if __name__ == "__main__":
    numbers = []
    with open("triangle.txt") as f:
        for line in f.readlines():
            numbers.append([0] + [int(x) for x in line.split()] + [0]) 
            #credit goes to ArhturDenture in the fourms for a more pythonic way
    for i in range(1,len(numbers)):
        for j in range(1,len(numbers[i]) - 1):
            numbers[i][j] += max(numbers[i-1][j-1],numbers[i-1][j])
    print max(numbers[-1])



So here I was able to not only pad the array so I no longer have to have the sanity checks, but I removed the lambda function and the map function. The run time for this is essentially cut in half (0.028 sec)

So first of all the triangle will now look something like this:

0 3 0
0 7 4 0
0 2 4 6 0
0 8 5 9 3 0


This will eliminate the need to check to see if we are on either side of the triangle, so long as we don't modify the sides the interior of the triangle will always be greater than zero, or it will not matter. Secondly we were able to both remove the lambda and the map functions by doing it in a more pythonic way. Basically when there is a for loop inside the brackets for a list the for loop will be executed and stored as a list. This will execute a lot faster as it does not need to either create a function because of the lambda function or execute the map function.

No comments:

Post a Comment