Friday, March 30, 2012

Project Euler 15

In problem 15 of Project Euler the problem of finding how many paths are possible on a 20x20 grid. To start us off project euler provides a very nice image of all of the possible paths through a 2x2 grid:

Hopefully it should be obvious how many paths there are, there are 6 paths that are possible. Surely there is a nice mathematical way to do this, but for now lets think about this as simply a problem to code your way out of. First off all we need to choose a data representation for this problem. It should be pretty obvious that the number of times the arrow goes 'east' for one square is the same as well as the number of times the arrow goes 'south' for one square. So we can then think about this problem as saying we need to make a certain number of 'east' moves and a certain number of 'south' moves. So when we start we have two choices, we can use a 'south' move or a 'east' move. So each time we still have at least one 'south' move and one 'east' move left we have to go down two different paths, but if we don't have any 'east' moves left there can only be one path to go down, same goes for 'south' moves. So we could write a function in python that looks like this:




def f(downs,lefts):
    if downs > 0 and lefts > 0:
        return f(downs - 1,lefts) + f(downs, lefts - 1)
    else:
        return 1

if __name__ == "__main__":
    print f(20,20)

This will solve the problem, but as the size of the grid gets bigger and bigger the number of times that we have to call the funciton increases expotentially. I am not even sure how long it would take to get the answer. But because this function is deterministic, meaning that it will always return the same output given the same inputs, we can use a techinique called memoization. The general plan for what we will do is look up in a hashmap the input every time we run the function and see if we have already run the function on those inputs, if we have we simply output what we got last time.
def memoize(f):
    cache = {}
    def g(x,y):
        if (x,y) not in cache:
            cache[(x,y)] = f(x,y)
        return cache[(x,y)]
    return g

@memoize
def f(downs,lefts):
    if downs > 0 and lefts > 0:
        return f(downs - 1,lefts) + f(downs, lefts - 1)
    else:
        return 1

if __name__ == "__main__":
    print f(20,20)

This code uses something called decorators. Basically what it does it every time the fuction f is called memoize's g function is called instead. So as you can see the g funtion first checks to see if the input is in a cache and then, if it is not, it runs the function and stores the result in the cache. This makes the program execute in less than 0.005 seconds. This is an amazing speed up given that I tried to run the first program when I started writing this blog post and it has still not finished.

2 comments:

  1. This is counting Young Tableaux --- a beginning into a fascinating topic.

    Interesting that this kind can be computed so quickly with a nice computing trick. It gets much harder in higher dimensions.

    ReplyDelete
  2. Huh I have not heard of Young Tableaux counting before. Thanks for the link. Also I hope its clear that this can also be speed up greatly, close to a factor of two, as this problem is symmetrical so f(a,b) will equal f(b,a), but this speedup is not really needed, at least not as this juncture.

    ReplyDelete