Tuesday, August 14, 2012

Problem 1 - Integer Multiples of 3 and 5


Problem 1

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23. 
Find the sum of all the multiples of 3 or 5 below 1000.

This is a real softball just to get people hooked.  The code could easily be a single line, but I have separated things out a bit.  I'll take the time to discuss the various boilerplate in my code in this blog post because the problem itself doesn't require much discussion.

I have also run my first benchmarks on several different interpreters.  I experiment with the -O switch which runs some sort of optimizations before running the script.  Interpreters tried include, Python 2 & 3, Pypy, and Jython.  The only modifications needed to run in Python 3 were the use of the new Python 3 "print ()" syntax which also worked in all Python 2 flavors.

There is an Ubuntu Launchpad project where I will host all my solution code.  You can get your very own copy on a linux box with bzr installed using this command:

bzr branch lp:eulerplay

Or you can browse online at:
http://bazaar.launchpad.net/~brywilharris/eulerplay/trunk/files.

Solution

Here's the code for Problem 1:
#!/usr/bin/env python

#This program adds all the integer numbers divisible by 3 and 5
#below a selected maximum number.  It is a straight forward, 
#brute-force approach.

if __name__ == '__main__' :   #boilerplate

    import sys  #for command line input
    sum = 0;
    for i in range(0,int(sys.argv[1])): #int arg required
        if (i%3==0) or (i%5==0):  #divisible by 3 or 5?
            sum = sum + i #add to the sum
    print sum #spit it out at the end
This is run using the following command (with the correct number argument for the question asked):

python add35.py 1000

The correct answer is 233168.

Approach

This is a simple, brute-force solution.  I loop through the entire range and add up the numbers divisible by 3 and 5.  The modulo (%) operator lets me see which numbers are evenly divisible.

This program uses the simplest form of command line argument (using sys.argv), the boilerplate (__name__ == __main__ ) for use as a library on Windows and a simple for loop (for i in range(x,y)).  These are some of the only things a python programmer needs to learn to start writing code.  I'm  often pleasantly surprised when I don't know how to do something in Python and it just does what I mean on the first try.  Try that in Java some time.

The first line lets you invoke the script from the command line without specifying the interpreter. Instead of python add35.py 1000, you can just type ./add35.py 1000.  This line is not required when the interpreter is run manually, eg pypy add35.py 1000.
#!/usr/bin/env python
The sys library lets you access command line arguments in python. There are much fancier and more bullet proof ways of doing this, but none are quite this simple.
import sys
astring = sys.argv[1] #argv[0] is the script name
This is how python handles for loops:
for i in range(x,y):

Benchmarks

The following are some simple benchmarks using the unix "time" command.  Python can be run on a variety of interpreters without modification.   My 4-core Intel i7 processor positively spoils me for other machines  (Hopefully it's not bragging to say, "Your benchmarks may be slower").   Python 2.7.3 comes with my copy of Ubuntu 12.04.  I also tried Pypy version 2.7.2, Python 3.2.3 and Jython 2.5.1 running on Java 1.6.0_24.

Code runs in a single thread for all benchmarks in this example.  The "real"number is the wall clock time the program took to set up and run.  This includes loading the interpreter into memory, any optimizations, and the reporting of results.  Later programs may be multi-threaded, but this one didn't need that.

This simple example can be timed with the following command:

time python add35.py 1000
or
time python -O add35.py 1000

In this case, the -O (optimze) switch takes longer than running unoptimized:
bryan@myComputer:~/play$ time python add35.py 1000
233168

real 0m0.025s
user 0m0.024s
sys 0m0.000s
bryan@myComputer:~/play$ time python -O add35.py 1000
233168

real 0m0.071s
user 0m0.060s
sys 0m0.004s

Adding a few zeros makes the -O switch (just barely) worthwhile:
bryan@myComputer:~/play$ time python add35.py 100000000
2333333316666668

real 0m19.323s
user 0m18.457s
sys 0m0.772s
bryan@myComputer:~/play$ time python -O add35.py 100000000
2333333316666668

real 0m19.286s
user 0m18.417s
sys 0m0.792s

Python 3 took longer and required a modification to the print statement syntax.  print sum had to become print(sum).  It looks to me like Python 3 has a way to go before it starts getting optimized.
bryan@bryan-Aspire-V3-771:~/play$ time python3 -O add35.py 100000000

2333333316666668

real 0m22.498s
user 0m22.465s
sys 0m0.004s

On a lark I tried this in pypy.   Pypy is a python interpreter written in a version of python instead of c.  It seems like an interpreter written in an interpreted language would be the definition of clunky, but this isn't the case for pypy.  Pypy runs a lot faster in this simple case:
bryan@myComputer:~/play$ pypy -O add35.py 100000000
2333333316666668

real 0m2.833s
user 0m2.804s
sys 0m0.028s


Jython crashed with an out of memory error after 37 seconds for the number 100000000.  Jython fits my preconceived idea of an interpreter inside an interpreter much better than Pypy. I guess my jython benchmarks will be a bit limited.  Jython was able to solve the original problem, also shown below.
java.lang.OutOfMemoryError: java.lang.OutOfMemoryError: Java heap space

real 0m37.852s
user 1m48.999s
sys 0m1.732s

bryan@myComputer:~/play$ time jython add35.py 1000

233168



real 0m1.580s

user 0m3.444s
sys 0m0.136s



Conclusion

This is a straight-forward problem with a straight forward solution.  While a faster algorithm might be found (multiplying by threes and fives until you get to the target number might be faster) but it was not necessary.  The solution to the initial question above is 233168.  

Pypy is the fastest interpreter for this problem by a factor of almost 7x. Python 3 was  slower than Python 2 and Jython was very slow even for this simple problem.  Jython ran out of memory and crashed when working with the bigger number.