Thursday, August 16, 2012

Problem 3 - Factoring Primes

Problem 3

The prime factors of 13195 are 5, 7, 13 and 29.

What is the largest prime factor of the number 600851475143 ?

This is the first problem I found with an interesting real world application.  Factoring primes is hard but multiplying out two primes is trivial.  That concept, called a one-way function, underpins modern cryptography.  The generation of large primes is a hard problem, but it is much harder to figure out which two primes were multiplied to make a given number than it is to generate those primes in the first place.  That's because modern computers are very, very good at multiplication and addition, but relatively slow at division.  Likewise, the brute force guessing of prime factors is very time consuming because there are a large number of primes.

All cryptography can be broken given enough time.  The trick is to make the math problem hard enough that the attacker gives up and does something else before he can decrypt your secrets.

Solution

Here's the code for Problem 3:
#!/usr/bin/python

def isprime(val): #Checks to see if a number is prime, brute force
    for x in range(2, int(val**0.5)+1):
        if val % x == 0: return False
    return True

def factor(val):  #Get the prime factors of a number
    if isprime(val): return [val]
    i = 2
    thesefactors = []
    tempval = val
    while (i <= tempval):
        if tempval%i == 0:
            thesefactors += [i]
            tempval = tempval/i
            i = 2
        else:
            i = i + 1
    if len(thesefactors) <= 1:
        return [val]
    else:
        return thesefactors

if __name__ == '__main__':
    import sys
    factors = factor(int(sys.argv[1]))
    print (factors)
This is run using the following command (with the correct number argument for the question asked):

python primes.py 600851475143

The correct answer is 6857.

Approach

Again, this is a brute force approach.  I have included a refinement called Newton's Improvement where you don't need to check if numbers larger than the square root of that number are factors.  This helps improve both subroutines.  These subroutines were both used in the solutions of later problems.

for x in range(2, int(val**0.5)+1):

Benchmarks

The question asked was too easy to make a good benchmark, so I created my own composite using the solution of a later problem involving counting primes.  The two primes I multiplied were 1299379 and 1299709.  The multiplication takes the blink of an eye, but the factoring takes many, many blinks.

As before, pypy is about 7-10x faster than Python2, with Jython being much slower and Python3 somewhere in the middle.

Problem #3 Benchmarks
time python primes.py 1688814580711
[1299379, 1299709]

real 0m0.329s
user 0m0.316s
sys 0m0.012s
time python -O primes.py 1688814580711
[1299379, 1299709]

real 0m0.380s
user 0m0.372s
sys 0m0.004s
time python3 primes.py 1688814580711
[1299379, 1299709]

real 0m0.715s
user 0m0.704s
sys 0m0.008s
time python3 -O primes.py 1688814580711
[1299379, 1299709]

real 0m0.772s
user 0m0.764s
sys 0m0.004s
time pypy primes.py 1688814580711
[1299379, 1299709]

real 0m0.083s
user 0m0.080s
sys 0m0.000s
time pypy -O primes.py 1688814580711
[1299379, 1299709]

real 0m0.083s
user 0m0.076s
sys 0m0.004s





Conclusion

The Problem 3 solution is a program with both real world uses and utility as a library for later problems.  The problem of locating and factoring primes is the holy grail of cryptography and this simple implementation gives some insight into prime numbers and the problem of factoring primes.