Solving hard Maths Olympiad problems
Apparently it’s IMO season, and in honour of such, Terry Tao reposed one of the questions on his blog as a “polymath” project — something to be solved collaboratively over the web, rather than by individual effort. Two hundred comments or so later, and a complete proof was achieved, though how much that benefited from the polymath aspect might be debatable. Anyway, it’s a pretty neat problem:
Problem 6. Let a1, a2, …, an be distinct positive integers and let M be a set of n-1 positive integers not containing s = a1 + a2 +…+ an. A grasshopper is to jump along the real axis, starting at the point 0 and making n jumps to the right with lengths a1, a2, …, an in some order. Prove that the order can be chosen in such a way that the grasshopper never lands on any point in M.
Depending on your interest in such things, you might have fun trying to puzzle it out on your own. If so, you might or might not want to read the comment thread above for ideas, but be cautious about the comments to Terry’s followup post which include at least one complete proof.
Anyway, since I’m obviously not a l33t enough maths dude to come up with a proof myself, I thought it might be fun to see about converting the proof into a program (ie, “find a sequence of hops the grasshopper can take without landing on any point in M”). Something along the lines of this blog post from not too long ago. Happily it’s a constructive proof, so that’s an entirely reasonable challenge.
First, I’m going to generalise the problem slightly to make it a little easier to deal with: we’ll allow the grasshopper to start at any point, z, on the number line; and we’ll allow M to contain between zero and n-1 values, instead of exactly n-1 values, and we’ll sort the two sequences from least to greatest. (That turns out to not be any more general in practice, but will make the notation easier)
As such, we’ll define a function steps(a,m,z=0)
that takes the two sorted lists of numbers, and the starting position (defaulting to 0), and returns a list of steps that when taken misses everything in m. We’ll resolve it recursively, starting with the base case of m being empty, in which case we can just take the steps in order, without worrying about hitting a mine:
def steps(a,m,z=0): assert(sorted(a)==a and sorted(m)==m) assert(len(m) < len(a)) if m == []: return a
The ideal first step to take would be the longest possible — that’s the most likely to pass a bunch of values in m, and generally gets us moving as fast as possible. The key insight from the proof (IMO) is taking this approach, and looking at how that interacts with the various possible arrangements of the first few values of m. There are two important issues: one, the most obvious, is whether one of the values of m is hit immediately by taking the largest possible step; and two, whether the largest possible step actually passes any of the values of m. We continue on by establishing these possibilities:
a_max = a[-1] m_min = m[0] a_max_in_m = (z+a_max) in m # NB: an O(n) or O(lg(n)) step a_max_passed_any_m = (m_min < z+a_max)
The most complicated case turns out to be when a_max_in_m and a_max_passed_any_m are both True. Since we know there are at least two distinct values in m (m_min and z+a_max), we can approach that by considering two jumps together, a_other followed by a_max. This won’t hit (z+a_max), and there are at most n-2 other values in m, while there are n-1 values for a_other. So some pair of (z+a_other, z+a_other+a_max) has neither value in m, and avoids at least two members of m. That gives us an inductive step, though we have to check every element of a to find it:
if a_max_in_m and a_max_passed_any_m: for i,a_other in enumerate(a): if z+a_other not in m and z+a_other+a_max not in m: z_new = z+a_other+a_max a_remainder = a[:i]+a[i+1:-1] m_remainder = [m_i for m_i in m if m_i > z_new] return [a_other,a_max]+steps(a_remainder, m_remainder, z_new) assert(False) # unreachable
Having dealt with that, we have the nice result that removing the smallest member of m ensures that there’s no longer anything getting in the way of taking a_max as the next step. By doing that, we can recurse into a simpler problem, and manipulate the answer we get back to come up with a solution:
simpler = steps(a[:-1], m[1:], z+a_max)
Now we know this series of steps avoids every value of m with the possible exception of m_min, but we need to figure if it does hit m_min, and if so where. So let’s do that:
sofar = z+a_max for i, s in enumerate(simpler): if sofar >= m_min: break sofar += s
At this point we’ve either safely passed m_min, safely reached the end of our path, or hit m_min. Avoiding m_min is great: if we managed that, we’re done.
if sofar != m_min: return [a_max]+simpler
But if we didn’t, we’re still not too badly off: we have a sequence of steps, s[:i] that end on m_min, followed by steps s[i:] that safely avoid all the other values in m. But we can rearrange those steps to avoid all the mines entirely by taking step s[i] first, then steps s[1:i] and step s[0] and finally steps s[i+1:]. Since our first step, s[0], was the largest possible step, we know z+sum(s[i]+s[1:i]) is less than z+sum(s[0]+s[1:i])=m_min and thus that all those steps avoid any value in m, and we also know that z+sum(s[i]+s[1:i]+s[0])>m_min, because that is precisely z+sum(s[:i+1]) which is a step past m_min, and known to avoid all the other values of m. And at that point we’re really done:
return [simpler[i]] + simpler[0:i] + [a_max] + simpler[i+1:]
So there you have it. (That actually turned out slightly neater than the original proof in some respects, since a couple of the cases ended up merged)
Also somewhat interesting is where the assumptions of the problem make their way into the code. The fact that m doesn’t include the total sum is relied upon to ensure simpler[i] actually exists; that each value in a is distinct is relied upon to ensure a_max is actually larger than simpler[i]; and that there are fewer distinct values in m than distinct values in a is relied upon to apply the pigeon-hole principle in the first branch.
Anyway, pretty neat, I think!
Hmm, maybe you can make that slightly simpler (or, well, shorter anyway) by replacing the last three blocks of code with something along the lines of: