Sat, 17 Aug 2013 15:17
Having entered the International Conference on Functional Programming Competition for the last six years under the team name “Hacking in the Rain”, I thought it was about time I wrote about what I did.
This year's competition could effectively be summarised by the phrase “find the function”, where the function mapped 64-bit integer values to other 64-bit integer values. In addition to the usual bitwise operators OR, AND, NOT and XOR, there was also addition, fixed-offset shifts, conditionals, and folds over all the bytes in a value with an arbitrary binary function.
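As a rough sketch, the operator set behaves like this on 64-bit values. The names and groupings below are my reconstruction of the semantics rather than the official problem specification:

```cpp
#include <cassert>
#include <cstdint>

// Illustrative operator set: unary bitwise/shift operators and binary
// operators over 64-bit values. "if0" and "fold" are omitted here.
enum class Op { Not, Shl1, Shr1, Shr4, Shr16, And, Or, Xor, Plus };

uint64_t eval_unary(Op op, uint64_t x) {
    switch (op) {
        case Op::Not:   return ~x;
        case Op::Shl1:  return x << 1;   // the shifts use fixed offsets
        case Op::Shr1:  return x >> 1;
        case Op::Shr4:  return x >> 4;
        case Op::Shr16: return x >> 16;
        default:        assert(false); return 0;
    }
}

uint64_t eval_binary(Op op, uint64_t a, uint64_t b) {
    switch (op) {
        case Op::And:  return a & b;
        case Op::Or:   return a | b;
        case Op::Xor:  return a ^ b;
        case Op::Plus: return a + b;    // addition is mod 2^64
        default:       assert(false); return 0;
    }
}
```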
As always, I chose to implement in C++, not because I dislike functional languages, but because I'm somewhat more proficient in C++ than in Haskell, and I prefer the control I get when needing to solve efficiency-critical problems with limited computing resources.
The most obvious approach here would have been to enumerate all programs of increasing size until finding one that matched all the given input-output pairs, combined with some sort of pruning. I chose not to do this because I was afraid that constructing and evaluating every candidate program via this technique would be too slow and would never scale to the larger problem sizes. I didn't attempt to handle the “if0” or “fold” operators in my initial solver.
My algorithm proceeded in two phases. In the first, I computed metadata about constructible expressions and their sizes:
Choose an input-output pair provided by the server.
Build a set containing only the values of expressions of size one: namely 0, 1 and the input value.
For increasing size, build sets containing all possible values of expressions of that size (excluding “if0” and “fold”). This only involves applying operators to elements of previously constructed sets and doesn't require any expression tree construction or evaluation.
Terminate when one of the constructed sets contains the output value we're looking for.
At this point, I know the size of an expression tree that computes the correct output for the input value I chose. However, I do not know what the expression actually is.
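The first phase can be sketched roughly as follows, with a deliberately reduced operator set (NOT, shift-left-by-one, and the binary bitwise/arithmetic operators). The function name and details are illustrative, not my actual solver:

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// For one chosen input value, build the set of all 64-bit values
// reachable by expressions of each size. No expression trees are
// constructed or evaluated; we only combine previously computed values.
std::vector<std::unordered_set<uint64_t>>
build_value_sets(uint64_t input, uint64_t target, size_t max_size) {
    std::vector<std::unordered_set<uint64_t>> by_size(1); // index 0 unused
    by_size.push_back({0, 1, input});                     // all size-1 values
    if (by_size[1].count(target)) return by_size;
    for (size_t n = 2; n <= max_size; ++n) {
        std::unordered_set<uint64_t> vals;
        // Unary operators applied to every value of size n-1.
        for (uint64_t v : by_size[n - 1]) {
            vals.insert(~v);
            vals.insert(v << 1);
        }
        // Binary operators: operand sizes must sum to n-1.
        for (size_t a = 1; a + 1 < n; ++a) {
            size_t b = n - 1 - a;
            for (uint64_t x : by_size[a])
                for (uint64_t y : by_size[b]) {
                    vals.insert(x & y);
                    vals.insert(x | y);
                    vals.insert(x ^ y);
                    vals.insert(x + y);
                }
        }
        bool found = vals.count(target) != 0;
        by_size.push_back(std::move(vals));
        if (found) break;  // we now know the size of a correct expression
    }
    return by_size;
}
```

The index of the last set tells us the size of a correct expression tree, which is exactly the metadata the second phase consumes.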
The second phase builds the expression (or multiple expressions) that compute the output value from the input in a top-down manner. Given the desired output value and the size of the expression tree that computes it, I now find the tree itself:
For every possible operator, determine if the value we want can be computed using the values we know we can compute from expressions with smaller sizes. We iterate over the sets we built in the previous phase to determine this. If yes, we know that this operator can form the topmost node in our tree.
We now need to find expression trees for any operands used by the parent operator. We repeat step one for each subexpression required by the parent; again, we already know its size and value.
We terminate once we reach expressions of size 1.
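The second phase can be sketched like this, again with a reduced operator set (NOT, XOR and PLUS, the ones that are cheap to invert) and with expressions built as S-expression strings for readability. The names and representation are illustrative assumptions, not the contest syntax or my real data structures:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

using ValueSets = std::vector<std::unordered_set<uint64_t>>;

// Given the per-size value sets from phase one, rebuild every expression
// of exactly `size` nodes that evaluates to `value`, top-down.
std::vector<std::string> find_exprs(uint64_t value, size_t size,
                                    uint64_t input, const ValueSets& by_size) {
    std::vector<std::string> out;
    if (size == 1) {  // leaves: the constants and the input variable
        if (value == 0) out.push_back("0");
        if (value == 1) out.push_back("1");
        if (value == input) out.push_back("x");
        return out;
    }
    // (not e): the operand must have value ~value and size size-1.
    if (by_size[size - 1].count(~value))
        for (auto& e : find_exprs(~value, size - 1, input, by_size))
            out.push_back("(not " + e + ")");
    // Binary operators: operand sizes sum to size-1. For a known left
    // value x, the required right value is computable directly.
    for (size_t a = 1; a + 1 < size; ++a) {
        size_t b = size - 1 - a;
        for (uint64_t x : by_size[a]) {
            if (by_size[b].count(value ^ x))    // x ^ y == value
                for (auto& l : find_exprs(x, a, input, by_size))
                    for (auto& r : find_exprs(value ^ x, b, input, by_size))
                        out.push_back("(xor " + l + " " + r + ")");
            if (by_size[b].count(value - x))    // x + y == value (mod 2^64)
                for (auto& l : find_exprs(x, a, input, by_size))
                    for (auto& r : find_exprs(value - x, b, input, by_size))
                        out.push_back("(plus " + l + " " + r + ")");
        }
    }
    return out;
}
```

Note that the recursion only ever asks for (value, size) combinations that phase one has already proven reachable, so no dead-end trees are built.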
Since there could be multiple expressions of the same size that evaluate to the same value (and I wanted to keep all of them) it was necessary to apply constructors across sets of values. I really missed being able to work in Haskell's list monad here which naturally handles Cartesian product-like constructions needed for the binary operator case.
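The plumbing I missed from the list monad boils down to a helper of roughly this shape (a generic sketch, not my actual code):

```cpp
#include <vector>

// Combine every pair drawn from two candidate sets with a binary
// constructor -- the Cartesian-product step that Haskell's list monad
// (or a list comprehension like [f a b | a <- lhs, b <- rhs]) gives
// for free.
template <typename T, typename F>
std::vector<T> combine_all(const std::vector<T>& lhs,
                           const std::vector<T>& rhs, F make) {
    std::vector<T> out;
    out.reserve(lhs.size() * rhs.size());
    for (const T& a : lhs)
        for (const T& b : rhs)
            out.push_back(make(a, b));
    return out;
}
```

In C++ this loop has to be written (or templated) once per call site shape, whereas the list monad composes such products for chains of operands with no extra machinery.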
Armed with sets of expressions, it was a simple matter to evaluate them on all other input-output pairs I had requested from the server to validate. If no expressions turned out to be correct, I used the remaining input-output pairs to generate more expressions.
This approach has the advantage that it only builds expression trees which are known to produce a correct answer for at least one input-output pair. In the event that no expressions were found that matched all input-output pairs, I added the option to deliberately construct oversized expressions. This was primarily used later for inferring conditionals for ifs.
For many cases, this strategy allowed me to find correct expressions in less than a second, especially with operator restrictions.
Although “if0” could appear at arbitrary points in the expression tree, I only looked at generating expressions with “if0” at the top. As ifs (excluding those within folds) can always be lifted to the top of the expression, this seemed a practical restriction.
Though I could often find plausible expressions for sets of input-output pairs, none of these involved “if0”, since trees were constructed for a single input-output pair for which conditionals would always be constant.
To solve this, I saved all programs that correctly evaluated a subset of the input-output pairs. Once I had at least one valid program for every input-output pair, I approximated a minimal covering using a greedy algorithm to map each input-output pair to a correct program. I then used my existing code to infer the conditionals required to choose the correct program for each input-output pair. Depending on the number of programs used by the covering, this could involve a tree of nested “if0”s.
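The greedy covering step looks roughly like this; representing each candidate program by the set of pair indices it evaluates correctly is an assumption of this sketch:

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Greedy approximation of a minimal cover: repeatedly pick the program
// that correctly evaluates the most still-uncovered input-output pairs,
// until every pair is covered by some chosen program.
std::vector<size_t> greedy_cover(const std::vector<std::set<size_t>>& covers,
                                 size_t num_pairs) {
    std::set<size_t> uncovered;
    for (size_t i = 0; i < num_pairs; ++i) uncovered.insert(i);
    std::vector<size_t> chosen;
    while (!uncovered.empty()) {
        size_t best = covers.size();
        size_t best_gain = 0;
        for (size_t p = 0; p < covers.size(); ++p) {
            size_t gain = 0;
            for (size_t pair : covers[p]) gain += uncovered.count(pair);
            if (gain > best_gain) { best_gain = gain; best = p; }
        }
        if (best == covers.size()) break;  // some pair has no valid program
        for (size_t pair : covers[best]) uncovered.erase(pair);
        chosen.push_back(best);
    }
    return chosen;
}
```

With the covering in hand, each input-output pair maps to exactly one chosen program, and the conditional-inference step only has to discriminate between those programs.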
This approach did work; however, it had a couple of subtle issues that were difficult to debug:
Any input-output pairs that were correctly computed by more than one program used by the covering needed to be removed. Without doing this, it was possible for ambiguous pairs to be assigned the wrong program, and the if-condition would become impossible to derive.
The order of if-construction was important. As an example, assuming we were attempting to find this program:
(lambda (id0) (if0 id0 1 id0))
The following program (using _ as a placeholder) covers both sub-expressions, but it is extremely difficult to generate the condition without using another if:
(lambda (id0) (if0 _ id0 1))
This is a consequence of only having a bitwise NOT and not a boolean NOT operator. To handle this, I attempted to construct if conditions in all possible orders.
Although I never got around to handling folds, the strategy I implemented worked quite well for a number of problems. The lack of algebraic data types in C++ was far more painful to work around than in other years. I attribute this to both the number of possible node types in the tree and the number of operations that needed to be applied to them. Of course, this is a good thing for a contest intended to advocate functional programming.
I have mixed feelings over the decision to have the solver run on the client side, rather than by the competition organisers after the competition. I spent a lot of time trying to improve my solver before letting it loose, and only did so when there were around 8 hours left before the competition ended. I suspect the scoreboard might look somewhat different if I and others had had the time to apply our solvers to all the given problems.
Updated 18/08/2013: Revised main algorithm description after feedback on reddit.