Microsoft doesn't understand DNS

A few days ago, Microsoft launched a DOS attack against millions of users of the dynamic DNS service NO-IP. Microsoft's aim was to disrupt DNS entries used by the creators of malware, but had the side effect of rendering the service effectively useless for anyone using DDNS for resolving servers, home automation systems, remote webcams etc.

Using a court order, Microsoft were able to replace the nameserver records for 22 domain names used by NO-IP and pointed them to their own nameservers. Microsoft's intent was presumably to answer legitimate DNS queries by forwarding them to the original NO-IP nameservers and block those from malicious entities, rather than completely crippling the service.

Microsoft have since claimed to have fixed the problem yet for many users, myself included I can no longer find machines I use dynamic DNS to resolve. Many are attributing this to DNS propagation delay, but the actuality is that Microsoft seem to lack to lack the technical competence to implement a such a filtering system.

As of writing (02/07/2014 11:38:25 BST 2014) cached DNS entries for NO-IP domains are completely broken, so let's use dig to directly query the nameservers (and hence avoid any DNS caching issues):

$ dig -tNS +trace

; <<>> DiG 9.9.2-P1 <<>> -tNS +trace
;; global options: +cmd
.                       163525  IN      NS
.                       163525  IN      NS
.                       163525  IN      NS
.                       163525  IN      NS
.                       163525  IN      NS
.                       163525  IN      NS
.                       163525  IN      NS
.                       163525  IN      NS
.                       163525  IN      NS
.                       163525  IN      NS
.                       163525  IN      NS
.                       163525  IN      NS
.                       163525  IN      NS
.                       515689  IN      RRSIG   NS 8 0 518400 20140709000000 20140701230000 8230 . p5nawXXuH07BoGUsETH3J3VEj7W6H6V1EzwfRIRkr5qepcJQoqyGuced WTeOWW4kZV8GfB0NPS4Rp8HBfNTR6CsxNf4da92kTtbJKh9P+xEtreyd z3RDMqKDDBHGEl2Taml6J5Yhy89gsbigAZKPammqKh2aZM9+Tz46OmPt GHg=
;; Received 913 bytes from in 16 ms

org.                    172800  IN      NS
org.                    172800  IN      NS
org.                    172800  IN      NS
org.                    172800  IN      NS
org.                    172800  IN      NS
org.                    172800  IN      NS
org.                    86400   IN      DS      21366 7 1 E6C1716CFB6BDC84E84CE1AB5510DAC69173B5B2
org.                    86400   IN      DS      21366 7 2 96EEB2FFD9B00CD4694E78278B5EFDAB0A80446567B69F634DA078F0 D90F01BA
org.                    86400   IN      RRSIG   DS 8 1 86400 20140709000000 20140701230000 8230 . F7mv0vQZqoDHVnIGnms53kiVT4nEwfxPv7ebMixzb20tI/FjH9nUrgvy PvrkzgXYV+HmO0Xzay/bsdLLhE2nMpFY7aXhbmzav9C126UGEAaheUDB 4MuTltNzmpu01biTVxyfqr6ZueE7QMWaKia/l29KdBab/3UgN7M3UL5e wr0=
;; Received 683 bytes from in 8 ms              86400   IN      NS              86400   IN      NS 86400 IN NSEC3 1 1 1 D399EAAB H9PARR669T6U8O1GSG9E1LMITK4DEM0T NS SOA RRSIG DNSKEY NSEC3PARAM 86400 IN RRSIG NSEC3 7 2 86400 20140723104131 20140702094131 21185 org. djVZn31Z2Fbpk8Wnj0QQ2HGfkZj/tU9UWhJIEViDbvPaKHfqHRVYnBLc 0n+s04e1uuZJpxmGOjIw6+aTJrxP/t4H525GtS5YLT/TeMQyK5Tq8dKN fUq0lpUqz0fhz2H8QhfHPpZDBCy1Udh29gPmAbXWb84yhEgra7jFObK5 VGA= 86400 IN NSEC3 1 1 1 D399EAAB VDBA62E7405UOAVCP3IU953TNPO52T45 A RRSIG 86400 IN RRSIG NSEC3 7 2 86400 20140722155512 20140701145512 21185 org. UAsAW27kIw8XUKxtz1nD4tsjF2uSf8ERmgZS02s4fb0ATIXbDY95Az+u 3Ai/iGDFjBxHJ1oJpEl8xat4IwHzx1/JEgVIheoOd6lklZbXFQRi7RAu E75yYUnnRyVCk1tqGBT3QpgPAM3U6dBkji0UZ/eAiL7CrTQk2Vzz3z8s cfY=
;; Received 594 bytes from in 156 ms              119749  IN      NS              119749  IN      NS
;; Received 117 bytes from in 142 ms

This shows that both the .org parent nameservers and the nameservers Microsoft have introduced both agree that the current nameservers for are located under Their IPs (, both belong to Microsoft. This means that Microsoft's “fix” must involve changes to their nameservers and have not returned nameserver control to NO-IP.

Let's lookup a DDNS host under the Microsoft nameserver, querying it directly (replaced by asterisks for paranoia):

$ dig -tA *****

; <<>> DiG 9.9.2-P1 <<>> -tA *****
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 20400
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

; EDNS: version: 0, flags:; udp: 4000
;*****               IN      A

*****        60      IN      A

;; Query time: 217 msec
;; WHEN: Wed Jul  2 11:49:05 2014
;; MSG SIZE  rcvd: 60

It works, that's weird. This indicates that the Microsoft nameserver is successfully forwarding requests to the underlying nameservers. Now let's make the same request to Google's public DNS server:

$ dig -tA ***** @

; <<>> DiG 9.9.2-P1 <<>> -tA ***** @
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 60346
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

; EDNS: version: 0, flags:; udp: 512
;*****               IN      A

;; Query time: 297 msec
;; WHEN: Wed Jul  2 11:52:09 2014
;; MSG SIZE  rcvd: 44

This fails with an error suggesting that the nameserver Google contacted failed to answer. Checking the nameserver entries show that the data is up-to-date, which means that Microsoft's nameservers are not replying to Google correctly.

My first thought was that this was some weird geographical issue, or maybe Microsoft had mistakenly identified Google's DNS requests as malicious traffic. Instead the answer is much more simple, and indicative of why Microsoft is simply incompetent.

There are two types of DNS server: recursive and non-recursive. Recursive DNS servers are typically contacted by end-users/clients, and perform the sometimes complicated set of lookups required to resolve a domain name. They almost always perform caching, in order to reduce load on non-recursive nameservers. Non-recursive name-servers serve only local data, and don't perform DNS lookups themselves. Organisations wishing to make DNS entries visible set up non-recursive DNS servers so that they can be contacted by recursive ones.

Within each DNS request, there is a “recursion desired” flag (RD). This tells the DNS server whether or not it should recursively perform the query. A recursive DNS server will perform requests with this bit unset since it's performing the recursion itself and the target DNS server ideally shouldn't support it.

Unfortunately Microsoft's DNS servers fail to deliver a useful reply if this bit is unset. Consequently, djbdns, Google's nameservers and many other recursive DNS servers will never get a useful response from the nameservers Microsoft have inserted.

Let's perform the same lookup directly to Microsoft's nameservers again, but this time disable recursion.

$ dig -tA +norecurse *****

; <<>> DiG 9.9.2-P1 <<>> -tA +norecurse *****
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 9751
;; flags: qr ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 6, ADDITIONAL: 7

; EDNS: version: 0, flags:; udp: 4000
;*****               IN      A

org.                    117492  IN      NS
org.                    117492  IN      NS
org.                    117492  IN      NS
org.                    117492  IN      NS
org.                    117492  IN      NS
org.                    117492  IN      NS

;; ADDITIONAL SECTION: 117492 IN      A 117492 IN      A 117492 IN      A 117492  IN      A 117492  IN      A 117492  IN      A

;; Query time: 149 msec
;; WHEN: Wed Jul  2 12:15:05 2014
;; MSG SIZE  rcvd: 278

We do get a response, but it contains no useful information. This demonstrates that Microsoft really has no idea what they're doing.

ICFP Competition 2013 Writeup


Having entered the International Conference on Functional Programming Competition for the last six years under the team name “Hacking in the Rain”, I thought it was about time I wrote about what I did.

This year's competition could effectively be summarised by the phrase “find the function” where the function mapped 64-bit integer values to other 64-bit integer values. In addition to the usual bitwise operators OR, AND, NOT, XOR, there was also addition, fixed offset shifts, conditionals and folds over all the bytes in a value with an arbitrary binary function.

As always, I chose to implement in C++, not because I dislike functional languages, but because I'm somewhat more proficient in C++ than in Haskell, and I prefer the control I get when needing to solve efficiency-critical problems with limited computing resources.

Overall Strategy

The most obvious approach here would have been to try to enumerate all programs of increasing size until finding one that matched all the given input-output pairs, combined with some sort of pruning. I chose not to do this because I was afraid that constructing and evaluating every problem via this technique would be too slow, and never scale to the larger problem sizes. I didn't attempt to handle the “if0” or “fold” operators in my initial solver.

My algorithm proceeded in two phases. In the first I computed metadata about constructable expressions and their sizes:

  1. Choose an input-output pair provided by the server.

  2. Build a set containing only the values of expressions of size one. Namely, 1, 0 and the input value.

  3. For increasing size, build sets containing all possible values of expressions of that size (excluding “if0” and “fold”). This only involves applying operators to elements of previously constructed sets and doesn't require any expression tree construction or evaluation.

  4. Terminate when one of the constructed sets contains the output value we're looking for.

As this point, I know the size of an expression tree that computes the correct output for the input value I chose. However, I do not know what the expression actually is.

The second phase builds the expression (or multiple expressions) that compute the output value from the input in a top-down manner. Given the desired output value and the size of the expression tree that computes it, I now find the tree itself:

  1. For every possible operator, determine if the value we want can be computed using the values we know we can compute from expressions with smaller sizes. We iterate over the sets we built in the previous phase to determine this. If yes, we know that this operator can form the topmost node in our tree.

  2. We now need to find expression trees of for any operands used by the parent operator. We repeat step one for all subexpressions required by the parent. Again, we already know their size and value.

  3. We terminate once we reach expressions of size 1.

Since there could be multiple expressions of the same size that evaluate to the same value (and I wanted to keep all of them) it was necessary to apply constructors across sets of values. I really missed being able to work in Haskell's list monad here which naturally handles Cartesian product-like constructions needed for the binary operator case.

Armed with sets of expressions, it was a simple matter to evaluate them on all other input-output pairs I had requested from the server to validate. If no expressions turned out to be correct, I used the remaining input-output pairs to generate more expressions.

This approach has the advantage that it only builds expression trees which are known to produce a correct answer for at least one input-output pair. In the event that no expressions were found that matched all input-output pairs, I added the option to deliberately construct oversized expressions. This was primarily used later for inferring conditionals for ifs.

For many cases, this strategy allowed me to find correct expressions in less than a second, especially with operator restrictions.

Handling “if0”

Although “if0” could appear at arbitrary points in the expression tree, I only looked at generating expressions with “if0” at the top. As ifs (excluding those within folds) can always be lifted, this seemed practical.

Though I could often find plausible expressions for sets of input-output pairs, none of these involved “if0” since trees were constructed for single input-output pair for which conditionals would always be constant.

To solve this, I saved all programs that correctly evaluated a subset of the input-output pairs. Once I had at least one valid program for every input-output pair, I approximated a minimal covering using a greedy algorithm to map each input-output pair to a correct program. I then used my existing code to infer the conditionals required to choose the correct program for each input-output pair. Depending on the number of programs used by the covering, this could involve a tree of nested “if0”s.

This approach did work, however, it has a couple of subtle issues that were difficult to debug:

  1. Any input-output pairs that were correctly computed by more than one program used by the covering needed to be removed. Without doing this, it was possible for ambiguous pairs to be assigned the wrong program, and the if-condition would become impossible to derive.

  2. The order of if-construction was important. As an example, assuming we were attempting to find this program:

    (lambda (id0) (if0 id0 1 id0))

    The following program (using _ as a placeholder) covers both sub-expressions, but it is extremely difficult to generate the condition without using another if:

    (lambda (id0) (if0 _ id0 1))

    This is a consequence of only having a bitwise NOT, and not a boolean NOT operator. To handle this I attempt to construct if conditions in all possible orders.


My final score was 701, which puts me at precisely 100th place. The unoffical scoreboard gives a nice graphical representation of the problems I managed to solve.

Final Thoughts

Although I never got around to handling folds, the strategy I implemented worked quite well for a number of problems. Far more than in other years, the lack of algebraic data types in C++ was painful to work around. I attribute this to both the number of possible node types in the tree, and the number of operations that needed to be applied to them. Of course, this is a good thing for a contest intended to advocate functional programming.

I have mixed feelings over the decision to have the solver run on the client side, rather than by the competition organisers after the competition. I spent a lot of time trying to improve my solver before letting it loose, and only did when there were around 8 hours left before the competition ended. I suspect the scoreboard might look somewhat different if myself and others had had the time to apply their program to all the given problems.

Updated 18/08/2013: Revised main algorithm description after feedback on reddit.

Another reason to ditch MythTV

MythTV is very pretty and featureful, but has always struck me as a project that cares more about making it further so than trying to create a robust, bug-free core. Today I came across the sort of bug that exemplifies this.

I decided to record the film “Unknown”, starring Liam Neeson as a man who wakes from a coma to discover his life has been stolen. Here we have the film in the programme listing view:

MythTV programme listing

Coincidentally, the MythTV front-end also lists time periods where it does not have any listings data as “Unknown”. The MythTV frontend rightly prevents the user from attempting to record these blocks.

MythTV no data

Why is this an issue? It turns out that the MythTV front-end prevents you from recording any entry entitled “Unknown”, making it impossible to schedule recording of the aforementioned film. Clicking on the entry does nothing. Bringing up the menu does give you the option to record, but fails to schedule it.

MythTV record attempt

The only way I found to schedule the recording was to do it from the HTML interface. Bad news for non-computer savvy Liam Neeson fans.

Just for the record, the front-end version was a 13/01/2013 build of the 0.26-fixes branch.

Updated 14/01/2013: Someone else apparently noticed around the same time.

The Amazon Kindle: wonderful device, shame about the books

The only time I tend to have to read is when I'm travelling to and from work. Unfortunately, it's rather inconvenient for me to have to remove and replace the book from and to my rucksack. This sometimes also leads to me damaging the book in question. Hence, my decision to purchase an Amazon Kindle.

The Kindle is certainly a nice device. It's reasonably light, and despite having a resolution of only 600x800, the properties of electronic ink and a pixel density of 167 ppi result in an impressive display. The user-interface can be a little clunky, but this is only an issue when you're doing something with the Kindle other than flipping the pages of a book.

For my first read, I decided I wanted a complete collection of all Sherlock Holmes novels and short stories written by Sir Arthur Conan Doyle. Surely, not an unreasonable request since "The Adventures of Sherlock Holmes" (the first of four collections of stories) is prominently featured by Amazon as one of the free classics available for the Kindle.

My first choice was to download them from Project Gutenberg and avoid any DRM-related issues. Alas, none of these had any images (many stories were originally illustrated by Sidney Paget when first published in The Strand magazine). Hence, I decided to purchase them from Amazon. I expected that it would be trivial to find a complete collection including original illustrations, ideally typeset for the Kindle, and not subject to typographical errors. I couldn't have been more wrong.

Finding such a book in paper form isn't an issue. However, searching for an equivalent ebook led me to reviews complaining about missing images, missing contents pages, missing lines, poorly typeset conversations, poorly scanned images and adverts for other books embedded within the text1. Note that most of these editions were non-free and it would be necessary to buy the ebook before spotting the issues, and again to get a revised version (if one was produced in the future).

The first problem is that searching Amazon for Sherlock Holmes ebooks reveals many, many, results and it's extremely difficult to distinguish between them. Amazon's site aggregates the reviews of all editions of a book together, regardless of publisher. As a result, it's rather difficult to find reviews for a specific Kindle edition of a book. Multiple Amazon customers seemed confused by which ebook edition was being described by which review. Unfortunately, if the quality of ebooks varies massively across editions, finding reviews for a specific edition becomes altogether more important.

The next problem is the lack of trustworthy publishers. I had hoped that I might be able to make a guess about the quality of the ebook based on its publisher. Publishers of physical books tend to have well defined reputations but typing many of the ebook publishers into a search engine revealed no other sign of their existence. Other books were simply listed as public domain or had no publisher at all. One publisher's website had vanished entirely. Of almost all the publishers I could find, there was no indication that they'd existed for more than a year or two.

This is perhaps, hardly surprising. In the UK, all the Sherlock Holmes novels and short stories have passed into the public domain. It doesn't take much work to imagine a business model to become an ebook publisher selling only classics.

  • Download out of copyright works from Project Gutenberg or other public domain resources.
  • Edit away undesired text and modify typography.
  • Create persuasive description and cover page, then sell on Amazon.

The best part is that creating an ebook has no physical printing cost. Of course, I have no way to know whether this is happening. What I do know is that the reviews suggest some extremely shoddy publishers. Given the difficulty of checking for issues before purchase and no way to return or get refunded for an ebook, it's probably rather easy to make a pretty penny.

However, there is spark of hope. An illustrated edition by Seashell Press seems to have had an unparalleled amount of effort placed into its creation and received excellent reviews. It's probably also worth noting that this is the only publisher I've seen that also appears to produce physical books.

Alas, it looks like this collection was originally complete, but then was revised to not contain "The Case-book of Sherlock Holmes" as it was still under copyright in the US. This change was also applied to the version available in the UK as well. I've since emailed the publishers to see if it is possible to get the original version distributed in the UK, where copyright of that particular collection is not an issue.

Amazon's use of DRM, remote content removal and the choice not to support EPUB indicate how much they wish to retain control of Kindle ebook distribution. As such, the quality of ebooks available from the Kindle store directly affects the utility of the Kindle, and it's sad to see issues like these that I never anticipated.

These are simply first impressions, and I have no idea if the issues I've described affect other books. If they do, they probably only exist for for older, out of copyright works. However, one of Amazon's main selling points for the Kindle seems to be that it provides an acceptable experience for reading classic literature. Just look at the Kindle's display every time it's switched off.

  1. For example, this free edition of The Adventures of Sherlock Holmes, currently at #3 in the Free Kindle store, contains character encoding errors. One such error is in the story "Adventure II. The Red-Headed League" which contains the text "there is now another vacancy open which entitles a member of the League to a salary of �4 a week for purely nominal services". The three strange characters should instead be a pound sign.

    In this non-free edition of "The Complete Sherlock Holmes Collection", I found that the links in the table of contents for the stories "The Adventure of a Case of Identity" and "The Adventure of the Read-Headed League" pointed to each other's stories instead of their own. It seems free of more major issues though.