Shreyas Minocha

Writing a fact generator based on xkcd 1930

I really love xkcd. Randall has an amazing sense of humour, even if a tad bit geeky.

Xkcd 1930

Xkcd 1930 — Calendar Facts

The moment I saw this comic, it struck me: “I’ve to write a program for this!” I decided to make this a command line utility.

Programming languages are like tools in a mechanic’s tool box. Each has its own perks and each is suited to solving a different kind of problem.

Initially I decided to go with shell for this one.

Clearly, the sentences that can be formed by following the flowchart have the format “Did you know that [event] [unusual way] because of [reason]? Apparently [wild card statement]”. So we make four arrays with strings as. For example:

"the (fall|spring) equinox",
"the (winter|summer) (solstice|Olympics)",
"the (earliest|latest) (sunrise|sunset)",
"daylight (saving|savings) time",
"leap (day|year)",
"Easter",
"the (harvest|super|blood) moon",
"Toyota truck month",
"shark week"

Each of the sub-choices would have the form (choice|choice|choice). There are also some complex cases in the comic such as drifts out of sync with the (sun|moon|zodiac|(Gregorian|Mayan|lunar|iPhone) calendar|atomic clock in Colorado). We will have to deal with nested parentheses as well. I was pretty stuck up on using this approach because it avoids the redundancy of hard coding each of the sub-choices as follows

"the fall equinox",
"the spring equinox",
"the winter solstice",
"the winter Olympics",
"the summer solstice",
"the summer Olympics",
// ...

…and because it is intuitive(especially if you’re familiar with regular expressions, aka ‘regex’).

Before working out a regular expression to extract the choices from the parentheses, I did some research as to how I would be able to use the regex with shell script. I found a similar stackoverflow question(which didn’t even raise the nested brackets issue) but the answers and comments either gave complicated ways to solve the problem or suggested the use of another language. So I switched to javascript(nodejs).

Initially, I made an attempt to write a regular expression to match the outermost parentheses. For example, in drifts out of sync with the (sun|moon|zodiac|(Gregorian|Mayan|lunar|iPhone) calendar|atomic clock in Colorado), I would attempt to match (sun|moon|zodiac|(Gregorian|Mayan|lunar|iPhone) calendar|atomic clock in Colorado). This proved to be challenging. I couldn’t get my regular expression to ensure that inner parentheses are matched. For example, I could get it to match only (sun|moon|zodiac|(Gregorian|Mayan|lunar|iPhone) in our example. I was close to giving up on the regex idea altogether when a different approach struck me while playing around on regexr. Rather than attempting to match the parentheses outside-in, I would match it inside-out. Matching the innermost choice first, replacing the entire parentheses with a random choice within it.

So now we need to figure out how to extract the choices between the parentheses, split them by the ‘|’ delimiter, choose one of the strings from the resulting array and replace the matched portion with the chosen string. All this while ensuring that it works with nested parentheses. I opened a regex playground and started experimenting. Eventually, I came up with \(([\w\-|\'\ \d]+?)\). I highly recommend that you play around with this regular expression in this demo, especially if you aren’t familiar with the regex syntax. Pay attention to the capture groups there, we’ll use them soon.

We now have to write simplify(string) which would accept drifts out of sync with the (sun|moon|zodiac|(Gregorian|Mayan|lunar|iPhone) calendar|atomic clock in Colorado), for instance, and would return drifts out of sync with the sun.

/**
 * Simplifies a string resolving brackets inside-out
 * @param  {String} string string to be simplified
 * @return {String}        simplified string
 */
function simplify(string) {}

Have a look at the documentation for javascript’s string.replace. Go through the entire page but pay special attention to the ‘Specifying a function as a parameter’ section. Optionally, check out string.match too. We will know resolve each choice till the regex no longer matches the passed string using string.replace(regex, func). Let’s first try writing the replacement function(the one that we will pass as a parameter to string.replace). We assume that we have a randomElement(array) function already defined.

function replacer(match, choices) {
    return randomElement(choices.split("|"));
}

The replace function will pass the match and choices parameters(you must know what these are if you went through the string.replace documentation I referred to earlier). We split the choices capture group, by the | delimiter, into an array and return a random element of the same.

Here’s the final simplify function. We convert the replacer function we wrote earlier into an ES6 style arrow function.

function simplify(string) {
    const remainingChoices = /\(([a-zA-Z\d'\| \-]+?)\)/g;

    while (string.match(remainingChoices)) {
        string = string.replace(remainingChoices, (match, choices) =>
            randomElement(choices.split("|"))
        );
    }

    return string;
}

If you aren’t familiar with the const keyword, have a look at this article about ES6 for an overview of some of the most important ES6 features.

With this function done, the rest of the program is trivial. Feel free to have a shot at it. Earlier in this article, I had mentioned writing this as a command line utility. The details of writing a command line utility in javascript are outside the scope of this article. You can poke around with the entire source code for this program on Github.

Shreyas' Avatar

Shreyas Minocha