The XKCD book page numbering explained (or skew binary explained)

We got the XKCD book “volume 0″ here at work yesterday, and I have of course skimmed through it many times already. I quickly found the solution to the page numbering scheme, but wanted to see if I could find a simple conversion formula from real number to XKCD page number and vice versa.

Introduction

In a regular (i.e. useful) number system, the digit values are powers of a given number starting with the right-most digit representing a digit value of the given base number to the power of 0, the second digit from the right representing a digit value of the given base number to the power of 1 etc. This has a lot of advantages beyond the topic of this post. Thus, in a base 10 number system, the right-most digit value is 1, the second from the right is 10, then 100 etc. There is a simple rule for how we go from one digit value to the next and for how to find an arbitrary digit value just by knowing the digit position.

But the simple point I’m making here is, that we could invent a number system, where the digits represent arbitrary values – e.g. the first digit from the right represent 1s, the second digit represent 4s, the third represent 178s, the fourth represent 179s etc. We can express any number in this (we could just extend the available numbers with letters or other symbols as we see fit) – in the example given here, the decimal number 2000 would be represented as 11*179 + 0*178 + 7*4 + 3*1 = B073. It is quite a useless set of digit values, and it is a very hard representational number system to do any math in.

For the XKCD book page numbering, Randall has chosen a number system, where the first digit from the right represent 1s, the second digit represent 3s, the third represent 7s, the fourth represent 15s and so on – thus there is actually a rule for finding the digit value for the n‘th digit (from the right) = 2i-1 – compare this to the binary system, in which the digit values can be calculated as 2(i-1).

UPDATE 2009-11-26: As it has been pointed out to me, this number system (which is skewed one off binary) is called skew binary. The only interesting property that I can find anyone reference is the fact, that increment only ever has one carry-over, which makes increment very easy to implement (and I could thus implement it in regexp as given further below). It is however only a “scientific” numbering system, and I can find no reference to any (other) usage in the wild. The article linked before is the first Google hit for “skew binary” and it is an 11 year old slide – which indicates further, that it is a rarely used system with only theoretical purposes.

Conversions

So, how to find the value, that a given XKCD page number represent? Simply multiply each digit by the digit value – if the i‘th digit (where i=1 represent the right-most digit) has the value xi the translation of an n digit XKCD page number can be done via this formula:

General conversion rule from XKCD page numbers to decimal numbers

General conversion rule from XKCD page numbers to decimal numbers

The reverse translation can’t be done like regular conversion between regular bases bulding the number from the right (using modulus and division), but has to be done from the left. Thus we need to find the maximum i for which 2i-1 is less than the given number – which we do via log2(number+1). Then we go from this maximum i (which denotes the number of digits n in the XKCD page number) to 1 and for each step find the i‘th digit by evenly dividing the target number with 2i-1 and storing the remainder as the new target number – here both implemented in JavaScript:

function dec2xkcd(d) {
  var i = Math.floor(Math.log(d+1)/Math.LN2);
  var r, x = "";
  for (; i; i--) {
     r = Math.pow(2,i)-1;
     x += Math.floor(d/r);
     d = d%r;
  }
  return x;
}
function xkcd2dec(x) {
  var d = 0, l = x.length, i = l;
  while(i--) {
    d += parseInt(x.charAt(i))*(Math.pow(2,l-i)-1);
  }
  return d;
}

We can verify these by running some tests converting back and forth:

function c(d) {
  var x = dec2xkcd(d);
  var d_ = xkcd2dec(x);
  return d+" => "+x+" => "+d_;
}
c(1); // 1 => 1 => 1
c(6); // 6 => 20 => 6
c(15); // 15 => 1000 => 15
c(16); // 16 => 1001 => 16
c(119); // 119 => 111110 => 119
c(120); // 120 => 111111 => 120
c(6542); // 6542 => 110011001010 => 6542
c(5469835); // 5469835 => 1010011011101101001020 => 5469835

Properties

The selection of digit values are not completely random, they do follow some system, but not the systems of regular power-based number systems. So the properties, that regular power-based number systems have are not present in this, but it does have other properties.

First of all, the notion 2^n-1 is known from Mersenne primes. What this tells us however, is nothing useful about the system except that some digit values are prime – like 3, 7, 31 and maybe, maybe not an infinite number of the infinite other digit values.

Another property is, that from any digit value dn, the next digit value can always be calculated as 2dn + 1. This means, that the largest number the can be represented by n digits is a left-most 2 followed by n-1 0s – the next number in sequence is a left-most 1 followed by n 0s. But again, this is quite trivial and not really that useful.

The repunit numbers (numbers consisting solely of 1s) in this representational system represent the numbers 1 (1), 4 (11), 11 (111), 26 (1111), 57 (11111) and this order is actually the second column of the Eulerian numbers (defined by sequence A000295 in OEIS) – but this is still quite useless.

Algebra

So, is this usable as a number system? Can we do math in it? Can we in a simple way multiply numbers? Let’s start by making an algorithm for adding 1 to (incrementing) an arbitrary number. Observe, that we cannot simply safely increment the last digit if it is a zero – because if the number has a 2 in it (which only occurs in at most one position in the number and with no trailing 1′s), increment is done by converting the 2 to a 1 and incrementing the digit to the left of the 2 – examples:

  • Increment 2 => 10
  • Increment 12 => 20
  • Increment 102 => 110
  • Increment 120 => 200
  • Increment 1012 => 1020
  • Increment 2000 => 10000

It is not trivial, but then again, not that hard. For numbers without a 2, simply increment the right-most digit (which has to be a 0 incremented to a 1 or a 1 incremented to a 2).

Then, how do we subtract 1 (decrement)? Here the rule is, if the number has trailing 0s (one or more right-most digits are 0), decrement the right-most non-zero digit and change the left-most of the trailing 0s (the number just to the right of the right-most non-zero) to a 2 and keep the remaining as 0. If the number does not end in a 0, simply decrement the last digit – examples:

  • Decrement 2 => 1
  • Decrement 10 => 2
  • Decrement 12 => 11
  • Decrement 112 => 111
  • Decrement 1010 => 1002
  • Decrement 1020 => 1012
  • Decrement 10000 => 2000

Still not trivial, but not impossible. UPDATE 2009-11-26: The fact that you at most have to alter two connected digits when decrementing or incrementing is what makes this number system interesting as this is a lot easier to implement that normal “base-based” number systems, where you worst-case have to alter all digits of the number when in- or decrementing – compare incrementing the decimal number 999999.

Comparison of numbers is done exactly as for numbers in any (regular) base – if one has more digits that the other, the “longest” number is higher. If not, compare digit by digit starting from the left, if a digit is higher in one number, than the same in the other, that number is larger, if equal, go to next digit. If all digits are the same, the numbers are equal.

All right, so far so good. Then we can add and subtract (but as we won’t consider negative numbers, subtraction cannot be negative). We add x to y by incrementing y and decrementing x at the same time until x equals 0 – the value of y is then the result. For subtracting x from y (i.e. y - x), we decrement both numbers until either is zero – the value of y is then the result (thus 1 – 2 is 0, as we don’t do negative number).

The next logical step is to consider multiplication (of positive numbers only of course). We multiply x by y by creating a result variable r and add x to it y times (decrement y each time until 0).

Implementation

Note: this gets very pointless, so feel free to skip to the conclusion – which is that all of this is just silly and pointless.

First, increment and decrement in JavaScript:

function inc(x) {
  if (!/2/.test(x)) {
    return x
      .replace(/1$/, "2")
      .replace(/0$/, "1");
  }
  return x
    .replace(/(0|^)2/, "10")
    .replace(/12/ , "20");
}
function dcr(x) {
  if (!/0$/.test(x)) {
    return x
      .replace(/1$/, "0")
      .replace(/2$/, "1");
  }
  return x
    .replace(/20(0*)$/, "12$1")
    .replace(/10(0*)$/, "02$1")
    .replace(/^0+/, "");
}
function id_test(d) {
  var x = dec2xkcd(d);
  var x2 = inc(x);
  var x0 = dcr(x)
  return x0+" ("+xkcd2dec(x0)+") => "+x+" ("+d+") => "+x2+" ("+xkcd2dec(x2)+")";
}
id_test(1);   // 0 (0) => 1 (1) => 2 (2)
id_test(2);   // 1 (1) => 2 (2) => 10 (3)
id_test(6);   // 12 (5) => 20 (6) => 100 (7)
id_test(7);   // 20 (6) => 100 (7) => 101 (8)
id_test(117); // 111100 (116) => 111101 (117) => 111102 (118)
id_test(123); // 111120 (122) => 111200 (123) => 112000 (124)
id_test(127); // 200000 (126) => 1000000 (127) => 1000001 (128)
id_test(128); // 1000000 (127) => 1000001 (128) => 1000002 (129)
id_test(603); // 100101120 (602) => 100101200 (603) => 100102000 (604)

Then comparison:

function cmp(x1,x2) {
  var l = x1.length, l2 = x2.length;
  if (l != l2) {
    return l > l2 ? 1 : -1;
  }
  var i, c1, c2;
  for (i=0; i<l; i++) {
    c1 = x1.charAt(i);
    c2 = x2.charAt(i);
    if (c1 > c2) {
       return 1;
    }
    if (c2 > c1) {
       return -1;
    } 
  }
  return 0;
}
function cmp_test(d1, d2) {
  var x1 = dec2xkcd(d1);
  var x2 = dec2xkcd(d2);
  var c = cmp(x1, x2);
  if (c > 0) {
    return x1+" ("+d1+") is greater than "+x2+" ("+d2+")";
  }
  if (c < 0) {
    return x2+" ("+d2+") is greater than "+x1+" ("+d1+")";
  }
  return x1+" ("+d1+") equals "+x2+" ("+d2+")";
}
cmp_test(1,1);     // 1 (1) equals 1 (1)
cmp_test(1,2);     // 2 (2) is greater than 1 (1)
cmp_test(2,1);     // 2 (2) is greater than 1 (1)
cmp_test(117,117); // 111101 (117) equals 111101 (117)
cmp_test(117,121); // 111112 (121) is greater than 111101 (117)
cmp_test(127,128); // 1000001 (128) is greater than 1000000 (127)
cmp_test(127,126); // 1000000 (127) is greater than 200000 (126)

Then addition:

function add(x1, x2) {
  while (cmp(x2, "0")) {
    x1 = inc(x1);
    x2 = dcr(x2);
  }
  return x1;
}
function add_test(d1, d2) {
  var x1 = dec2xkcd(d1);
  var x2 = dec2xkcd(d2);
  var xs = add(x1, x2);
  var ds = xkcd2dec(xs);
  return x1+" ("+d1+") + "+x2+" ("+d2+") = "+xs+" ("+ds+")";
}
add_test(1, 1);   // 1 (1) + 1 (1) = 2 (2)
add_test(1, 2);   // 1 (1) + 2 (2) = 10 (3)
add_test(2, 1);   // 2 (2) + 1 (1) = 10 (3)
add_test(7, 7);   // 100 (7) + 100 (7) = 200 (14)
add_test(8, 17);  // 101 (8) + 1002 (17) = 1110 (25)
add_test(53, 17); // 11100 (53) + 1002 (17) = 100100 (70)

Then subtraction:

function sub(x1, x2) {
  while (cmp(x1, "0") && cmp(x2, "0")) {
    x1 = dcr(x1);
    x2 = dcr(x2);
  }
  return x1;
}
function sub_test(d1, d2) {
  var x1 = dec2xkcd(d1);
  var x2 = dec2xkcd(d2);
  var xs = sub(x1, x2);
  var ds = xkcd2dec(xs);
  return x1+" ("+d1+") - "+x2+" ("+d2+") = "+xs+" ("+ds+")";
}
sub_test(1, 1);   // 1 (1) - 1 (1) = 0 (0)
sub_test(1, 2);   // 1 (1) - 2 (2) = 0 (0)
sub_test(2, 1);   // 2 (2) - 1 (1) = 1 (1)
sub_test(7, 7);   // 100 (7) - 100 (7) = 0 (0)
sub_test(17, 8);  // 1002 (17) - 101 (8) = 102 (9)
sub_test(53, 17); // 11100 (53) - 1002 (17) = 10012 (36)

Finally multiplication:

function mul(x1, x2) {
  if (!cmp(x2, "0")) return x2;
  // x2 is not zero, go
  var r = x1;
  while (cmp(x2, "1")) {
    r = add(r, x1);
    x2 = dcr(x2);
  }
  return r;
}
function mul_test(d1, d2) {
  var x1 = dec2xkcd(d1);
  var x2 = dec2xkcd(d2);
  var xs = mul(x1, x2);
  var ds = xkcd2dec(xs);
  return x1+" ("+d1+") * "+x2+" ("+d2+") = "+xs+" ("+ds+")";
}
mul_test(1, 1);   // 1 (1) * 1 (1) = 1 (1)
mul_test(1, 2);   // 1 (1) * 2 (2) = 2 (2)
mul_test(2, 1);   // 2 (2) * 1 (1) = 2 (2)
mul_test(7, 7);   // 100 (7) * 100 (7) = 11010 (49)
mul_test(17, 8);  // 1002 (17) * 101 (8) = 1000102 (136)
mul_test(53, 17); // 11100 (53) * 1002 (17) = 111000101 (901)

Conclusion

As expected, it is just a silly and very useless number system – but fun to play around with though. If you however find other interesting properties of this representational system, please feel free to comment.

However, it should be clear to anyone, that this has nothing to do with a ternary number system just because the only digits are 0, 1 and 2 – it is closer related to the binary number system.

UPDATE 2009-11-26: Albeit still useless, the number system actually has the name “skew binary” as mentioned in the introduction – and thus has been the target of some scientific research previously.

No related posts.

Category: General, JavaScript, Trends 6 comments »

6 Responses to “The XKCD book page numbering explained (or skew binary explained)”

  1. davean

    Actually, it is called Skew Binary and it does have some useful properties for computer science: http://www.cl.cam.ac.uk/teaching/2004/IntroFuncProg/lecture08.html

    I take it you incremented what page you were on a lot?

  2. Barklund

    I think I updated the article with a reference to skew binary just a few seconds before your comment ticked in. Yes, it is an existing number system with the (sole) interesting property, that increment and decrement is very simple to implement (that sounds almost poetic).

    But that does not IMO justify using it for anything other than slides for a CS lecture :)

  3. MM

    WTF? shit hvor er det sort…. :))

  4. Mike

    You might be interested in this clock applet I wrote.
    http://blog.garritys.org/2009/12/rosetta-clock.html
    It does skew binary as well as some others.

  5. The reviews are in, xkcd: volume 0 is a winner | the breadpig blog

    [...] here’s a bonus review from Barklund.org, who explained the unique page-numbering system. var disqus_url = [...]

  6. Witek

    It is useless? It is uber usefull! Using sparse skew binary numbering system, allows to implement random access list in functional language with O(1) head, O(1) tail retrivial, and O(log n) any element lookup. It is better both than balanced binary tree, and than normal list!


Leave a Reply



Back to top

     

Get Adobe Flash playerPlugin by wpburn.com wordpress themes