Apart from a brief visit many years ago to Macclesfield church with my friend Jo, I have never taken part in the ancient English art of bell-ringing, but permutation generation has always seemed a fascinating topic.

In TAOCP 7.2.1.2, Knuth has Algorithm P (“plain changes”) for generating all permutations using adjacent interchanges:

void taocp1(int *a, int n) {
    int c[n+1], o[n+1];
    int j, s, q;                       /* declared up front: C doesn't allow a label directly before a declaration */
    for (j = 1; j <= n; j++) { c[j] = 0; o[j] = 1; }
P2: visit(a, n);
P3: j = n; s = 0;
P4: q = c[j] + o[j];
    if (q < 0) goto P7;
    if (q == j) goto P6;
P5: swap(a[j-c[j]+s], a[j-q+s]);
    c[j] = q;
    goto P2;
P6: if (j == 1) return;
    s = s + 1;
P7: o[j] = -o[j];
    j = j - 1;
    goto P4;
}
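The snippets here assume visit and swap helpers, which Knuth leaves unspecified. Something like the following will do (my names and definitions; note that swap has to be a macro, since it assigns to its arguments):

```c
#include <stdio.h>

/* swap must be a macro: it assigns through its arguments */
#define swap(x, y) do { int t = (x); (x) = (y); (y) = t; } while (0)

/* visit: here, just print the current array */
void visit(int *a, int n) {
    for (int i = 0; i < n; i++)
        printf("%d%c", a[i], i + 1 < n ? ' ' : '\n');
}
```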

It works just fine, but I find it a little hard to follow, and, let's face it, we just don't write code like that anymore, with all those gotos, 1-based arrays, etc.

0-basing the arrays isn't too much of a problem; to eliminate the gotos we could take the functional approach:

void P2(int *a, int *c, int *o, int n);
void P3(int *a, int *c, int *o, int n);
void P4(int *a, int *c, int *o, int n, int j, int s);
void P5(int *a, int *c, int *o, int n, int j, int s, int q);
void P6(int *a, int *c, int *o, int n, int j, int s);
void P7(int *a, int *c, int *o, int n, int j, int s);

void taocp2(int *a, int n) {
    int c[n], o[n];
    for (int j = 0; j < n; j++) { c[j] = 0; o[j] = 1; }
    P2(a, c, o, n);
}

void P2(int *a, int *c, int *o, int n) {
    visit(a, n);
    P3(a, c, o, n);
}

void P3(int *a, int *c, int *o, int n) {
    P4(a, c, o, n, n-1, 0);
}

void P4(int *a, int *c, int *o, int n, int j, int s) {
    int q = c[j] + o[j];
    if (q < 0) P7(a, c, o, n, j, s);
    else if (q == j+1) P6(a, c, o, n, j, s);
    else P5(a, c, o, n, j, s, q);
}

void P5(int *a, int *c, int *o, int n, int j, int s, int q) {
    swap(a[j-c[j]+s], a[j-q+s]);
    c[j] = q;
    P2(a, c, o, n);
}

void P6(int *a, int *c, int *o, int n, int j, int s) {
    if (j != 0) P7(a, c, o, n, j, s+1);  /* 0-based: terminate at element 0 */
}

void P7(int *a, int *c, int *o, int n, int j, int s) {
    o[j] = -o[j];
    P4(a, c, o, n, j-1, s);
}

but somehow that seems to miss the point (I don't think that the worst thing about that code is that it still uses modifiable state).

There are actually two component parts to the algorithm: generating a “mixed radix reflected Gray code” which defines the progressive inversions of the array elements, and going from the inversion changes to the actual indexes of the swapped objects.

For generating the Gray codes, with a bit of rewriting, 0-basing our arrays etc. we get:

void gray(int n) {
    int j, c[n], o[n];
    for (j = 0; j < n; j++) { c[j] = 0; o[j] = 1; }
    do {
        visit(c, n);
        for (j = n-1; j > 0; j--) {
            int q = c[j] + o[j];
            if (q < 0) {
                o[j] = -o[j];   /* hit bottom: reverse direction */
            } else if (q == j+1) {
                o[j] = -o[j];   /* hit top: reverse direction */
            } else {
                c[j] = q;       /* move one step and stop */
                break;
            }
        }
    } while (j != 0);
}

Each time around, find the highest element whose inversion count can be changed in the desired direction. For elements whose inversion count can’t be changed, change direction. If no element can be changed we are done.
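For example, with n = 3 (c[0] is always 0) the visited count vectors are 000, 001, 002, 012, 011, 010: each count runs up, then back down, in a reflected Gray sequence with radices 1, 2, 3. A quick check, with visit replaced by a function that records each vector as a decimal number (the recording harness is mine):

```c
static int seq[8], count;

static void record(int *c, int n) {
    int v = 0;
    for (int j = 0; j < n; j++) v = v * 10 + c[j];
    seq[count++] = v;
}

/* The gray routine above, with visit replaced by record */
static void gray(int n) {
    int j, c[8], o[8];
    for (j = 0; j < n; j++) { c[j] = 0; o[j] = 1; }
    do {
        record(c, n);
        for (j = n-1; j > 0; j--) {
            int q = c[j] + o[j];
            if (q < 0 || q == j+1) o[j] = -o[j];  /* stuck: reverse */
            else { c[j] = q; break; }             /* move one step */
        }
    } while (j != 0);
}
```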

The next step is to calculate the position of element j. That position is just j, minus the number of elements smaller than j that appear after it (i.e. its inversion count, c[j]), plus the number of elements greater than j that appear before it – and that last quantity is just the number of elements we have passed over in the "if (q == j+1)" step above. So we can now add in the rest of Algorithm P:

void taocp3(int *a, int n) {
    int c[n], o[n];
    for (int j = 0; j < n; j++) { c[j] = 0; o[j] = 1; }
    int j;
    do {
        visit(a, n);
        int s = 0;
        for (j = n-1; j > 0; j--) {
            int q = c[j] + o[j];
            if (q < 0) {
                o[j] = -o[j];
            } else if (q == j+1) {
                o[j] = -o[j];
                s++;            /* This element will be before element j */
            } else {
                swap(a[j-c[j]+s], a[j-q+s]);
                c[j] = q;
                break;
            }
        }
    } while (j != 0);
}
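Started from 1 2 3, this visits the familiar plain-changes order 123, 132, 312, 321, 231, 213. A check, again with visit swapped for a recording function (harness mine):

```c
#define swap(x, y) do { int t = (x); (x) = (y); (y) = t; } while (0)

static int seq[8], count;

static void record(int *a, int n) {
    int v = 0;
    for (int i = 0; i < n; i++) v = v * 10 + a[i];
    seq[count++] = v;
}

/* taocp3 as above, with visit replaced by record */
static void taocp3(int *a, int n) {
    int c[8], o[8];
    for (int j = 0; j < n; j++) { c[j] = 0; o[j] = 1; }
    int j;
    do {
        record(a, n);
        int s = 0;
        for (j = n-1; j > 0; j--) {
            int q = c[j] + o[j];
            if (q < 0) o[j] = -o[j];
            else if (q == j+1) { o[j] = -o[j]; s++; }
            else { swap(a[j-c[j]+s], a[j-q+s]); c[j] = q; break; }
        }
    } while (j != 0);
}
```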

Knuth’s algorithm is essentially Dijkstra’s:

http://www.cs.utexas.edu/users/EWD/ewd05xx/EWD553.PDF

which explains it very lucidly: there is a 1-1 correspondence between inversion tables and permutations, and a Gray enumeration of the inversion counts gives us a sequence of permutations where only one element at a time changes its inversion count, and only by +1 or -1 – which can only happen if we exchange adjacent elements.

Knuth also gives a “loopless” algorithm for generating reflected Gray sequences, and we could use a table of element positions to construct the permutations from this:

void gray2(int *m, int n) {
    int a[n], f[n+1], o[n];
    for (int j = 0; j < n; j++) { a[j] = 0; f[j] = j; o[j] = 1; }
    f[n] = n;
    while (true) {
        visit(a, n);
        int j = f[0];
        f[0] = 0;
        if (j == n) break;
        a[j] += o[j];
        if (a[j] == 0 || a[j] == m[j]-1) {
            o[j] = -o[j];
            f[j] = f[j+1];
            f[j+1] = j+1;
        }
    }
}
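With every radix m[j] equal to 2 this produces the standard binary reflected Gray code, digit a[0] changing fastest; for plain changes the radices would be 1, 2, …, n. A check, with visit replaced by a function recording each vector as an integer with a[0] as the low bit (harness mine):

```c
static int seq[8], count;

static void record(int *a, int n) {
    int v = 0;
    for (int j = 0; j < n; j++) v |= a[j] << j;  /* a[0] is the low bit */
    seq[count++] = v;
}

/* gray2 as above, with visit replaced by record */
static void gray2(int *m, int n) {
    int a[8], f[9], o[8];
    for (int j = 0; j < n; j++) { a[j] = 0; f[j] = j; o[j] = 1; }
    f[n] = n;
    for (;;) {
        record(a, n);
        int j = f[0];          /* loopless: the focus pointer names the digit to change */
        f[0] = 0;
        if (j == n) break;
        a[j] += o[j];
        if (a[j] == 0 || a[j] == m[j]-1) {  /* digit exhausted: reverse and pass the focus on */
            o[j] = -o[j];
            f[j] = f[j+1];
            f[j+1] = j+1;
        }
    }
}
```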

The “classic” Johnson-Trotter algorithm for plain changes involves consideration of “mobile” elements: each element has a current direction, and it is “mobile” if it is greater than the next adjacent element (if there is one) in that direction. The algorithm proceeds by finding the highest mobile element and moving it accordingly:

void jt(int *a, int n) {
    int o[n];  // Direction of element j, 1 = left, 0 = right
    int c[n];  // Position of element j (a must start as the identity permutation)
    for (int j = 0; j < n; j++) { o[j] = 1; c[j] = j; }
    int j;     // Will be set to the highest mobile element
    do {
        visit(a, n);
        // Search high to low. 0 is never mobile
        for (j = n-1; j > 0; j--) {
            int p = c[j];  // Position of element j
            if (o[j]) {    // Going left (check the bound before reading a[p-1])
                if (p > 0 && a[p-1] < j) {
                    // Swap and adjust positions
                    int k = a[p-1];
                    a[p] = k; c[k] = p;
                    a[p-1] = j; c[j] = p-1;
                    break;
                }
            } else {       // Going right
                if (p < n-1 && a[p+1] < j) {
                    int k = a[p+1];
                    a[p] = k; c[k] = p;
                    a[p+1] = j; c[j] = p+1;
                    break;
                }
            }
            o[j] = !o[j];  // Not mobile: change direction & continue
        }
    } while (j > 0);
}

Note that we can check elements from high to low directly & stop as soon as a mobile element is found – there is no need to check every element.

Johnson-Trotter and Algorithm P are essentially doing the same thing: when moving the mobile element, we are changing its inversion count in the same way as we do directly in P.
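We can check that claim directly: run both versions for n = 4 from the identity permutation, with visit replaced by recording, and compare the two sequences of 24 permutations (the harness is mine; the generator bodies follow taocp3 and jt above):

```c
#define swap(x, y) do { int t = (x); (x) = (y); (y) = t; } while (0)

static int seqP[24], nP;   /* sequence from the Algorithm P version */
static int seqJ[24], nJ;   /* sequence from the Johnson-Trotter version */

static int encode(int *a, int n) {
    int v = 0;
    for (int i = 0; i < n; i++) v = v * 10 + a[i];
    return v;
}

static void taocp3r(int *a, int n) {
    int c[8], o[8];
    for (int j = 0; j < n; j++) { c[j] = 0; o[j] = 1; }
    int j;
    do {
        seqP[nP++] = encode(a, n);
        int s = 0;
        for (j = n-1; j > 0; j--) {
            int q = c[j] + o[j];
            if (q < 0) o[j] = -o[j];
            else if (q == j+1) { o[j] = -o[j]; s++; }
            else { swap(a[j-c[j]+s], a[j-q+s]); c[j] = q; break; }
        }
    } while (j != 0);
}

static void jtr(int *a, int n) {
    int o[8], c[8];
    for (int j = 0; j < n; j++) { o[j] = 1; c[j] = j; }
    int j;
    do {
        seqJ[nJ++] = encode(a, n);
        for (j = n-1; j > 0; j--) {
            int p = c[j];
            if (o[j]) {                       /* going left */
                if (p > 0 && a[p-1] < j) {
                    int k = a[p-1];
                    a[p] = k; c[k] = p;
                    a[p-1] = j; c[j] = p-1;
                    break;
                }
            } else {                          /* going right */
                if (p < n-1 && a[p+1] < j) {
                    int k = a[p+1];
                    a[p] = k; c[k] = p;
                    a[p+1] = j; c[j] = p+1;
                    break;
                }
            }
            o[j] = !o[j];
        }
    } while (j > 0);
}
```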

Performance-wise, the two algorithms are very similar, though (somewhat surprisingly to me) the Johnson-Trotter version seems a little faster.

Either way, my laptop generates the 3628800 permutations of 10 elements in 0.03 seconds; outputting them (even to /dev/null) takes 4.5 seconds. It takes about 3 seconds to run through the 479001600 permutations of 12 elements (I haven’t tried outputting them). We can streamline the computation further by taking advantage of the fact that most of the time we are just moving the highest element in the same direction, but that can wait for another day.

Finally, here is a function that gives the next permutation in plain changes without maintaining any state:

bool nextperm(int *a, int n) {
    int c[n], o[n];
    for (int i = 0; i < n; i++) c[i] = o[i] = 0;
    // Count inversions
    for (int i = 0; i < n; i++) {
        for (int j = i+1; j < n; j++) {
            if (a[j] < a[i]) c[a[i]]++;
        }
    }
    // Find directions: o[j] is the parity of c[1] + ... + c[j-1]
    for (int i = 1, k = 0; i < n-1; i++) {
        k += c[i];
        o[i+1] = k % 2;
    }
    // Find mobile element and move it.
    for (int j = n-1, s = 0; j > 0; j--) {
        if (!o[j]) {
            if (c[j] < j) {
                swap(a[j-c[j]+s], a[j-c[j]+s-1]);
                return true;
            }
            s++;
        } else {
            if (c[j] > 0) {
                swap(a[j-c[j]+s], a[j-c[j]+s+1]);
                return true;
            }
        }
    }
    return false;
}

void stateless(int *a, int n) {
    do {
        visit(a, n);
    } while (nextperm(a, n));
}

We use the Knuth/Dijkstra trick to avoid building a position table. I suspect that the O(n²) inversion counting can be improved, but we still come in at under a second for all permutations of 10, though that’s about 30 times slower than the optimized version.
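As a quick sanity check (harness mine): starting from 0 1 2, repeated calls to nextperm should walk the same plain-changes sequence as before – 021, 201, 210, 120, 102 – and then report that the permutations are exhausted:

```c
#include <stdbool.h>

#define swap(x, y) do { int t = (x); (x) = (y); (y) = t; } while (0)

/* nextperm as above: recompute inversion counts and directions from
   scratch each call, then find and move the mobile element */
bool nextperm(int *a, int n) {
    int c[8], o[8];
    for (int i = 0; i < n; i++) c[i] = o[i] = 0;
    for (int i = 0; i < n; i++)
        for (int j = i+1; j < n; j++)
            if (a[j] < a[i]) c[a[i]]++;
    for (int i = 1, k = 0; i < n-1; i++) {
        k += c[i];
        o[i+1] = k % 2;
    }
    for (int j = n-1, s = 0; j > 0; j--) {
        if (!o[j]) {
            if (c[j] < j) { swap(a[j-c[j]+s], a[j-c[j]+s-1]); return true; }
            s++;
        } else {
            if (c[j] > 0) { swap(a[j-c[j]+s], a[j-c[j]+s+1]); return true; }
        }
    }
    return false;
}
```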