Three Beautiful Quicksorts

Have you heard about a book called Beautiful Code? This book is composed of 33 essays by well-known computer scientists and software engineers on the same question - "What is the most beautiful piece of code you know?"

Chapter 3 of this book is written by Jon Bentley. The chapter is curiously titled "The Most Beautiful Code I Never Wrote" and it talks about a beautiful implementation and experimental analysis of the most widely used sorting algorithm in the world called Quicksort.

Jon also decided to give a talk at Google on the same topic. In this techtalk, he talks about three Quicksort topics. In the first part Jon discusses an elegant and tiny implementation of Quicksort. In the next part he analyzes the Quick sort algorithm by instrumenting this elegant implementation. Finally, he talks about a production quality Quicksort implementation which has been included in C standard library (qsort function).

What I learned the most from this lecture is the experimental part of counting number of comparsions of an algorithm. I had never read about experimental analysis of algorithms.

Direct link: http://video.google.com/videoplay?docid=-1031789501179533828

Interesting ideas from the lecture:

[02:30] What's the most beautiful code you've ever written?
[04:58] The classical Quicksort algorithm.
[05:32] Best case (nlog(n)), worst (n²) and average case (nlog(n)) analysis results of Quicksort.
[08:00] Randomization in Quicksort.
[09:08] Lomuto's idea of partitioning.
[10:48] Partitioning algorithm implementation pseudocode.
[11:10] Beautiful Quicksort code.
[12:13] Strengths of Quicksort: simple to implement and explain, pretty fast. Weaknesses of Quicksort: quadratic time for increasing elements (fix by randomization), quadratic time for equal elements, not stable.
[13:55] "vigorous writing is concise" /William Strunk Jr.'s/
[14:49] Simplest semi-production C Quicksort implementation.
[15:35] How much time will Quicksort take on average on indistinct elements?
[17:00] Graph of (runtime/N, logN) for Quicksort and Heapsort.
[18:31-32:00] Finding number of comparisons for Quicksort experimentally.
[23:08] Doing various tricks the time to count comparisons reduced from nlog(n) to n and space reduced from n to log(n).
[24:25] The Inventor's Paradox: "The more ambitious plan may have more chance of success."
[30:13] Perlis: "Simplicity does not precede complexity, but follows it."
[30:45] Mathematical solution to Quicksort's comparison complexity.
[31:35] The solution is ~ 1.386nlog(n)
[32:10] Analysis of comparisons in Quicksort is isomorphic to time taken to insert an element in a Binary Search Tree.
[36:25] A beautiful bug report on the original implementation of C library's qsort
[39:31] Engineering qsort together with Doug McIlroy. Goals and strategy.
[41:23] With equal elements, some Quicksorts go quadratic, some go nlog(n), some go linear.
[41:46] Is there a quick algorithm for ternary (<, = , >) partitioning?
[46:00] Knuth said that the overheard and comparisons took the same amount of time but the swaps really killed you, where as Jon and Doug found that time(overhead) < time(swaps) < time(comparisons), if swaps are done properly.
[46:30] Choosing effective partitioning element was important. First, middle or random gave runtime aproximately 1.386nlog(n), where as median of three or median and three medians gave run times of 1.188nlog(n) and 1.09nlog(n), respectivey.
[46:50] The final qsort had ternary partitioning, fast swap, adaptive sample for partition element, insertion sort for small files.
[47:40] The complete code of C library's qsort.
[48:15] Beauties of qsort (beautiful algorithm, beautiful interface, beautiful bug report, beautiful cost models, beautiful tests)
[50:40] Three beautiful Quicksorts - 1) a simple teaching tool, 2) an anaysis of Quicksort, 3) An industrial-strength Quicksort.
[51:35] Q and A.

Here is a working C program which runs Quicksort algorithm (in bold) from the Chapter 3 of the book. The algorithm sorts the global array of integers x.

#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;time.h&gt;

#define S(a) (sizeof(a)/sizeof(a[0]))

int x[] = { 56, 10, 39, 15, 20, 62, 14, 39, 77, 17 };

void swap(int i, int j) {
    int t = x[i];
    x[i] = x[j];
    x[j] = t;
}

int randint(int l, int u) {
    return rand()%(u-l+1)+l;
}

<strong>void quicksort(int l, int u) {
    int i, m;
    if (l &gt;= u) return;
    swap(l, randint(l, u));
    m = l;
    for (i = l+1; i &lt;= u; i++)
        if (x[i] &lt; x[l])
            swap(++m, i);
    swap(l, m);
    quicksort(l, m-1);
    quicksort(m+1, u);
}</strong>

int main() {
    int i;
    srand(time(NULL));
    quicksort(0, S(x)-1);
    for (i = 0; i &lt; S(x); i++) {
        printf(&#34;%d &#34;, x[i]);
    }
    printf(&#34;\n&#34;);
    return 0;
}

In a functional programming language, Quicksort can be written even more elegantly. Check out this Quicksort written in Haskell (thanks to bayareaguy.)

  quicksort (x:xs) = 
      quicksort [ i | i <- xs, i < x ] 
      ++ [x] ++
      quicksort [ i | i <- xs, i >= x ]

  quicksort [] = []

And what's the most beautiful code you've ever written?