Love my blog? I'd be thankful for a gift from my geeky wishlist. Thanks!
This is the fourth post in an article series about MIT’s lecture course “Introduction to Algorithms.” In this post I will review lecture six, which is on the topic of Order Statistics.
The problem of order statistics can be described as following. Given a set of N elements, find k-th smallest element in it. For example, the minimum of a set is the first order statistic (k = 1) and the maximum is the N-th order statistic (k = N).
Lecture six addresses the problem of selecting the k-th order statistic from a set of N distinct numbers.
The lecture begins with discussing the naive algorithm for finding k-th order statistic — sort the given set (or array) A of numbers and return k-th element A[k]. Running time of such algorithm at best is O(n·log(n)).
Erik Demaine (professor) presents a randomized divide and conquer algorithm for this problem (my second post covers divide and conquer algorithms), which runs in expected linear time.
The key idea of this algorithm is to call Randomized-Partition (from previous lecture) subroutine on the given array and then recursively conquer on the partitioned elements on the left and right of the chosen pivot element.
Here is the pseudo code of this algorithm:
Randomized-Select(A, p, q, k) // finds k-th smallest element in array A[p..q]
if p == q
return A[p]
r = Randomized-Partition(A, p, q)
n = r - p + 1
if k == n
return A[r]
if k < n
return Randomized-Select(A, p, r-q, k)
else
return Randomized-Select(A, r+1, q, k-n)
The lecture continues with intuitive and then mathematically rigorous analysis of the algorithm. It is concluded that the expected running time of this algorithm is O(n) and the worst case (when the Randomized-Partition algorithm chooses bad pivots) is O(n2).
At the last minutes of the lecture a worst-case linear time order statistics algorithm is presented. This algorithm was invented by five scientists - M. Blum, R. W. Floyd, V. Pratt, R. Rivest and R. Tarjan. I found the original publication of this algorithm by these gentlemen: Time Bounds for Selection.
You’re welcome to watch lecture six:
Lecture six notes:
Topics covered in lecture six:
- [00:30] The problem of order statistics.
- [01:50] Naive algorithm for finding k-th smallest element.
- [04:30] Randomized divide and conquer algorithm.
- [11:20] Example of algorithm run on array (6, 10, 13, 5, 8, 3, 2, 11) to find 7-th smallest element.
- [15:30] Intuitive analysis of Randomized-Select algorithm.
- [20:30] Mathematically rigorous analysis of expected running time.
- [43:55] Worst-case linear time order statistics algorithm.
Have fun! The next post will be all about hashing!
PS. This course is taught from the CLRS book (also called “Introduction to Algorithms”):
Did you like this post? Subscribe here:
If you really enjoyed the post, I'd appreciate a gift from my geeky Amazon book wishlist. Books would make make me more educated and I would write even better posts. Thanks! :)

(5 votes, average: 4.6 out of 5)
|
|
|


August 28th, 2008 at 3:27 am
Thank you for posting this
August 28th, 2008 at 4:52 pm
The RandomizedPartition algorithm referred to from the previous lecture modifies the source data. This means that in the general case you would need to make a complete copy of the array argument.
Meanwhile, bin sorting at can be accomplished in linear time, which suggests an alternative approach for some cases: perform a bin sort against the list, then find the k-th value from that sorted list (this would be valid work when max(A) - min(A) was not significantly larger than length(A))
September 2nd, 2008 at 2:53 pm
Nice, if you’d like to download them and you don’t know how to, I’ve written a small post on my blog:
http://blog.srikanths.net/2007/11/introduction-to-algorithms.html
September 16th, 2008 at 7:04 pm
Can anyone answer the below question for me pleas??
Let X[1::n] and Y [1::n] be two arrays, each containing n numbers already in sorted order.
Give an O(lg n) time algorithm to find the nth and n+1st largest elements of the 2n elements
in arrays X and Y . Your algorithmic description must start with a brief overview of your
algorithm in English.
October 17th, 2008 at 5:23 pm
[…] Lecture 6: Order Statistics […]
November 19th, 2008 at 7:47 pm
can anyone answer the below question please?!
thanks in advance..
as seen in the lesson we divide our input in groups of 5 elements.
1)Is it possible to divide ‘em in groups of 7? will the result be the same?
2)and if i take groups of 3, why is the complexity O(nlgn) and not O(n)
Thanks
appriciate it.
May 16th, 2009 at 5:39 pm
[…] [upmod] [downmod] MIT’s Introduction to Algorithms, Lecture 6: Order Statistics - good coders code, great reuse (www.catonmat.net) 1 points posted 8 months, 3 weeks ago by SixSixSix tags algorithms imported […]