MIT's Introduction to Algorithms, Lectures 17, 18 and 19: Shortest Path Algorithms
This is the twelfth post in an article series about MIT's lecture course "Introduction to Algorithms." In this post I will review a trilogy of lectures on graph and shortest path algorithms. They are lectures seventeen, eighteen and nineteen. They cover Dijkstra's Algorithm, the Breadth-First Search Algorithm and the Bellman-Ford Algorithm for finding single-source shortest paths, as well as the Floyd-Warshall Algorithm and Johnson's Algorithm for finding all-pairs shortest paths.
These algorithms require a thorough understanding of graphs. See the previous lecture for a good review of graphs.
Lecture seventeen focuses on the single-source shortest-paths problem: Given a graph G = (V, E), we want to find a shortest path from a given source vertex s ∈ V to each vertex v ∈ V. In this lecture the edge weights are restricted to be nonnegative, which leads to Dijkstra's algorithm and, in the special case when all edges have unit weight, to the breadth-first search algorithm.
Lecture eighteen focuses on the same single-source shortest-paths problem, but allows edge weights to be negative. In this case negative-weight cycles may exist, and Dijkstra's algorithm no longer works: it may produce incorrect results. The Bellman-Ford algorithm is therefore introduced. It runs slower than Dijkstra's but handles negative edge weights and detects negative-weight cycles. As a corollary it is shown that Bellman-Ford solves linear programming problems whose constraints are all of the form x_{j} - x_{i} <= w_{ij}.
Lecture nineteen focuses on the all-pairs shortest-paths problem: Find a shortest path from u to v for every pair of vertices u and v. Although this problem can be solved by running a single-source algorithm once from each vertex, it can be solved faster with the Floyd-Warshall algorithm or Johnson's algorithm.
Lecture 17: Shortest Paths I: Single-Source Shortest Paths and Dijkstra's Algorithm
Lecture seventeen starts with a small review of paths and shortest paths.
It recalls that given a graph G = (V, E, w), where V is a set of vertices, E is a set of edges and w is a weight function that maps edges to real-valued weights, a path p from a vertex u to a vertex v in this graph is a sequence of vertices (v_{0}, v_{1}, ..., v_{k}) such that u = v_{0}, v = v_{k} and (v_{i-1}, v_{i}) ∈ E for i = 1, ..., k. The weight w(p) of this path is the sum of the weights of its edges: w(p) = w(v_{0}, v_{1}) + w(v_{1}, v_{2}) + ... + w(v_{k-1}, v_{k}). It also recalls that a shortest path from u to v is a path of minimum weight among all paths from u to v, and that a shortest path might not exist if the graph contains a negative-weight cycle.
The lecture then notes that shortest paths exhibit the optimal substructure property: a subpath of a shortest path is also a shortest path. The proof of this property is given by a cut-and-paste argument. If you remember from the previous two lectures on dynamic programming and greedy algorithms, an optimal substructure property suggests that these two techniques could be applied to solve the problem efficiently. Indeed, applying the greedy idea leads to Dijkstra's algorithm.
Here is a somewhat precise definition of the single-source shortest-paths problem with nonnegative edge weights: Given a graph G = (V, E) with nonnegative edge weights and a starting vertex s ∈ V, find the shortest-path weights from s to all vertices v ∈ V.
Here is the greedy idea of Dijkstra's algorithm:
 1. Maintain a set S of vertices whose shortest-path distances from s are known (s ∈ S initially).
 2. At each step add a vertex v from the set V - S to the set S. Choose the v that has the minimal distance estimate from s (be greedy).
 3. Update the distance estimates of the vertices adjacent to v.
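The three steps above can be sketched in a few lines of Python. This is only a minimal sketch, not the lecture's pseudocode: the function name `dijkstra`, the adjacency-list dictionary and the example graph are my own assumptions, and a binary heap from `heapq` stands in for the priority queue.

```python
import heapq

def dijkstra(graph, source):
    """graph: dict mapping vertex -> list of (neighbor, weight), all weights >= 0."""
    dist = {v: float("inf") for v in graph}   # distance estimates
    dist[source] = 0
    done = set()                              # the set S of finished vertices
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)            # greedy choice: closest vertex in V - S
        if u in done:
            continue                          # stale heap entry, skip it
        done.add(u)
        for v, w in graph[u]:
            if d + w < dist[v]:               # relax edge (u, v)
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

# Hypothetical example graph (not the one used in the lecture).
example = {
    "s": [("a", 10), ("b", 3)],
    "a": [("c", 2)],
    "b": [("a", 4), ("c", 8), ("d", 2)],
    "c": [("d", 7)],
    "d": [("c", 9)],
}
print(dijkstra(example, "s"))
```

Instead of decreasing keys in place, this sketch pushes a new heap entry on every successful relaxation and skips stale entries when they are popped, which keeps the code short at the cost of a slightly larger heap.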
I have also posted a video interview with Edsger Dijkstra - Edsger Dijkstra: Discipline in Thought. Please take a look if you want to see what Dijkstra looked like. :)
The lecture continues with an example of running Dijkstra's algorithm on a non-trivial graph. It also introduces the concept of a shortest path tree - a tree that is formed by the edges that were last relaxed in each iteration (hard to explain in English, see the lecture at 43:40).
The other half of the lecture is devoted to three correctness arguments for Dijkstra's algorithm. The first one proves that relaxation never makes a mistake. The second proves that relaxation always makes the right greedy choice. And the third proves that when the algorithm terminates the results are correct.
In the final minutes of the lecture, the running time of Dijkstra's algorithm is analyzed. It turns out that the running time depends on what data structure is used for maintaining the priority queue of the set V - S (step 2). If we use an array, the running time is O(V^{2}); if we use a binary heap, it's O(E·lg(V)); and if we use a Fibonacci heap, it's O(E + V·lg(V)).
Finally a special case of weighted graphs is considered, in which all weights are unit weights. In this case the single-source shortest-paths problem can be solved by the breadth-first search (BFS) algorithm, which is actually a simpler version of Dijkstra's algorithm with the priority queue replaced by a FIFO queue! The running time of BFS is O(V+E).
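To make the "priority queue replaced by a FIFO" remark concrete, here is a small BFS sketch in Python. The function name `bfs_distances` and the adjacency-list format are my own assumptions; the returned distances count edges, which is the path weight when every edge has unit weight.

```python
from collections import deque

def bfs_distances(graph, source):
    """graph: dict mapping vertex -> list of neighbors (every edge has weight 1)."""
    dist = {source: 0}
    queue = deque([source])          # FIFO queue instead of a priority queue
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:        # first visit is along a shortest path
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist                      # vertices unreachable from source are absent

# Hypothetical example graph.
example = {"s": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["d"], "d": []}
print(bfs_distances(example, "s"))
```

Each vertex is enqueued at most once and each edge is examined once, which is where the O(V+E) bound comes from.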
You're welcome to watch lecture seventeen:
Topics covered in lecture seventeen:
 [01:40] Review of paths.
 [03:15] Edge weight functions.
 [03:30] Example of a path, its edge weights, and weight of the path.
 [04:22] Review of shortest paths.
 [05:15] Shortest-path weight.
 [06:30] Negative edge weights.
 [10:55] Optimal substructure of a shortest path.
 [11:50] Proof of optimal substructure property: cut and paste.
 [14:23] Triangle inequality.
 [15:15] Geometric proof of triangle inequality.
 [16:30] Single-source shortest paths problem.
 [18:32] Restricted single-source shortest paths problem: all edge weights positive or zero.
 [19:35] Greedy idea for single-source shortest paths.
 [26:40] Dijkstra's algorithm.
 [35:30] Example of Dijkstra's algorithm.
 [43:40] Shortest path trees.
 [45:12] Correctness of Dijkstra's algorithm: why relaxation never makes a mistake.
 [53:55] Correctness of Dijkstra's algorithm: why relaxation makes progress.
 [01:01:00] Correctness of Dijkstra's algorithm: why it gives correct answer when it terminates.
 [01:15:40] Running time of Dijkstra's algorithm.
 [01:18:40] Running time depending on using array O(V^2), binary heap O(E·lg(V)) and Fibonacci heap O(E + V·lg(V)) for priority queue.
 [01:20:00] Unweighted graphs.
 [01:20:40] Breadth-First Search (BFS) algorithm.
 [01:23:23] Running time of BFS: O(V+E).
Lecture seventeen notes:
Lecture 18: Shortest Paths II: Bellman-Ford Algorithm
Lecture eighteen begins by recalling that if a graph contains a negative-weight cycle, then a shortest path may not exist, and gives a geometric illustration of this fact.
Right after this fact, it jumps to the Bellman-Ford algorithm. The Bellman-Ford algorithm solves the single-source shortest-paths problem in the general case in which edge weights may be negative. Given a weighted, directed graph G = (V, E) with source s and weight function w: E → R, the Bellman-Ford algorithm produces the shortest paths from s and their weights if there is no negative-weight cycle reachable from s, and reports that no solution exists if there is one.
The algorithm uses relaxation, progressively decreasing an estimate on the weight of a shortest path from the source s to each vertex v ∈ V until it achieves the actual shortest-path weight.
The running time of the Bellman-Ford algorithm is O(VE). The lecture also gives a correctness proof of the Bellman-Ford algorithm.
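The relaxation loop is short enough to sketch in Python. This is a minimal sketch under my own assumptions: the function name `bellman_ford`, the edge-list representation and the choice to return `None` when a negative-weight cycle is found are illustrative, not taken from the lecture.

```python
def bellman_ford(vertices, edges, source):
    """edges: list of (u, v, w) triples. Returns a dict of shortest-path weights,
    or None if a negative-weight cycle is reachable from source."""
    dist = {v: float("inf") for v in vertices}
    dist[source] = 0
    for _ in range(len(vertices) - 1):        # |V| - 1 passes over all edges
        for u, v, w in edges:
            if dist[u] + w < dist[v]:         # relax edge (u, v)
                dist[v] = dist[u] + w
    for u, v, w in edges:                     # one extra pass detects negative cycles
        if dist[u] + w < dist[v]:
            return None
    return dist

# Hypothetical example with one negative edge and no negative cycle.
print(bellman_ford(["s", "a", "b"], [("s", "a", 4), ("s", "b", 5), ("b", "a", -3)], "s"))
```

The two nested loops make the O(VE) running time visible: |V| - 1 passes, each touching every edge once.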
The other half of the lecture is devoted to a problem that can be effectively solved by Bellman-Ford. It's called the linear feasibility problem, which is a special case of the linear programming (LP) problem where there is no objective; here the constraints are all difference constraints of the form x_{j} - x_{i} <= w_{ij}. It is noted that the single-source shortest-paths problem solved by Bellman-Ford is actually a simple case of an LP problem.
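Here is a rough Python sketch of how difference constraints can be solved with Bellman-Ford-style relaxation. Each constraint x_{j} - x_{i} <= w_{ij} becomes an edge from i to j with weight w_{ij}. The function name `solve_difference_constraints` and the `(j, i, w)` tuple format are my own assumptions, and starting every variable at 0 plays the role of a virtual source with zero-weight edges to every vertex.

```python
def solve_difference_constraints(n, constraints):
    """constraints: list of (j, i, w) meaning x[j] - x[i] <= w, variables 0..n-1.
    Returns one feasible assignment as a list, or None if the system is infeasible."""
    x = [0] * n                       # as if a virtual source had 0-weight edges to all
    for _ in range(n):
        changed = False
        for j, i, w in constraints:
            if x[i] + w < x[j]:       # relax the edge i -> j of the constraint graph
                x[j] = x[i] + w
                changed = True
        if not changed:
            return x                  # every constraint now holds
    return None                       # still changing after n passes: negative cycle

# Hypothetical system: x0 - x1 <= 3, x1 - x2 <= -2, x0 - x2 <= 2.
print(solve_difference_constraints(3, [(0, 1, 3), (1, 2, -2), (0, 2, 2)]))
```

If the relaxation stops changing anything, every constraint x_{j} <= x_{i} + w_{ij} holds by construction; if it is still changing after n passes, the constraint graph has a negative-weight cycle and the system is unsatisfiable.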
The lecture ends with an application of Bellman-Ford to solving a special case of the VLSI layout problem in 1 dimension.
You're welcome to watch lecture eighteen:
Topics covered in lecture eighteen:
 [00:20] A long, long time ago... in a galaxy far, far away...
 [00:40] Quick review of previous lecture - Dijkstra's algorithm, non-negative edge weights.
 [01:40] Description of Bellman-Ford algorithm.
 [04:30] Bellman-Ford algorithm.
 [08:50] Running time of Bellman-Ford O(VE).
 [10:05] Example of Bellman-Ford algorithm.
 [18:40] Correctness of Bellman-Ford algorithm.
 [36:30] Linear programming (LP).
 [42:48] Efficient algorithms for solving LPs: simplex algorithm (exponential in worst case, but practical), ellipsoid algorithm (polynomial time, impractical), interior point methods (polynomial), random sampling (brand new, discovered at MIT).
 [45:58] Linear feasibility problem - LP with no objective.
 [47:30] Difference constraints - constraints in form x_{j} - x_{i} <= w_{ij}.
 [49:50] Example of difference constraints.
 [51:04] Constraint graph.
 [54:05] Theorem: Negative weight cycle in constraint graph means difference constraints are infeasible/unsatisfiable.
 [54:50] Proof.
 [59:15] Theorem: If no negative weight cycle then satisfiable.
 [01:00:23] Proof.
 [01:08:20] Corollary: Bellman-Ford solves a system of m difference constraints on n variables in O(mn) time.
 [01:12:30] VLSI Layout problem solved by Bellman-Ford.
Lecture eighteen notes:


Lecture 19: Shortest Paths III: All-Pairs Shortest Paths and Floyd-Warshall Algorithm
Lecture nineteen starts with a quick review of lectures seventeen and eighteen. It recalls the running times of the various single-source shortest path algorithms, and mentions that in the case of a directed acyclic graph (which was not covered in the previous lectures), you can run a topological sort followed by one round of Bellman-Ford, relaxing edges in topological order, which finds single-source shortest paths in linear time, O(V+E).
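The DAG case is easy to sketch in Python. This is my own illustrative sketch (the function name `dag_shortest_paths` and the adjacency-list format are assumptions): a depth-first topological sort followed by a single pass that relaxes each vertex's outgoing edges in topological order. Negative edge weights are fine here because a DAG has no cycles.

```python
def dag_shortest_paths(graph, source):
    """graph: dict mapping vertex -> list of (neighbor, weight); must be acyclic."""
    order, seen = [], set()

    def visit(u):                       # depth-first search for a topological order
        seen.add(u)
        for v, _ in graph[u]:
            if v not in seen:
                visit(v)
        order.append(u)                 # u finishes after all its descendants

    for u in graph:
        if u not in seen:
            visit(u)
    order.reverse()                     # reverse finishing order = topological order

    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    for u in order:                     # one round of relaxation, in topological order
        for v, w in graph[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist

# Hypothetical DAG with a negative edge.
example = {"s": [("a", 2), ("b", 6)], "a": [("b", -3), ("c", 4)], "b": [("c", 1)], "c": []}
print(dag_shortest_paths(example, "s"))
```

The recursive `visit` is fine for small examples; for very deep graphs an iterative topological sort would avoid Python's recursion limit.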
The lecture continues with the all-pairs shortest paths problem, where we want to know the shortest path between every pair of vertices.
A naive approach to this problem is to run a single-source shortest path algorithm from each vertex. For example, on an unweighted graph we'd run the BFS algorithm V times, which would give O(VE) running time. On a graph with non-negative edge weights it would be V times Dijkstra's algorithm, giving O(VE + V^{2}lg(V)) time. And in the general case we'd run Bellman-Ford V times, which would make the algorithm run in O(V^{2}E) time.
The lecture continues with a precise definition of the all-pairs shortest paths problem: given a directed graph, find an NxN matrix (N = |V|), where each entry a_{ij} is the weight of the shortest path from vertex i to vertex j.
In the general case, if the graph has negative edges and it's dense, the best we can do so far is run Bellman-Ford V times. Recalling that E = O(V^{2}) in a dense graph, the running time is O(V^{2}E) = O(V^{4}) - hypercubed in the number of vertices = slow.
The lecture then proceeds with a dynamic programming algorithm without knowing if it will be faster or not. It's too complicated to explain here, and I recommend watching the lecture at 11:54 to understand it. It turns out this dynamic programming algorithm does not give a performance boost and is still O(V^{4}), but it gives some wicked ideas.
The most wicked idea is to connect matrix multiplication with the dynamic programming recurrence and use repeated squaring to beat O(V^{4}). This craziness gives O(V^{3}lgV) time, which is an improvement. Please see 23:40 in the lecture for the full explanation.
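The matrix analogy is easier to see in code. In this hedged Python sketch (the names `min_plus` and `apsp_repeated_squaring` are my own), ordinary matrix multiplication is replaced by a "min-plus" product, where multiplication becomes addition and summation becomes min. Squaring the weight matrix about lg(V) times then yields all shortest-path weights, assuming the graph has no negative-weight cycles.

```python
INF = float("inf")

def min_plus(A, B):
    """'Multiply' two n x n matrices with (min, +) in place of (+, *)."""
    n = len(A)
    return [[min(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apsp_repeated_squaring(W):
    """W[i][j] is the weight of edge i -> j, 0 on the diagonal, INF if no edge."""
    n = len(W)
    D, m = W, 1                 # D holds shortest paths that use at most m edges
    while m < n - 1:            # shortest paths need at most n - 1 edges
        D = min_plus(D, D)      # squaring doubles the allowed number of edges
        m *= 2
    return D

# Hypothetical 3-vertex graph with one negative edge and no negative cycle.
W = [[0, 3, INF],
     [INF, 0, -1],
     [2, INF, 0]]
print(apsp_repeated_squaring(W))
```

Each `min_plus` call costs O(V^{3}) and the loop runs O(lg V) times, which gives the O(V^{3}lgV) bound mentioned above.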
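After all this the lecture arrives at the Floyd-Warshall algorithm, which finds all-pairs shortest paths in O(V^{3}). The algorithm is derived from a recurrence over the allowed intermediate vertices: the shortest path from vertex i to vertex j using only intermediate vertices from {1, ..., k} is the minimum of { the shortest path from i to j using only intermediates from {1, ..., k-1}, or the shortest path from i to k plus the shortest path from k to j, both using only intermediates from {1, ..., k-1} }.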
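The recurrence translates almost directly into three nested loops. Here is a minimal Python sketch, assuming an adjacency-matrix input with 0 on the diagonal and infinity for missing edges; the function name `floyd_warshall` and the example matrix are my own choices.

```python
INF = float("inf")

def floyd_warshall(W):
    """W[i][j] is the weight of edge i -> j, 0 on the diagonal, INF if no edge.
    Assumes there are no negative-weight cycles."""
    n = len(W)
    D = [row[:] for row in W]                    # copy, so the input is not modified
    for k in range(n):                           # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                if D[i][k] + D[k][j] < D[i][j]:  # going through k is shorter
                    D[i][j] = D[i][k] + D[k][j]
    return D

# Hypothetical 3-vertex graph with one negative edge and no negative cycle.
W = [[0, 3, INF],
     [INF, 0, -1],
     [2, INF, 0]]
print(floyd_warshall(W))
```

Updating the single matrix in place is safe because row k and column k do not change during iteration k, so a separate matrix per value of k is not needed.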
Finally the lecture explains Johnson's algorithm, which runs in O(VE + V^{2}log(V)) time and is well suited to sparse graphs. The key idea in this algorithm is to reweigh all edges so that they are all nonnegative, then run Dijkstra from each vertex and finally undo the reweighing.
It turns out, however, that to find the function for reweighing all edges, a set of difference constraints needs to be satisfied. This makes us first run Bellman-Ford to solve these constraints.
Reweighing takes O(EV) time (this is the Bellman-Ford run), running Dijkstra from each vertex takes O(VE + V^{2}lgV) and undoing the reweighing takes O(V^{2}) time. Of these terms O(VE + V^{2}lgV) dominates and defines the algorithm's running time (for dense graphs it's still O(V^{3})).
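Putting the pieces together, here is a hedged Python sketch of Johnson's algorithm. The function name `johnson`, the edge-list input and the `None` return on a negative-weight cycle are my own assumptions. It uses a binary heap from `heapq` for Dijkstra, so it does not reach the Fibonacci-heap bound quoted above, and starting every h value at 0 stands in for the extra source vertex with zero-weight edges.

```python
import heapq

def johnson(vertices, edges):
    """edges: list of (u, v, w). Returns dist[u][v] for all pairs,
    or None if the graph contains a negative-weight cycle."""
    # 1. Bellman-Ford from a virtual source: h(v) solves the difference constraints.
    h = {v: 0 for v in vertices}
    for _ in range(len(vertices)):
        for u, v, w in edges:
            if h[u] + w < h[v]:
                h[v] = h[u] + w
    if any(h[u] + w < h[v] for u, v, w in edges):
        return None
    # 2. Reweigh: w'(u, v) = w(u, v) + h(u) - h(v) >= 0.
    adj = {v: [] for v in vertices}
    for u, v, w in edges:
        adj[u].append((v, w + h[u] - h[v]))
    # 3. Dijkstra from every vertex on the reweighed graph.
    result = {}
    for s in vertices:
        dist = {v: float("inf") for v in vertices}
        dist[s] = 0
        heap = [(0, s)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue                      # stale heap entry
            for v, w in adj[u]:
                if d + w < dist[v]:
                    dist[v] = d + w
                    heapq.heappush(heap, (dist[v], v))
        # 4. Undo the reweighing: d(s, v) = d'(s, v) - h(s) + h(v).
        result[s] = {v: dist[v] - h[s] + h[v] for v in vertices}
    return result

# Hypothetical 3-vertex graph with one negative edge and no negative cycle.
print(johnson([0, 1, 2], [(0, 1, 3), (1, 2, -1), (2, 0, 2)]))
```

The reweighing works because every path from u to v changes by the same amount, h(u) - h(v), so shortest paths stay shortest while all edge weights become nonnegative.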
You're welcome to watch lecture nineteen:
Topics covered in lecture nineteen:
 [01:00] Review of single-source shortest path algorithms.
 [04:45] All-pairs shortest paths by running single-source shortest path algorithms from each vertex.
 [05:35] Unweighted edges: V x BFS = O(VE).
 [06:35] Non-negative edge weights: V x Dijkstra = O(VE + V^{2}lg(V)).
 [07:40] General case: V x Bellman-Ford = O(V^{2}E).
 [09:10] Formal definition of all-pairs shortest paths problem.
 [11:08] V x Bellman-Ford for dense graphs (E = O(V^{2})) is O(V^{4}).
 [11:54] Trying to beat O(V^{4}) with dynamic programming.
 [19:30] Dynamic programming algorithm, still O(V^{4}).
 [23:40] A better algorithm via wicked analogy with matrix multiplication. Running time: O(V^{3}lg(V)).
 [37:45] Floyd-Warshall algorithm. Runs in O(V^{3}).
 [47:35] Transitive closure problem of directed graphs.
 [53:30] Johnson's algorithm. Runs in O(VE + V^{2}log(V)).
Lecture nineteen notes:
Have fun with shortest path algorithms! The next post is going to be an introduction to parallel algorithms - things like dynamic multithreading, scheduling and multithreaded algorithms.
PS. This course is taught from the CLRS book (also called "Introduction to Algorithms"). Chapters 24 and 25, called "Single-Source Shortest Paths" and "All-Pairs Shortest Paths", explain everything I wrote about here in much more detail. If these topics excite you, you may want to buy this awesome book:
Comments
These lectures are just awesome, MIT is a wet dream :)
I need that kind of support. Thank you.
Dr. Ranka Kulic
Hi Peteris,
Your posts are very useful, with your notes, and specially for the timecoded topics list. This helps me just study what is going to come for my exams... :)
Thanks... and Good luck to you...!!!
Thank you very much. The interview with Dijkstra was great!
Nice work. Keep it up.
Good work. It is a great help!
This guy is awesome. My professor is nowhere near as good as him at explaining these algorithms. Maybe that is why this guy teaches at MIT instead of some college that is not ranked number 1 in the country...
Dude, Your work is greatly appreciated.
gud work! its really helpful.thank you
In your lecture notes for lecture 19, I think you wrote a j in the recurrence on the top of the second page instead of a k. The line is min {Cij(k-1), Cik(k-1) + Ckj(k-1)}. Other than that, thanks for posting these. It helps when you're not quite sure of what the teacher wrote on the chalkboard.
Sapping, thanks for spotting the error. Indeed, it's Cij(k-1), Cik(k-1) + Ckj(k-1). It's easy to remember because Cik Ckj is how you multiply matrices. (The lecture then proceeds to solving transitive closure via Strassen's matrix multiplication analogy.)
Wow dude ! he is Awe
<wait>some ! Yeah AWESOME. Although he looks alittle bit uptight and stressed in teaching, Teaching method he's got is way rad man in comparison to other profs.
If you've got more of these diamonds I appreciate sharing it
Thanks pal.
Mike
respected sir,
sir a lot of thanks to provide this type lecture wise material for particular subject.sir if possible then please keep some good resolution scan copy of example which is discusses in class and provide some test paper.
thank sir
these notes'll be very useful for me in the future
very thankful to you all..
Your commentary was very helpful, especially your notes. I referred to them as I wrote my own notes and watched the video lectures. I took an algorithms course this semester through an online university, and they used the Baase book, which was very cryptic to read. I'm the type of learner that needs to take something apart and break it to understand how it works, so very hands-on and visual, so this was perfect. Thanks for breaking it for me, and helping me figure out how it's all put back together.