Another really fun and fundamental weighted graph problem is the *shortest path* problem. The

question in: given a connected weighted graph G, what is the shortest path between two vertices

v and w in G?

To be precise, the weight of a path is just the sum of the weight of the edges that make up

the path. The shortest path between two vertices is the path with minimum weight.

There are a ton of solutions to this problem. I'm going to show one of the easiest to understand ones, which is Djikstra's algorithm. Djikstra's algorithm for the shortest path combines two very interesting

algorithmic techniques: greediness, and relaxation. We've already talked about greediness in the context

of the minimum spanning tree: a greedy algorithm is an algorithm which works by aggressively picking

optimum sub-values, and building the result from them. Relaxation is an algorithmic technique where

you start with a maximum value, and attempt to reduce it. The easiest way to understand relaxation

is by metaphor: suppose you want to find the optimal arrangement of a bunch of pegs connected by springs. A relaxation technique basically moves one peg in each step in a way that reduces the tension in at least one spring without increasing the tension in any of the others. So you're continually *relaxing* the tension in the springs, until you can't relax it any more.

In Djikstra's algorithm, you have a graph G=(V,E,W), and you want to find the shortest path

between two vertices a and b. You create a table dist(x) of the minimum distance from a to every vertex in V so that dist(x) is the distance from a to x, and you set the initial value of dist(a) to 0,

the initial value of dist(x) for all other vertices in E to infinity; and you create a priority list of vertices, ordered by their distance from a.

Then, in each step, you look for the vertex n in E that has the *smallest* value of

dist(n), and you remove that vertex from the priority list. Then, for each vertex m that is adjacent to n, if the path from a to m through n is shorter

than dist(m), set dist(m) = dist(n) + W(n,m). (This is called *relaxing* the path from a to m.)

When you get to the point that the next vertex selected in the priority list is b, then you're done - you know the shortest path.

In pseudo-code:

def shortestPath(G,a,b): dist = Map from vertices(G) to distance prev = Map from vertices(G) to vertices(G) for x in vertices(G): if x == a: dist[x] = 0 else: dist[x] = ∞ end priorityList = vertices(G) ordered by distance from a for closest = priorityList.nextVertex() != b: for v adjacent to closest: if dist[v] < dist(closest) + weight(closest,v): dist[v] = dist(closest) + weight(closest,v) prev[v] = closest end end end

When the algorithm terminates, you just trace back through prev to get the shortest path from a to b.

So what's the complexity of this?

- In the worst case, b could be the last vertex extracted from the priority list, in which case

we would end up running the outer loop once for each vertex. So there's a maximum of N iterations

of that loop. - In the inner loop, we need to look at every edge from the selected vertex. That's O(E).
- To re-order the priority list after removing an element from it takes, at most, N steps.

So, we're looking at O(|V|^{2}|E|).

That's not the best we can do; by adopting some hairy data structures for the priority queue,

we can reduce the running time to O(|V|log|V||E|). But the structures to do that have a pretty high

constants, so they only really pay off for large, sparse graphs.

Let's trace through an example. Suppose we want to find the shortest path from A to M in the following graph.

First, we set all weights to infinity, except A to A, all predecessors to empty, and we'll compute

the weights for the priority list. The following shows the initial state of all of these.

A | B | C | D | E | F | G | H | I | J | K | L | M | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|

Dist | 0 | ∞ | ∞ | ∞ | ∞ | ∞ | ∞ | ∞ | ∞ | ∞ | ∞ | ∞ | ∞ |

Pred | - | - | - | - | - | - | - | - | - | - | - | - | - |

Priority | 0 | - | - | - | - | - | - | - | - | - | - | - | - |

Now, we pick something off the priority list. The highest priority is 0, for A itself. So we pick A, and

from A, we can give distances and priorities to B and D.

A | B | C | D | E | F | G | H | I | J | K | L | M | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|

Dist | 0 | 3 | ∞ | 5 | ∞ | ∞ | ∞ | ∞ | ∞ | ∞ | ∞ | ∞ | ∞ |

Pred | - | A | - | A | - | - | - | - | - | - | - | - | - |

Priority | X | 3 | - | 5 | - | - | - | - | - | - | - | - | - |

Now, we greedily pick the node with the lowest priority value - B and process its edges.

A | B | C | D | E | F | G | H | I | J | K | L | M | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|

Dist | 0 | 3 | ∞ | 5 | 14 | ∞ | ∞ | ∞ | ∞ | ∞ | ∞ | ∞ | ∞ |

Pred | - | A | - | A | B | - | - | - | - | - | - | - | - |

Priority | X | X | - | 5 | 14 | - | - | - | - | - | - | - | - |

Next lowest priority is D, with 5.

A | B | C | D | E | F | G | H | I | J | K | L | M | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|

Dist | 0 | 3 | 12 | 5 | 14 | 13 | ∞ | ∞ | ∞ | ∞ | ∞ | ∞ | ∞ |

Pred | - | A | E | A | B | D | - | - | - | - | - | - | - |

Priority | X | X | 12 | X | 14 | 13 | - | - | - | - | - | - | - |

Next is C, with priority 12. C gives us a new path to F, but it's a path of length 14, which is longer than the path we already

found.

A | B | C | D | E | F | G | H | I | J | K | L | M | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|

Dist | 0 | 3 | 12 | 5 | 14 | 13 | ∞ | ∞ | ∞ | ∞ | ∞ | ∞ | ∞ |

Pred | - | A | E | A | B | D | - | - | - | - | - | - | - |

Priority | X | X | X | X | 14 | 13 | - | - | - | - | - | - | - |

Next is F, with priority 13. Then there's a tie, so we pick one randomly, and do E. Then G. After those steps, the table looks like:

A | B | C | D | E | F | G | H | I | J | K | L | M | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|

Dist | 0 | 3 | 12 | 5 | 14 | 13 | 14 | 18 | 17 | 18 | ∞ | ∞ | ∞ |

Pred | - | A | E | A | B | D | F | E | E | G | - | - | - |

Priority | X | X | X | X | X | X | X | 18 | 17 | 18 | - | - | - |

It continues on in the same way, until we get the result ADFGJKM with a total length of 26.

- Log in to post comments

It's been a while since I've used my big-O-fu. What's the order of operations for summing the geometric series of the weight matrix over the tropical semiring?

I'm sorry to interfere with your O-notation, but doesn't Dijkstra's algorithm take at most O((V+E)log V) steps using a simple heap? Each update to the heap is O(log V) if there are V vertices on the heap - and certainly there won't be any more. Each vertex gets put on the heap once and removed once, and the number of relaxations for all vertices totals to |E|, because obviously upon analyzing a certain edge, no more than one path can be shortened. My trusty old Cormen (Introduction to Algorithms) supports this. Also heaps have very low constants, so we're cutting the complexity by at least a factor of |V| compared to the naive implementation.

this is correct. and if you use the funkier fibonacci heaps, then the time goes down to E + V log V

I was trying to follow the explanation and stuck wondering whether the less than or equal sign is pointing the right way in the conditional statement:

if dist[v] < dist(closest) + weight(closest,v):

"for all other vertices in E"

shouldn't this be "for all other vertices in V"?

Ado 1ka nam pattai...