One amusing thing in graph theory is graph traversal. Many of the interesting algorithms on graphs
are ultimately based on the idea of iterating through the nodes of the graph in some order that is
related to the structure of the graph.
There are two fundamental orders of graph traversal, known as breadth-first and depth-first.
They're actually hard to explain in a comprehensible way - but they're pretty easy to understand by
example. So let's look at a graph, and do its depth-first and breadth-first traversals; then we'll look at
some pseudo-code to explain the details.
We'll start with a depth-first traversal from A. What we'll do is visit A, and then go to one of its children, doing a depth-first traversal of as much of the graph as we can from that child of A. So suppose we go to D first. Then we'll start doing a depth-first traversal from D - so we go to M. From M we go to H. From H to N. From N we can't go anywhere else - so we go back to H, and go to its next child, which is L. L to Q; Q to J; J to P; P to O. Then we're at a dead end again. So we go back to P; since O has already been visited, P is a dead end too. So we go back to J. From J, we can go to E. From E to F; F to C; C to B; B to I; I to K; and K to G. Then everything has been visited, and we're done. So the full depth-first traversal that we just did was A, D, M, H, N, L, Q, J, P, O, E, F, C, B, I, K, G. The depth-first traversal does pretty much exactly what the name suggests: it pushes deep into the graph as quickly as possible.
The breadth-first traversal is quite different. It first visits everything 1 step from its start; then 2 steps from its start; then 3; and it preserves path ordering at each level. What I mean by that will become clear from the example.
We start again with A. Then we visit each node adjacent to A: F and D. Then we visit each node adjacent to F: C, B, E.
Then we visit each node adjacent to D: M. Then each node adjacent to C: nothing unvisited there. Then adjacent to B: I is unvisited. Then E: J. Then M: H, Q. Then I: K, G. Then J: P. Then H: N, L. Then we go through a bunch of nodes with nothing unvisited adjacent to them, until we get to P, which gives us O. So the full ordering is: A, F, D, C, B, E, M, I, J, H, Q, K, G, P, N, L, O.
To see what I mean about path ordering, let's look at the paths to some of the vertices in the breadth-first traversal: A, AF, AD, AFC, AFB, AFE, ADM, AFBI, AFEJ, ADMH, ADMQ, ... Since we visited F before we visited D, we'll always visit things on paths through F before we visit things on paths of the same length through D.
Another way of thinking about the traversals is as a way of generating a spanning tree for the graph. Here's a depth-first spanning tree for our example graph. As you can see, the depth-first approach
generates a very long, very narrow tree with a small number of branches. That's what you'll generally see in a depth-first traversal.
And here's a breadth-first spanning tree. The breadth-first approach produces a much shallower,
bushier tree.
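
If you want to play with the trees in code form, here's a quick Python sketch of the two spanning trees written as parent maps: each vertex mapped to the vertex it was first discovered from, read off from the traversals described above. Treat it as my reading of the example rather than a literal transcription of the figures. Computing the height of each tree shows the long-and-narrow versus short-and-bushy difference:

    # Spanning trees as parent maps: vertex -> the vertex it was first
    # discovered from, read off from the two traversals above.
    dfs_parent = {
        "D": "A", "M": "D", "H": "M", "N": "H", "L": "H", "Q": "L",
        "J": "Q", "P": "J", "O": "P", "E": "J", "F": "E", "C": "F",
        "B": "C", "I": "B", "K": "I", "G": "K",
    }
    bfs_parent = {
        "F": "A", "D": "A", "C": "F", "B": "F", "E": "F", "M": "D",
        "I": "B", "J": "E", "H": "M", "Q": "M", "K": "I", "G": "I",
        "P": "J", "N": "H", "L": "H", "O": "P",
    }

    def height(parent):
        """Longest path from any vertex back up to the root of the tree."""
        def depth(v):
            return 0 if v not in parent else 1 + depth(parent[v])
        return max(depth(v) for v in parent)

    print(height(dfs_parent))  # 13: one long chain with only a couple of branches
    print(height(bfs_parent))  # 5: shallow and bushy
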
Finally, a bit of pseudocode, to give you the details:
Depth-First-Search(Graph, Vertex, Visited):
    visit(Vertex)
    add Vertex to Visited
    for each vertex vert adjacent to Vertex in Graph:
        if vert is not in Visited:
            Depth-First-Search(Graph, vert, Visited)
        end if
    end for
end
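
If you'd rather have something you can actually run, here's a direct Python rendering of that pseudocode. It's just a sketch, assuming the graph is represented as a plain dict mapping each vertex to a list of its adjacent vertices:

    def depth_first_search(graph, vertex, visited=None):
        """Recursive depth-first traversal of a dict-of-adjacency-lists graph."""
        if visited is None:
            visited = set()
        print(vertex)                     # "visit" the vertex
        visited.add(vertex)
        for adjacent in graph[vertex]:
            if adjacent not in visited:
                depth_first_search(graph, adjacent, visited)
        return visited

On a little four-vertex graph like {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B"]}, starting from A this prints A, B, D, C.
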
Breadth-First-Search(Graph, Vertex, Visited):
    let queue = new Queue
    add Vertex to end of queue
    while queue is not empty:
        let v = remove-first(queue)
        if v is not in Visited:
            visit(v)
            add v to Visited
            for each vertex w adjacent to v in Graph:
                add w to end of queue
            end for
        end if
    end while
end
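
And here's a matching Python sketch of the breadth-first version, with the same assumed dict-of-adjacency-lists representation (I've folded the Visited set into the function rather than passing it in):

    from collections import deque

    def breadth_first_search(graph, vertex):
        """Iterative breadth-first traversal of a dict-of-adjacency-lists graph."""
        visited = set()
        queue = deque([vertex])
        while queue:
            v = queue.popleft()          # remove-first
            if v not in visited:
                print(v)                 # "visit" the vertex
                visited.add(v)
                for w in graph[v]:
                    queue.append(w)      # add to the end of the queue
        return visited

With the same little graph as in the depth-first example ({"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B"]}), this prints A, B, C, D instead of A, B, D, C.
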
What I like about depth and breadth-first searches is that they're pretty much completely identical. The *only* difference is whether you add the new nodes to the beginning of the queue (depth-first) or the end (breadth-first).
A trivial modification later, and you've got an arbitrary heuristic traversal as well. Just add the nodes to the list however you like, and then re-sort the list according to some heuristic.
In other words, the only difference between the two is in the function addNodesToQueue(currentQueue, newNodes) which returns a new queue. Otherwise all three are the exact same code.
The only reason the two functions look different in Mark's pseudocode is that he used an optimization for the depth-first search that transformed it into a recursive function. If you make depth-first an iterative process like his breadth-first pseudocode you'll see the similarities.
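
To make that concrete, here's a rough Python sketch (the names are just for illustration) where the only thing that changes between the two traversals is the function that merges newly discovered vertices into the frontier:

    from collections import deque

    def traverse(graph, start, add_to_frontier):
        """Generic traversal; add_to_frontier decides where new vertices go."""
        visited = set()
        frontier = deque([start])
        order = []
        while frontier:
            v = frontier.popleft()
            if v not in visited:
                order.append(v)
                visited.add(v)
                add_to_frontier(frontier, graph[v])
        return order

    def add_depth_first(frontier, new_vertices):
        # Push new vertices onto the front: most recently discovered come out first.
        frontier.extendleft(reversed(new_vertices))

    def add_breadth_first(frontier, new_vertices):
        # Append new vertices to the back: older discoveries come out first.
        frontier.extend(new_vertices)

    graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B"]}
    print(traverse(graph, "A", add_depth_first))    # ['A', 'B', 'D', 'C']
    print(traverse(graph, "A", add_breadth_first))  # ['A', 'B', 'C', 'D']

Swapping in a merge function that keeps the frontier ordered by some heuristic score gives the best-first style traversal described above.
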
For a fun time, try understanding and explaining lexicographic BFS, and in particular, why it deserves to be called a BFS.
BFS is also the cheapest (and much simpler than Dijkstra's) way to find the shortest distance between nodes if all the edges in a graph are of unit length.
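
As a quick sketch of that (assuming the same dict-of-adjacency-lists representation as in the earlier examples), recording the level at which BFS first reaches each vertex gives the shortest hop count from the start:

    from collections import deque

    def bfs_distances(graph, start):
        """Shortest number of edges from start to every reachable vertex."""
        dist = {start: 0}
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:          # first time we reach w is the shortest way
                    dist[w] = dist[v] + 1
                    queue.append(w)
        return dist

Because the queue processes each level before the next one, the level at which a vertex is first labeled is the shortest possible distance to it.
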
Nice one Mark.
Will you be doing GBF, A*, or that type of stuff? The examples make it easy to understand. Maybe even simulated annealing, GAs... who knows?
I'm really enjoying these.
This is off-topic, but your diagrams look great. What are you using to produce them?
Right, but what I wanted to hear is why you find it amusing! Cause I thought that would be an interesting perspective, as I've always found it a relatively dry fact. But you didn't explain that. :)
Small mistake in the BFS tree: L should be a child of H, not a child of Q.
Love those pictures, btw.
Brendan.