# Graph Theory

In my last post on game theory, I said that you could find an optimal
probabilistic grand strategy for any two-player, simultaneous move, zero-sum game.
It's done through something called linear programming. But linear programming
is useful for a whole lot more than just game theory.
Linear programming is a general technique for solving a huge family of
optimization problems. It's incredibly useful for scheduling, resource
allocation, economic planning, financial portfolio management,
and a ton of of other, similar things.
The basic idea of it is that you have a linear function,
called an…

As promised, today I'm going to talk about how to compute the strongly connected components
of a directed graph. I'm going to go through one method, called Kosaraju's algorithm, which is
the easiest to understand. It's possible to do better that Kosaraju's by a factor of 2, using
an algorithm called Tarjan's algorithm, but Tarjan's is really just a variation on the theme
of Kosaraju's.
Kosaraju's algorithm is amazingly simple. It uses a very clever trick based on the fact that
if you reverse all of the edges in a graph, the resulting graph has the same strongly connected components as the…

Colored Petri Nets
The big step in Petri nets - the one that really takes them from a theoretical toy to a
serious tool used by protocol developers - is the extension to colored Petri nets (CPNs). Calling them "colored" is a bit of a misnomer; the original idea was to assign colors to tokens, and allow color-based expressions on the elements of the net. But study of analysis models quickly showed
that you could go much further than that without losing the basic analyzability properties
that are so valuable. In the full development of CPNs, you can associate essentially arbitrary
collections…

There's one variant of Petri nets, called counted Petri nets, which I'm fond of for personal reasons. As Petri net variants go, it's a sort of sloppy but simple one, but as I said, I'm fond of it.
As a warning, there's a bit of a diatribe beneath the fold, as I explain why I know about this obscure, strange Petri net variant.
My first project in grad school involved building a simulator for a network protocol specification language called Estelle. Estelle is, as I like to describe it, a formal specification language with absolutely no useful formal qualities.
The problem with Estelle is…

Among many of the fascinating things that we computer scientists do with graphs
is use them as a visual representation of computing devices. There are many subtle problems that can come up in all sorts of contexts where being able to see what's going on can make a huge difference. Graphs are, generally, the preferred metaphor for most computing tasks. For example, think of finite state machines, flowcharts, UML diagrams, etc.
One of the most interesting cases of this for one of the biggest programming problems today is visual representations of concurrency. Concurrent program is incredibly…

Yet another example of how graphs can be used as models to solve real problems comes from the world of project management. I tend to cringe at anything that involves management; as a former IBMer, I've dealt with my share of paper-pushing pinheaded project managers. But used well, this demonstrates using graphs as a model for a valuable planning tool, and a really good example of how to apply math to real-world problems.
Project managers often define something called PERT charts for planning and scheduling a project and its milestones. A PERT chart is nothing but a labeled, directed graph…

There's a kind of graph which is very commonly used by people like me for analysis applications, called a lattice. A lattice is a graph with special properties that make it
extremely useful for representing information in an analysis system.
I've mentioned before that you can use a graph G=(V,E) to describe a partial order over its vertices, where given two vertices a,b∈V, a<b if and only if either (a,b) ∈E, or there exists
some vertex c∈V such that a<c and c<b. If this is true, G is called a partial order graph (POG).
If we look at POGs, we can create a kind of completion…

Last time, I showed a way of using a graph to model a particular kind of puzzle via
a search graph. Games and puzzles provide a lot of examples of how we can use graphs
to model problems. Another example of this is the most basic state-space search that we do in computer science problems.
In terms of a game, a state space is a set of equivalence classes over all possible board positions. For example, in chess, the state of a game at any point is represented by a set of mappings that assign either a position on the board or "captured" to each piece. The state space of chess consists of all…

As I've mentioned before, the real use of graphs is as models. Many real problems can be
described using graphs as models - that is, to translate the problem into a graph, solve some problem on the graph, and then translate the result back from the graph to the original problem. This kind of
solution is extremely common, and can come up in some unexpected places.
For example, there's a classic chess puzzle called the Knight's tour.
In the Knight's tour, you have a chessboard, completely empty except for a knight on one square. You can
move the knight the way you normally can in chess, and…

Suppose you've got a huge graph - millions of nodes. And you know that it's not connected - so the graph actually consists of some number of pieces (called the connected components of the graph). And there are constantly new vertices and edges being added to the graph, but nothing is ever removed. Some questions you might want to ask about this graph at a particular point in time are:
How many components are there in the graph?
Which component is vertex X in?
Are vertices X and Y in the same component?
How many components are there?
All of these questions are variants of a classic…

The next interesting variant on graphs is directed graphs, or digraphs for
short. A digraph is a graph where each edge distinguished between its source and its target - so an edge is from one node, and to another node. Unlike a simple graph, where if A is adjacent to B, then you can follow the edge either from A to B, or from B to A, in a directed graph, A to B and B to A are different edges.
A lot of things change when you're using digraphs instead of simple graphs. For example, the shortest
path: the shortest path from A to B is different than the shortest path from B to A. Look at the…

One of the reasons that I like about graph theory so much is because of
how it ties into computer science. Graphs are fundamental to many problems in
computer science, and a lot of the work in graph theory has direct implications for
algorithm design. It also has a lot of resonance for me, because the work I do involves
tons and tons of graphs: I don't think I've gotten through a week of work in the last decade without
implementing some kind of graph code.
Since I've described a lot of graph algorithms, and I'm going to
describe even more, today I'm going to talk a bit about how to…

One amusing thing in graph theory is graph traversal. Many of the interesting algorithms on graph
are ultimately based on the idea of iterating through the nodes of the graph in some order that is
related to the structure of the graph.
There are two fundamental orders of graph traversal, known as breadth-first and depth-first.
They're actually hard to explain in a comprehensible way - but they're pretty easy to understand by
example. So let's look at a graph, and do its depth-first and breadth-first traversals; then we'll look at
some pseudo-code to explain the details.
We'll start with…

Another really fun and fundamental weighted graph problem is the shortest path problem. The
question in: given a connected weighted graph G, what is the shortest path between two vertices
v and w in G?
To be precise, the weight of a path is just the sum of the weight of the edges that make up
the path. The shortest path between two vertices is the path with minimum weight.
There are a ton of solutions to this problem. I'm going to show one of the easiest to understand ones, which is Djikstra's algorithm. Djikstra's algorithm for the shortest path combines two very interesting
algorithmic…

I got started on this series about graph theory by writing a post debunking something foolish that George Gilder said about network theory. Today I'm going to talk about one of the classic problems in graph theory, which happens to be a network problem. It's called the maximum flow problem.
The idea is that you have a weighted graph, which is modeling a bunch of places connected by pipes of different sizes. The sizes of the pipes are represented by the weight labels on the edges, which
are called capacities. Pick two nodes in the graph, called the source and the sink, how much can you pump…

Moving on from simple graphs, one of the things you can do to make things more interesting
is to associate numeric values with the edges, called the weights of those edges. This variation
of a graph is called a weighted graph.
Weighted graphs are extremely useful buggers: many real-world optimization problems ultimately
reduce to some kind of weighted graph problem. A few examples include:
The traveling salesman problem (TSP): given a set of cities, and information about travel-times
between pairs of cities, find the shortest route that visits all cities. This is actually an
incredibly…

Often, we use graphs as a structured representation of some kind of information. Many interesting problems involving the things we're representing can then by described in terms of operations over
graphs. Today, I'm going to talk about some of the basic operations that we can do on graphs, and in later posts, we'll see how those operations are used to solve graph-based problems.
The easiest thing to describe is exponents of a graph. Given a graph G=(V,E), for
an integer n, Gn is a graph with the same vertices as G, and with edges between
two vertices v and w if and only if there is a path…

One application of graph theory that's particularly interesting to me
is called Ramsey theory. It's particularly interesting as someone who
debunks a lot of creationist nonsense, because Ramsey theory is in
direct opposition to some of the basic ideas used by bozos to
purportedly refute evolution. What Ramsey theory studies is when some
kind of ordered structure *must* appear, even in a pathologically
chaotic process. Ramsey theory is focused largely on *structures* that
exhibit particular properties, and those structures are usually
represented as graphs.
For example, the most common…

Today, I'm going to talk a bit about two closely related problems in graph theory: the maximal clique detection problem, and the maximal common subgraph problem. The two problems are interesting both on their own as easy-to-understand but hard-to-compute problems; and also because they have so many applications. In particular, the two are used extensively
in bioinformatics and pharmacology for the analysis of complex molecular structure.
As I mentioned back in the definitions post, a clique is a complete graph: a graph where all vertices are adjacent, where any pair of vertices has an edge…

One of the things that I find fascinating about graph theory is that it's so simple, and
yet, it's got so much depth. Even when we're dealing with the simplest form of a graph - undirected graphs with no 1-cycles, there are questions that *seem* like that should be obvious, but which we don't know the answer to.
For example, there's something called the *reconstruction theorem*. We strongly suspect that it's really a theorem, but it remains unproven. What it says is a very precise formal version of the idea that a graph is really fully defined by a canonical collection of its subgraphs.
To…