Using Graphs to Represent Information: Lattices and Semi-Lattices

By goodmath on September 18, 2007.

There's a kind of graph which is very commonly used by people like me for analysis applications, called a lattice. A lattice is a graph with special properties that make it
extremely useful for representing information in an analysis system.

I've mentioned before that you can use a graph G=(V,E) to describe a partial order over its vertices, where given two vertices a,b∈V, a<b if and only if either (a,b) ∈E, or there exists
some vertex c∈V such that a<c and c<b. If this is true, G is called a partial order graph (POG).

If we look at POGs, we can create a kind of completion property for describing certain POGs, without
causing them to become total orders, called a greatest lower bound property. Suppose we've got two
vertices a and b, where a is not less that b, and b is not less than a. In a graph with
the GLB property, there is some set of vertices which are the largest vertices according to the
graph partial order, and which are less than both a and b. This set, called the greatest lower bound set (GLB set) of a and b will consist of vertices that are not ordered relative to each other in the graph partial order. For example, if you look at the graph to the right, you can J and C are both GLBs of A and B - but J is not less that C, and C is not less than J. They're not comparable. But they, in turn, have their own GLB - D is the GLB of C and J.

Given a POG P, if for any any pair of vertices a and b in P, there is exactly one member in GLB(a,b), then the graph is called a semi-lattice. The example graph above is not a semi-lattice: the GLB(A,B) contains two elements. The graph to the right is a semi-lattice.

There's a symmetric notion - the least upper bound. You can guess it's definition. If a graph
has both a LUB and a GLB, then it's called a lattice.

Why are lattices so important?

A lot of the work that I've done has been on program analysis of one kind or another. Program
analysis is pretty typical of how lattices are used. When you're doing an analysis of some kind on a program, at some location in the program, you have some knowledge about the properties of the program
at that point. Then you reach a conditional. At the end of one branch of the conditional, what
you know can be represented as some value T. At the end of the other branch of the conditional, what
you know can be represented as some value F. What do you know after the two branches join, at the first statement after the conditional? What you know is the greatest lower bound of T and F.

Suppose that you know that statement A and statement B are completely independent. After A, you know some set of facts F_A; after statement B, you know some set of facts F_B. What
do you know after both A and B? You know the least upper bound of F_A and
F_B.

For a program analysis to be sound and conservative - which is a fancy way of saying "absolutely sure that its results are correct" - the set of values that can be produced by the analysis must, at least, be a semi-lattice.

To give you a better sense, let's look at an example.

def foo(a,b):
   x = 13
   y = x*2
   z = x * a
   if (b < z):
      z = y*z
   else:
      x = sqrt(y)
   q = x/11      
   return z

Suppose we want to see what we can replace with constants in the code above. After the first
statement, X is a constant: {Const(x,13)}. After the second statement, we can say that y is a constant,
too - it's a constant times a constant. {Const(x,13), Const(y,26)}. After the third statement, we're x and
y are still constant; z isn't: {Const(x,13), Const(y,26), Var(z)}.

Now we start the conditional. In the true branch, we end up with: {Const(x,13), Const(y,26×26),
Var(z)}. In the false branch, we wind up with {Const(x,sqrt(26)), Const(y,26), Var(z)}.

After the conditional, what do we have? It's not just something like set union, or set intersection,
or even set difference. It's something specific to this kind of information - the lower bound of the two
branches. We lose X as a constant - because it could be one of two different constants, based on the value
of the parameter b, which we don't know. What we end up with is {Var(x),Const(y,26),Var(z)} - the lower
bound of the two branches in the semi-lattice.

This is part of a common, general pattern in all sorts of analysis, from programs, to planning, to
knowledge representation. When we combine two sets of information, we can either know that both are true
(in which case we combine them using the upper bound), or we can know that one or the other is
true (in which case we combine them using the lower bound). By making sure that the values form
a lattice, and using the lower and upper bounds in the lattice to combine information, we know that our results are safe in the sense that we'll never claim to know something that we don't.

What that means for program analysis, in practical terms, is that we can perform optimizations - make changes to the program to make it more efficient - based on information from the analysis: if the analysis says that the optimization won't change the results of the program, then we can be absolutely sure that it won't. We call that being conservative: we only do things that we can be absolutely sure
won't change the meaning of the program. That's obviously a pretty critical thing in a compiler. Proving that the set of values used by the analysis form a lattice is by far the easiest way of showing that the analysis is conservative.

More like this

Shortest Paths and Greedy Relaxation

Another really fun and fundamental weighted graph problem is the shortest path problem. The question in: given a connected weighted graph G, what is the shortest path between two vertices v and w in G? To be precise, the weight of a path is just the sum of the weight of the edges that make up the…

Graph Contraction and Minors

Another useful concept in simple graph theory is *contraction* and its result, *minors* of graphs. The idea is that there are several ways of simplifying a graph in order to study its properties: cutting edges, removing vertices, and decomposing a graph are all methods we've seen before.…

What I Do When I'm Not Doing This Blog

From the department of self-promotion, let me call attention to the current volume of The Electronic Journal of Combinatorics. If you click on the link and scroll down to P164, you will find a barn-burning, rhetorical masterpiece of a paper entitled, “Isoperimetric Numbers of Regular Graphs of…

The Kripke Model for Intuitionistic Logic

As promised, today, I'm going to show the Kripke semantics model for intuitionistic logic. Remember from yesterday that a Kripke model has three parts: (W, R, :-), where W is a set of worlds; R is a transition relation between worlds; and ":-" is a forces relation that defines what's true in a…

That is extremely interesting.

Why are lattices so important?

Because they're categories with (binary) products and coproducts, of course :D

One might add a piece of handy notation for lattices - in analogy with the notation of set theory: Aâ§B for GLB(A,B) and Aâ¨B for LUB(A,B). These are associative operators. (Exercise for the reader: Is either distributive over the other, as is the case in set theory?)

Cool. Now, how do you handle loops?

Jon L:

Loops are handled using fixed point operators. You can model the effect of a loop as a delta against the information state at the entry to the loop. A lattice guarantees that applying
that delta repeatedly will reach a fixed point. The fixed point is the state at loop exit.

Of course, this also applies to project dependency graphs as well, which can be similarly optimized to minimize the overall dependency graph or determine whether a particular dependency-inducing change will cause bad stuff to happen.

Alex:

Yes, dependency graphs - both program dependency graph representations, and build tool dependency graphs can be arranged in a lattice, and lots of systems rely on that fact.

A006966 Number of lattices on n unlabeled nodes.

http://www.research.att.com/~njas/sequences/A006966

COMMENT

Also commutative idempotent monoids. Also commutative idempotent semigroups of order n-1.

n a(n)
01
11
21
31
42
55
615
753
8222
91078
105994
1137622
12262776
132018305
1416873364
15152233518
161471613387
1715150569446
18165269824761

What if I said this seemed like some pretty interesting stuff, and that I wanted to learn more; where would you point me? I guess I could Google: lattice graphs, partial order graphs, etc., but do you have a specific recommendation? Thanks for the really interesting post.

Clint:

I don't have a specific recommendation. My main source is class notes from a seminar course on static analysis for compilers given by my PhD advisor. I don't have any book
that's specifically on lattice theory.

Any good text on graph theory should have some lattice theory in it. Knowledge representation texts would also likely have some interesting material on the topic.

There's a really nice example of a lattice, that is pretty partly because of its ubiquity and because it turns numbers into pretty pictures. Basically if you take any integer and form a poset consisting of all the numbers that divide it. And use the divisibility relation to form the ordering, then draw a hasse diagram of the poset, Voila you get a lattice with the original number at the top and 1 at the bottom. Each path from bottom to top represents a different way to factor the number. There's a picture on wikipedia (because I think my explantion kind of sucks).

An appreciative reader, here. Seems like if you reverse the arrows in the first figure, then the GLB set becomes what I've come to know as the dominance frontier. Is that so?

Advertisment

Donate

ScienceBlogs is where scientists communicate directly with the public. We are part of Science 2.0, a science education nonprofit operating under Section 501(c)(3) of the Internal Revenue Code. Please make a tax-deductible donation if you value independent science communication, collaboration, participation, and open access.

You can also shop using Amazon Smile and though you pay nothing more we get a tiny something.

Science 2.0

Science Codex

More by this author

Moving on

August 2, 2010

Finally, at long last, I can tell you what I've been up to with finding a new home for this blog. I've created a new, community-based science blogging site, called Scientopia. With the help of many wonderful people, we're ready. We launched this morning. So to continue following GM/BM - along with…

Goodbye, Scienceblogs

July 7, 2010

So my decision is made. I'm closing up around here. I'm in the process of working out exactly where I'm going to go. With any luck, Seed will leave this blog here long enough for me to post an update with the new location. But I'm through with Seed and ScienceBlogs.

Seed, Conflicts of Interest, and Sleaze

July 6, 2010

As my friend Pal wrote about, Seed Media Group, the corporate overlords of the ScienceBlogs network that this blog belongs to, have apparently decided that blog space in these parts is now up for sale to advertisers. We've been advertiser supported since I joined up with SB. I've never minded…

Searching for Topics

June 28, 2010

As regular readers have no doubt noticed by now, posting on the blog has been slow lately. I've been trying to come back up to speed, but so far, that's been mainly in the form of bad math posts. I'd like to get back to the good stuff. Unfortunately, the chaos theory stuff that I was…

Saturday Recipe: Ginger Scallion Sauce

June 26, 2010

Today's recipe is something I made this week for the first time, and trying it was like a revelation. It's simple to make, it's got an absolutely spectacularly wonderful flavor - light and fresh - and it's incredibly versatile. It's damned near perfect. It's scallion ginger sauce, and once you try…