Seed Media Group

Search this blog

Profile

markcc.jpg
Mark Chu-Carroll (aka MarkCC) is a PhD Computer Scientist, who works for Google as a Software Engineer. My professional interests center on programming languages and tools, and how to improve the languages and tools that are used for building complex software systems.

Other Information

Add this blog to my Technorati Favorites!

Recent Posts

Recent Comments

Categories

Blogroll

Old Topic Indices

Great Online Books

« Suited Assholes on the Subway | Main | Selective Data and Global Warming »

Iterated Zero-Sum Games

Category: goodmath > Game Theory
Posted on: May 5, 2008 10:09 AM, by Mark C. Chu-Carroll

The games that we've looked at so far are the simplest case of basic games. In these games, we've got a payoff matrix, and both players can see the whole matrix - the players have equal information, and nothing is secret. The players move simultaneously - so neither player can wait to see what his opponent does before making his own decision. Finally, the game is played exactly once - there are no repetitions.

The first complication we can add is to make it an iterative game - that is, instead of each player going once, they'll repeat the game 10 times in sequence. Everything else stays the same: perfect, equal information, simultaneous moves, etc.

This creates an interesting added dimension. Now, we've got two layers of strategy: each iteration, each player selects a strategy; and then there's an over-arching strategy that they use to select their strategy each iteration. That over-arching strategy is called a grand strategy

In an iterated game, the optimal solution for players can be different than in the single iteration. The goal is to maximize your total utility at the end of a series of iterations. It's OK if some iterations result in a loss of utility, so long as by the end of the series of iterations, the final sum of the utility of the series is maximal.

For example, look at the following game. (This example is copied from the textbook The Compleat Strategyst (Complete Strategist): Being A Primer On The Theory Of Games Of Strategy.)

AB
136
254

Player 1 needs to choose either 1 or 2; player 2 needs to choose A or B. For simplicity, this is set up so that player 2 always pays player 1, so the utility values in the game grid are written as payoffs to player 1. (Originally, I didn't include an explanation of the game grid payoffs. This confused a lot of readers. Sorry!)

There's no saddle point here. The maximin for player 1 is 4 (at 2,B); the minimax for player 2 is 3 (at 1,A). So the simple solution formula from before doesn't work. If we're only playing the game once then there's no way for either player to create an idea outcome for themselves. Look at it from the perspective of player 1. If player 2 picks strategy A, then the best thing for player 2 to do is to pick strategy 2. But if player 2 picks B, then player 1 is better off picking A. Since the players move simultaneously, player 1 cannot pick the best strategy.

If the game is played once, then there's no way to play optimally. It's basically random. But if you repeat the game multiple times - that is, you play it as an iterated sequence of games - then you can find an optimal strategy for selecting strategies. That's called a grand strategy: in a game where you make multiple moves, each move is a strategy, and the process of selecting strategies for each move is a grand strategy.

There is a successful grand strategy in this kind of iterated zero-sum game. What's interesting about it is that our idea of strategy is not really what works as a successful grand strategy: the successful grand strategy is random.

After all: if player 1 can figure out player 2's grand strategy, then he can easily pick an optimal strategy. So the best grand strategy is one that the other player can't figure out. If you're playing randomly, then there's no way that the other player can predict what strategy you choose.

So, let's look at an example of a series of iterations, with player 1 selecting randomly, with a uniform probability distribution, and player 2 playing minimax:

  • Iteration one: Player one selects 1, player two selects A. Player one takes 3.
  • Iteration two: Player one selects 2, player two selects A. Player one wins 5.

It will continue like that. Player one will win 3 half the time, and 5 the other half - for an average of 4 - which is his maximin value. So playing this way is at least as good for player 1 as maximin.

This is a pretty trivial game: for player 2, he's guaranteed to lose, and the only question is how quickly he's going to lose. And there's not any strategy that's going to really improve things much for player 2. There's not much that 2 can do: playing strategy B is pretty much a lose for him. So why would he ever do it?

If player 1 knows that player 2 is always going to play A, then he could play strategy 2, and always win 5. So player two is motivated to play B just often enough to make player one try 1 once in a while. The key is, how often should he do that? What's the best distribution for him, to minimize his losses?

As you can see from the example, the key to finding the optimal strategy is probability. You need to assign a probability distribution to the different strategies. Each iteration, you pick a strategy randomly, according to the distribution. If you can find the right probability distribution for your grand strategy, then you can optimize your winnings.

But is there always an optimal distribution? And how can you find it?

There is always an optimum. John vonNeumann (who managed to have his fingers in an astonishing number of pies) proved that for every two player, simultaneous move zero-sum iterated game, there is an optimal grand strategy based on a probability distribution. And it's computable. It took a while after JvN to figure out the fast way of doing it - but it's doable, and it's doable quickly, through a technique called linear programming.

Linear programming is a fascinating topic in itself. And it's complicated enough that it deserves a post of its own. (Or, more likely, several.)

Comments

#1

Very interesting. I didn't know the bit about an optimum always being computable. Hopefully we'll get a chance to hear more about that!

I was a little confused at the start about the table used, since it has 1 and 2 as names for both strategies and players. Maybe use X and Y for the strategies, to mimic A and B?

I think I forgot to read "Player 1 needs to choose either 1 or 2; player 2 needs to choose A or B." That said, there are likely other readers who will do the same and decide to stop reading when they get stalled.

Posted by: pg | May 5, 2008 6:31 PM

#2

Maybe the payoff matrix is rendered wrong for me or something as the whole thing is incredibly compressed and hard to read (using Firefox 3), but I see only a single payoff value for each entry, not a pair, so I really don't understand the explanation here (the previous game theory post suffers from the same thing). I guess the values I see are for player 1 but I don't really know.

Posted by: Janne | May 5, 2008 8:34 PM

#3

@Janne: This is a zero sum game, so the number given is the payoff Player 1 recieves, and thus the sum Player 2 loses.

@MCC: I would state this explicitly, as I got confused at the same point, and suspect many others will.

Posted by: alan | May 6, 2008 6:41 AM

#4

OK, thanks; there was no info on it being a zero-sum game, and as the matrix is rendered so compressed I wasn't sure what I was looking at.

Posted by: Janne | May 7, 2008 9:57 PM

Post a Comment

(Email is required for authentication purposes only. Comments are moderated for spam, your comment may not appear immediately. Thanks for waiting.)





Having problems commenting? (UPDATED)

Search All Blogs

Blogs in the Network

Top Five: Readers' Picks

Top Science Stories

powered by SEED - seedmagazine.com