Iterated Zero-Sum Games

By goodmath on May 5, 2008.

The games that we've looked at so far are the simplest case of
basic games. In these games, we've got a payoff matrix, and both players
can see the whole matrix - the players have equal information,
and nothing is secret. The players move simultaneously - so neither player can wait to see what his opponent does before making his own decision. Finally, the game is played exactly once - there are no repetitions.

The first complication we can add is to make it an iterative game - that is, instead of each player going once, they'll repeat the game 10 times in sequence. Everything else stays the same: perfect, equal information, simultaneous moves, etc.

This creates an interesting added dimension. Now, we've got two layers
of strategy: each iteration, each player selects a strategy; and then there's an over-arching strategy that they use to select their strategy each iteration. That over-arching strategy is called a grand strategy

In an iterated game, the optimal solution for players can be different than in the single iteration. The goal is to maximize your total utility at the end of a series of iterations. It's OK if some iterations result in a loss of utility, so long as by the end of the series of iterations, the final
sum of the utility of the series is maximal.

For example, look at the following game. (This example is copied
from the textbook The Compleat Strategyst (Complete Strategist): Being A Primer On The Theory Of Games Of Strategy.)

	A	B
1	3	6
2	5	4

Player 1 needs to choose either 1 or 2; player 2 needs to choose A or B. For simplicity, this is set up so that player 2 always pays player 1, so the utility values in the game grid are written as payoffs to player 1. (Originally, I didn't include an explanation of the game grid payoffs. This confused a lot of readers. Sorry!)

There's no saddle point here. The maximin for player 1 is 4 (at 2,B); the
minimax for player 2 is 3 (at 1,A). So the simple solution formula from before
doesn't work. If we're only playing the game once then there's no way for
either player to create an idea outcome for themselves. Look at it from the
perspective of player 1. If player 2 picks strategy A, then the best thing for
player 2 to do is to pick strategy 2. But if player 2 picks B, then player 1
is better off picking A. Since the players move simultaneously, player
1 cannot pick the best strategy.

If the game is played once, then there's no way to play optimally. It's
basically random. But if you repeat the game multiple times - that
is, you play it as an iterated sequence of games - then you can find an
optimal strategy for selecting strategies. That's called a grand
strategy: in a game where you make multiple moves, each move is a
strategy, and the process of selecting strategies for each move is a
grand strategy.

There is a successful grand strategy in this kind of iterated
zero-sum game. What's interesting about it is that our idea of
strategy is not really what works as a successful grand strategy: the successful grand strategy is random.

After all: if player 1 can figure out player 2's grand strategy,
then he can easily pick an optimal strategy. So the best grand strategy
is one that the other player can't figure out. If you're playing randomly, then there's no way that the other player can predict
what strategy you choose.

So, let's look at an example of a series of iterations, with player 1 selecting randomly, with a uniform probability distribution, and player
2 playing minimax:

Iteration one: Player one selects 1, player two selects A. Player one takes 3.
Iteration two: Player one selects 2, player two selects A. Player one
wins 5.

It will continue like that. Player one will win 3 half the time, and 5
the other half - for an average of 4 - which is his maximin value. So
playing this way is at least as good for player 1 as maximin.

This is a pretty trivial game: for player 2, he's guaranteed to
lose, and the only question is how quickly he's going to lose. And
there's not any strategy that's going to really improve things much for player 2. There's not much that 2 can do: playing strategy B is pretty much a lose for
him. So why would he ever do it?

If player 1 knows that player 2 is always going to play A, then he
could play strategy 2, and always win 5. So player two is motivated to
play B just often enough to make player one try 1 once in a while. The
key is, how often should he do that? What's the best distribution for him,
to minimize his losses?

As you can see from the example, the key to finding the optimal strategy is
probability. You need to assign a probability distribution to the different
strategies. Each iteration, you pick a strategy randomly, according to the
distribution. If you can find the right probability distribution for your grand strategy, then you can optimize your winnings.

But is there always an optimal distribution? And how can you find it?

There is always an optimum. John vonNeumann (who managed to have his
fingers in an astonishing number of pies) proved that for every two player,
simultaneous move zero-sum iterated game, there is an optimal grand strategy based
on a probability distribution. And it's computable. It took a while after
JvN to figure out the fast way of doing it - but it's doable, and it's doable
quickly, through a technique called linear programming.

Linear programming is a fascinating topic in itself. And it's complicated
enough that it deserves a post of its own. (Or, more likely, several.)

More like this

Hoops Hypothesis

Hypothesis: The outcome of any pick-up basketball game depends more strongly on the match-up between the two worst players on each team than the match-up between the two best players on each team.

Zero Sum Games

In game theory, perhaps the most important category of simple games is something called zero sum games. It's also one of those mathematical things that are widely abused by the clueless - you constantly hear

Computers Are People Too, And Must Be Punished For Their Indiscretions

For several years, researchers have been contrasting human-human and human-computer interactions in order go gain more insight into theory of mind. The assumption is that people don't treat computers like, well, people. It's not a totally unfounded assumption, either.

Fat Footballers Face Fleeting Future

Folks have been enthusiastically commenting around the clock on the possiblity of using prisoners in clinical trials.

Very interesting. I didn't know the bit about an optimum always being computable. Hopefully we'll get a chance to hear more about that!

I was a little confused at the start about the table used, since it has 1 and 2 as names for both strategies and players. Maybe use X and Y for the strategies, to mimic A and B?

I think I forgot to read "Player 1 needs to choose either 1 or 2; player 2 needs to choose A or B." That said, there are likely other readers who will do the same and decide to stop reading when they get stalled.

Maybe the payoff matrix is rendered wrong for me or something as the whole thing is incredibly compressed and hard to read (using Firefox 3), but I see only a single payoff value for each entry, not a pair, so I really don't understand the explanation here (the previous game theory post suffers from the same thing). I guess the values I see are for player 1 but I don't really know.

@Janne: This is a zero sum game, so the number given is the payoff Player 1 recieves, and thus the sum Player 2 loses.

@MCC: I would state this explicitly, as I got confused at the same point, and suspect many others will.

OK, thanks; there was no info on it being a zero-sum game, and as the matrix is rendered so compressed I wasn't sure what I was looking at.

If you have no idea about linear programming, couldn't you just code the game, and have the computer try various probabilities and see which work? Computers are pretty fast. It might be quicker than actually knowing what's going on.

Advertisment

Donate

ScienceBlogs is where scientists communicate directly with the public. We are part of Science 2.0, a science education nonprofit operating under Section 501(c)(3) of the Internal Revenue Code. Please make a tax-deductible donation if you value independent science communication, collaboration, participation, and open access.

You can also shop using Amazon Smile and though you pay nothing more we get a tiny something.

Science 2.0

Science Codex

More by this author

Moving on

August 2, 2010

Finally, at long last, I can tell you what I've been up to with finding a new home for this blog. I've created a new, community-based science blogging site, called Scientopia. With the help of many wonderful people, we're ready. We launched this morning. So to continue following GM/BM - along with…

Goodbye, Scienceblogs

July 7, 2010

So my decision is made. I'm closing up around here. I'm in the process of working out exactly where I'm going to go. With any luck, Seed will leave this blog here long enough for me to post an update with the new location. But I'm through with Seed and ScienceBlogs.

Seed, Conflicts of Interest, and Sleaze

July 6, 2010

As my friend Pal wrote about, Seed Media Group, the corporate overlords of the ScienceBlogs network that this blog belongs to, have apparently decided that blog space in these parts is now up for sale to advertisers. We've been advertiser supported since I joined up with SB. I've never minded…

Searching for Topics

June 28, 2010

As regular readers have no doubt noticed by now, posting on the blog has been slow lately. I've been trying to come back up to speed, but so far, that's been mainly in the form of bad math posts. I'd like to get back to the good stuff. Unfortunately, the chaos theory stuff that I was…

Saturday Recipe: Ginger Scallion Sauce

June 26, 2010

Today's recipe is something I made this week for the first time, and trying it was like a revelation. It's simple to make, it's got an absolutely spectacularly wonderful flavor - light and fresh - and it's incredibly versatile. It's damned near perfect. It's scallion ginger sauce, and once you try…