Familiar with prisoner’s dilemna? It’s a game, played between two people, with two responses available to each side, making for a total of 4 possible outcomes. The context is that of two accomplices caught by the police, and the choice is to either stay silent or confess. Outcomes are as follows (from the wikipedia page):
|
B stays silent |
B confesses |
A stays silent |
Each serves 6 months |
A serves 10 years
B goes free |
A confesses |
B serves 10 years
A goes free |
Each serves 5 years |
What should one do as A? In this case it is easy to decide, because no matter what B does, A ends up either a better result if A confesses. Thus A will confess. Symmetrically, B will also confess, and we predict the lower-right outcome.
Now consider what happens if we know we are going to play the game twice. Naively, you may think that B now has 4 possible strategies, (silent-silent, confess-silent, silent-confess, confess-confess). In reality, B has 8 possible strategies, because on the second round he would have seen A’s first round move, and thus the second round response can be conditioned on A’s first round move, which means a total of 4 ways of responding in the second round. That, combined with the first round’s 2 ways, makes for a total 8 possible strategies.
Let’s consider one of those 8 strategies, where B will first remain silent, and then imitate A’s first move for the second round. In this case, if A follows the same strategy, they will end up with silent plays the whole way, and get away with serving one year each (2 x 6 months). What if A decides to confess and then confess? In that case, the result is (confess,silent) for the first round, and (confess,confess) for the second round. A then serves 5 years time, while B serves 15 years time. Note that even though A does better than B, it does considerably worse than the 1 year in the other scenario.
I can encode all 8 possible 2-game strategies using 3 bits: (round 1 response, round 2 response if opponent remained silent, round 2 response if opponent confessed) and this would be the resulting payoff table:
|
SSS |
SSC |
SCS |
SCC |
CSS |
CSC |
CCS |
CCC |
SSS |
1.0,1.0 |
1.0,1.0 |
10.5,0.5 |
10.5,0.5 |
10.5,0.5 |
10.5,0.5 |
20.0,0.0 |
20.0,0.0 |
SSC |
1.0,1.0 |
1.0,1.0 |
10.5,0.5 |
10.5,0.5 |
10.0,10.0 |
10.0,10.0 |
15.0,5.0 |
15.0,5.0 |
SCS |
0.5,10.5 |
0.5,10.5 |
5.5,5.5 |
5.5,5.5 |
10.5,0.5 |
10.5,0.5 |
20.0,0.0 |
20.0,0.0 |
SCC |
0.5,10.5 |
0.5,10.5 |
5.5,5.5 |
5.5,5.5 |
10.0,10.0 |
10.0,10.0 |
15.0,5.0 |
15.0,5.0 |
CSS |
0.5,10.5 |
10.0,10.0 |
0.5,10.5 |
10.0,10.0 |
5.5,5.5 |
15.0,5.0 |
5.5,5.5 |
15.0,5.0 |
CSC |
0.5,10.5 |
10.0,10.0 |
0.5,10.5 |
10.0,10.0 |
5.0,15.0 |
10,10 |
5.0,15.0 |
10,10 |
CCS |
0.0,20.0 |
5.0,15.0 |
0.0,20.0 |
5.0,15.0 |
5.5,5.5 |
15.0,5.0 |
5.5,5.5 |
15.0,5.0 |
CCC |
0.0,20.0 |
5.0,15.0 |
0.0,20.0 |
5.0,15.0 |
5.0,15.0 |
10,10 |
5.0,15.0 |
10,10 |
Looking at the table, we realize that “always confess”, CCC, is still the only Nash equilibrium for the twice-iterated case. For more iterations, however, the “tit-for-tat” startegy eventually becomes another Nash equilibrium. For more reading, go here.