Understanding the relationship between game theory, jail time and poker will make you a better player

Suppose the police have arrested two people who they know have committed an armed robbery together. Unfortunately, they lack enough admissible evidence to get a jury to convict. They do, however, have enough evidence to send each prisoner away for two years for theft of the getaway car.

The chief inspector now makes the following offer to each prisoner (held in isolation from each other): ‘If you will confess to the robbery, implicating your partner, and he does not also confess, then you’ll go free and he’ll get ten years. If you both confess, you’ll each get five years. If neither of you confesses, you’ll each get two years for the auto theft.’

The question is: what is the best ‘play’ for each of the prisoners? Where does this leave them? This problem is a great place to begin the process of breaking complex situations of human interaction down into manageable maths-based models. At first glance you might be overwhelmed at trying to figure this problem out. And that’s fine: your initial confusion is not a result of the problem’s actual complexity (or lack thereof, as you will soon see).

Rather, you are probably doing what everyone does unless they know otherwise: you are using what I call the ‘brute force’ method. This refers to the stubborn, trial-and-error method of problem solving. You try to 1) grasp all of the potential solutions, then 2) recall their results once you are done, so that you can 3) compare the results and choose the best answer. This method is inefficient – and while it is manageable for this sort of two-person, two-option problem, once you add players and variables to the game this method becomes increasingly frustrating.

### Get your game on

So, we turn to the philosophical underpinnings of game theory (GT): that every complex situation can be broken down into a framework that is manageable and simple. The goal is then to analyse this framework to determine the best play (or ‘optimal solution’ in GT terms).

There are many types of games, and hence many types of GT frameworks that might apply. This game is a 2×2 (two-player, two-option), simultaneously played (as opposed to sequentially played) game of complete information (where each player knows all the options and payoffs available to the other player when he has to act). These types of games are most easily analysed using a matrix, as shown in figure 2.

As you can see, your options are now represented on a matrix where each cell represents an outcome based on the combined results of the potential decisions that are available to each player. The numbers inside each cell represent the ‘payouts’ for each player: (Player 1/Player 2). The payouts in this problem are ‘years in jail’ – and the object of the game for each player is therefore to minimise his/her payout (years in jail). For purposes of analysis, I will refer to the upper left quadrant as A, the upper right quadrant as B, the lower right quadrant as C, and the lower left as D, as shown in figure 1.
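The same matrix can be written down as a small lookup table. The Python sketch below is only an illustration: the jail terms come from the inspector's offer, while the names `PAYOFFS`, `QUADRANTS`, `CONFESS` and `SILENT` are labels of my own. It also assumes, as the quadrant descriptions imply, that Player 1's options run across the columns and Player 2's down the rows.

```python
# Years in jail for each pair of choices, written as (Player 1, Player 2).
# The jail terms are taken from the inspector's offer.
CONFESS, SILENT = "confess", "stay silent"

PAYOFFS = {
    # (Player 1's choice, Player 2's choice): (Player 1's years, Player 2's years)
    (CONFESS, CONFESS): (5, 5),
    (SILENT, CONFESS): (10, 0),
    (SILENT, SILENT): (2, 2),
    (CONFESS, SILENT): (0, 10),
}

# Quadrant labels as used in the text. Assumption: Player 1's options are
# the columns (left = confess) and Player 2's are the rows (top = confess).
QUADRANTS = {
    "A": (CONFESS, CONFESS),  # upper left
    "B": (SILENT, CONFESS),   # upper right
    "C": (SILENT, SILENT),    # lower right
    "D": (CONFESS, SILENT),   # lower left
}

for label, (p1_choice, p2_choice) in QUADRANTS.items():
    p1_years, p2_years = PAYOFFS[(p1_choice, p2_choice)]
    print(f"Quadrant {label}: P1 {p1_choice}, P2 {p2_choice} -> ({p1_years}/{p2_years})")
```

Keying each cell by both players' choices keeps the 'combined decision' idea explicit: no outcome exists until both choices are known.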

Clearly, Player 1 would prefer to end up in Quadrant D where he gets zero years in jail. Unfortunately (for both players, as it turns out), Player 2 would most prefer to end up in Quadrant B where he gets zero years in jail.

Now, let us continue by analysing the options for Player 1. Begin by assuming that Player 1 knows Player 2 will confess. In that case, Player 1’s decision matrix is much simpler, as shown in figure 3.

In this situation, Player 1 should confess so that he only gets five years in jail as opposed to ten. Then, let’s analyse the other scenario – say, instead, Player 2 would certainly NOT confess. Now, Player 1 faces the decision matrix shown in figure 4.

If this were the case, Player 1 should still confess: he would end up getting off without jail time as opposed to spending two years in the slammer!

Now, once you combine the analyses above, you realise that Player 1 should ALWAYS confess, as regardless of what Player 2 does, Player 1 is better off confessing.
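This two-step check (fix Player 2's choice, then pick Player 1's best reply) can also be run mechanically. The Python sketch below is a rough illustration rather than a general solver: `P1_YEARS` and `best_response` are hypothetical names, and the numbers are Player 1's jail terms from the inspector's offer.

```python
# Player 1's years in jail, keyed by (Player 1's choice, Player 2's choice).
P1_YEARS = {
    ("confess", "confess"): 5,
    ("silent", "confess"): 10,
    ("confess", "silent"): 0,
    ("silent", "silent"): 2,
}
OPTIONS = ("confess", "silent")


def best_response(p2_choice):
    """Return the choice that minimises Player 1's jail time, given Player 2's choice."""
    return min(OPTIONS, key=lambda p1_choice: P1_YEARS[(p1_choice, p2_choice)])


# Fix Player 2's choice, then find Player 1's best reply to it.
responses = {p2_choice: best_response(p2_choice) for p2_choice in OPTIONS}
print(responses)

# If the best reply is the same whatever the opponent does, that choice
# is what GT calls a dominant strategy.
is_dominant = len(set(responses.values())) == 1
print("Player 1 has a dominant strategy:", is_dominant)
```

The point of the sketch is that each sub-problem has only two cells to compare, which is why this approach scales better than the 'brute force' method.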

Don’t forget that the other player is in the same boat! So, he will eventually realise his best play is to confess as well. This will result in the outcome shown in figure 5.

As you can see, the players will end up in Quadrant A with five years of jail time apiece – not exactly either player’s initial optimal scenario. Now here comes the interesting part: both are worse off at ‘A’ than they would be at ‘C’ (with two years in jail apiece), but neither will take the risk (ten years’ jail time) to end up there.
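To see why the players are stuck at A, it helps to test every quadrant for stability: could either player cut his own sentence by changing only his own choice? The Python sketch below runs that test using the jail terms from the inspector's offer. The names `YEARS` and `is_stable` are hypothetical. In GT literature, a cell that passes this test is known as a Nash equilibrium.

```python
# (Player 1's choice, Player 2's choice) -> (Player 1's years, Player 2's years)
YEARS = {
    ("confess", "confess"): (5, 5),   # Quadrant A
    ("silent", "confess"): (10, 0),   # Quadrant B
    ("silent", "silent"): (2, 2),     # Quadrant C
    ("confess", "silent"): (0, 10),   # Quadrant D
}
OPTIONS = ("confess", "silent")


def is_stable(p1, p2):
    """True if neither player can reduce his own jail time by switching alone."""
    p1_years, p2_years = YEARS[(p1, p2)]
    p1_can_improve = any(YEARS[(alt, p2)][0] < p1_years for alt in OPTIONS)
    p2_can_improve = any(YEARS[(p1, alt)][1] < p2_years for alt in OPTIONS)
    return not (p1_can_improve or p2_can_improve)


stable_cells = [cell for cell in YEARS if is_stable(*cell)]
print("Stable outcomes:", stable_cells)

# Combined jail time at the stable outcome versus mutual silence.
print("Total years if both confess:", sum(YEARS[("confess", "confess")]))
print("Total years if both stay silent:", sum(YEARS[("silent", "silent")]))
```

Only Quadrant A survives the test, even though Quadrant C would cost the pair four years in total rather than ten.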

This problem (referred to as the ‘Prisoner’s Dilemma’) is commonly cited in GT literature to demonstrate the inherent problems of non-cooperative problem solving. Often, when players refuse to engage in cooperation, they end up far worse off than they would be otherwise.

However, cooperation requires trust. In this case, in order to get to Quadrant C, both players would have to trust that the other player would refuse to confess. This is hard to do when you know that the other player could easily defect and lie to you so that you don’t confess, when all along he merely intends to confess and get off without jail time (leaving you behind bars for ten years!).

### Back to the real world

So, what does this have to do with poker? To begin with, this is an easy way to become familiar with creating frameworks and matrices to analyse complex problems. We will build on this foundation significantly in the coming months in order to most accurately analyse the multifaceted game of poker. Nonetheless, we are left with at least the following conclusions for now:

1. In order to make optimal decisions, a player must take into account the possible actions of his opponent and systematically determine his best response – all the while recalling that his opponent is doing the same!

2. **Competition fosters situations which often result in all players being worse off in the end (out of their rational attempts to act optimally) than they would be if they had cooperated – but cooperation requires trust (which is a rare commodity in games where players have a lot at stake).**

Moreover, I will leave you with poker’s very own prisoner’s dilemma: whether or not to ‘check it down’ with an opponent when a third opponent is all-in and you are playing a tournament in which every single elimination means more prize money for the players that remain. Here is the dilemma: generally, in this situation there is a pretty big pot and therefore a lot to gain by betting and trying to push the other opponent out of the pot (so that you only have to best the one opponent who is already all-in to win).

However, there is even more to gain if you can eliminate the all-in player – something that is more likely to happen if you do not bet to push the other player out of the hand. There is nothing worse than betting and pushing a player out of a pot only to find that you lose to the all-in player anyhow – and that the player you ousted would have eliminated this opponent! Remember, had the player you pushed out stayed in and won the pot, you would still have lost the hand, but the all-in player’s elimination would have put you one step closer to winning the tournament.

The trouble is that it is difficult (and requires trust) to ensure that the other player will cooperate and ‘check it down’ to the river so as to maximise the odds of eliminating the all-in player. Without some sort of assurance, you run the risk of losing out in the long run as you get pushed out of these types of pots and fail to even the score by doing the same every so often yourself.

Interestingly, poker players have evolved over time to have an unspoken understanding that when a third player is ‘all-in’, the remaining players will check it down unless they have a very strong hand that’s almost certainly going to beat the all-in player’s hand, and for which they will understandably try to get additional value from the other opponent. (See page 52 for more on this dilemma.)

Well, I hope this lesson has whetted your appetite for more game theory because this is only the very tip of the proverbial iceberg. For now, practise this way of thinking while you are at the table:

- Begin to think about where you want to end up in each ‘game’ you play. Note, however, that a ‘game’ can be as broad as a tournament on the whole or as specific as a single hand that you play.

- Start the process of reducing your choices into manageable and concrete options: check, call, raise, be aggressive, be tight, be loose, be passive, show strength, show weakness, and so on.

- Then, envision each option as adding a column to your decision ‘matrix’ like the one earlier.

- If you get comfortable doing that, start to think about the other player(s) at the table.

- You must assume that everyone is rational and is attempting to maximise his chip count (or payoff).

- Then, analyse the options available to your opponents in particular situations.

Oh – one last thing. The smartest of the bunch in our problem above was the cop! He masterminded a scenario to ensure he’d get the most combined jail time for the prisoners. Now, I ask you: in poker, who are the ‘cops’?