Vanessa Rousso Game Theory. Poker Player

You must find the perfect balance between deceiving your opponents and cooperating with them

We’ve previously discussed the ‘prisoner’s dilemma’ as a way of introducing some basic game-theoretical insights. Recall the matrix in Figure 1, which shows the years in jail for two players, in the form (Player 1/Player 2), for each possible outcome of the game. Remember that with a decision of the type represented in Figure 1, Player 1 should always confess. That is because, regardless of what Player 2 does, Player 1 is better off confessing. Furthermore, since the players’ payouts mirror one another, Player 2 will come to the same realisation as Player 1: that he, too, ought always to confess.

The result is as follows: each player’s attempt to act in his own best interest leads both to confess, which means five years’ jail time for each. Clearly, both players would prefer the outcome depicted by the lower right quadrant, where neither confesses and they each get only two years in jail. However, neither player will risk not confessing without steadfast assurances that the other player will do the same – otherwise, he risks being left incarcerated for ten years should the other player defect from the agreement.
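The reasoning above can be checked with a minimal sketch. The five, two and ten-year sentences are the ones quoted in the text; the sentence handed to a player who confesses while the other does not is not stated here, so the value of 0 years below is an assumption (any value under two years gives the same answer). The names used in the code are hypothetical.

```python
# Years in jail as (Player 1, Player 2); fewer years is better.
CONFESS, SILENT = "confess", "stay silent"

# ASSUMPTION: the sentence for a lone confessor is not given in the text.
LONE_CONFESSOR_YEARS = 0

YEARS = {
    (CONFESS, CONFESS): (5, 5),                     # both confess
    (CONFESS, SILENT): (LONE_CONFESSOR_YEARS, 10),  # only Player 1 confesses
    (SILENT, CONFESS): (10, LONE_CONFESSOR_YEARS),  # only Player 2 confesses
    (SILENT, SILENT): (2, 2),                       # neither confesses
}


def best_reply_for_player_1(player_2_choice):
    """Return the choice that minimises Player 1's jail time."""
    return min(
        (CONFESS, SILENT),
        key=lambda mine: YEARS[(mine, player_2_choice)][0],
    )


for other in (CONFESS, SILENT):
    reply = best_reply_for_player_1(other)
    print(f"If Player 2 chooses to {other}, Player 1 does best to {reply}")
```

Whatever Player 2 does, the best reply is the same, which is what it means for confessing to be the dominant choice.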

Cooperation nation

The above can be abstracted into a prototype model for situations where each person has an incentive to act in a certain way, yet everyone ends up worse off if every similarly situated person acts that way. These dilemmas abound in the real world. The following two situations illustrate the point:

Let us presume that one day your city announces a water shortage, and city officials ask everyone to limit themselves to no more than one shower a day. Now what if you have had a really long day and find yourself craving a second shower that evening? If you refrain, the benefit to the public at large is imperceptible – but the negative impact on you is plainly (and odorously) evident! If you just ‘sneak in’ that second shower, no one would ever know – and society at large would be fine, right? Well, not if everyone thought that way.

A similar dilemma is posed when it comes to voting. It takes a level of personal investment to get to the polls – it involves traffic queues and the loss of time. And will your one vote really make a difference anyway? Well, it would if everyone thought in the same way and allowed themselves to be deterred from participating in the political process. Indeed, our society’s well-being relies on at least some level of general voter participation.

Society has evolved over time to ‘solve’ the dilemmas inherent in situations such as those explained above. Over many years, we have developed into a state where collective cooperation has become custom – and defections from these customs are relatively rare.

This is not to say that everyone cooperates; common sense dictates that this is not the case. Clearly, in all of these prisoner’s dilemma-type situations, people have the choice to either ‘cooperate’ with or ‘defect’ from the unwritten yet understood pact that exists between humans when they are aware of the common good in acting a certain way.

Moreover, not everyone acts the same all the time – that is, just because someone chooses to ‘defect’ at one point does not necessarily mean that in future situations they will refuse to cooperate. Rather, society is dynamic and in constant flux, in search of what is termed a ‘long-term stable equilibrium’ by game theorists.

A balancing act

Natural evolution provides a very effective framework through which to view the flux between cooperation and defection in prisoner’s dilemma communities. When most of society cooperates, it becomes profitable for a single member to defect and take advantage of the cooperation of the others in his community. Over time, this defector will perform better – other things being equal – than his counterparts. The trait will therefore be selected for in evolutionary terms and, as time goes by and reproduction takes place, it will become more and more predominant within the society.

This continues, and defections spread, until the common state of affairs is negatively impacted; at this point, cooperation becomes the desirable evolutionary trait – spreading and regaining prevalence within the community. Figure 2 illustrates this dynamic process, with blue cells representing cooperators and red cells depicting defectors.

This process is circular and will take place over time again and again until a state of equilibrium is reached (if possible). Some communities remain in constant fluctuation around a state of equilibrium, but are too dynamic to ever actually reach that state. Figure 3 shows us what equilibrium might look like in one of these communities. This can be viewed as both a snapshot of an equilibrium community at large at any random instant and as a mapping of the various strategies that an individually-optimising (or selfish) member will employ over time within the community.
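One way to get a feel for this process is a small simulation. The sketch below is only an illustration under stated assumptions: the payoff numbers are invented for the example, and they are chosen so that defecting pays well while cooperators are plentiful but poorly once defectors are common, in line with the description above. (With strict prisoner’s dilemma payoffs and nothing else in the model, this simple update rule would let defection take over completely.) The update rule is a standard discrete replicator step, in which each strategy’s share grows in proportion to how well it does against the current population.

```python
# Payoff to the row strategy when it meets the column strategy
# (higher is better). "C" = cooperate, "D" = defect.
# ASSUMPTION: these numbers are illustrative only.
PAYOFF = {
    ("C", "C"): 3.0, ("C", "D"): 1.0,
    ("D", "C"): 4.0, ("D", "D"): 0.0,
}


def next_share(x):
    """Advance one generation; x is the share of cooperators (0 < x <= 1)."""
    fit_c = x * PAYOFF[("C", "C")] + (1 - x) * PAYOFF[("C", "D")]
    fit_d = x * PAYOFF[("D", "C")] + (1 - x) * PAYOFF[("D", "D")]
    average = x * fit_c + (1 - x) * fit_d
    # A strategy's share grows when it earns more than the population average.
    return x * fit_c / average


def simulate(x, generations):
    """Return the share of cooperators after the given number of generations."""
    for _ in range(generations):
        x = next_share(x)
    return x


for start in (0.9, 0.1):
    print(f"start with {start:.0%} cooperators -> {simulate(start, 200):.3f}")
```

With these assumed payoffs, a community that starts out mostly cooperative loses cooperators and one that starts out mostly defecting gains them, and both settle at an even mix. This toy version approaches its equilibrium smoothly; the communities described above may instead keep fluctuating around it.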

How does this relate to poker?

In the abstract, poker presents a prisoner’s dilemma of sorts. In order for players to influence their opponents, they have to exhibit at least some level of rationality and reliability; they can’t be effective if their actions are totally random. But this presents a paradox: if a player’s actions are always credible then their cards might as well be face up, as their play will be completely predictable. Surely predictability cannot be profitable as an absolute strategy?

In evolutionary terms, a community of totally cooperative poker players would soon be overtaken by defectors who are able to benefit through deception. On the flip side, a community consisting wholly of deceptive poker players would also be unable to survive very long, as play would be utterly meaningless – and easily exploitable by an invading stream of rational players able to capitalise on the mistakes made by these irrational players.

Ultimately, players are constantly struggling between a desire to deceive their opponents and the need for their opponents to trust that they are acting in accordance with their actual hand. Thus, the poker community exists – like the world at large in our examples – in a state of constant fluctuation around an equilibrium state where cooperation and defection are able to co-exist.

This fluctuating nature of the poker community is highlighted by the complex conversations taking place at the tables, all occurring without the need for players to utter a single word. Some players are saying: ‘See, you can trust me, I always act rationally.’

Others assure: ‘I trust your actions – you have shown your ability to be rational so I will give you credit for your bet.’ Still others defend: ‘I have just deceived you with a bluff – but you understand I have to deceive in order to survive, so you can trust me next time around because you know that I mostly cooperate.’

And others assert: ‘I couldn’t trust you this time around, as I can’t give you credit for being rational any more – I think you are getting too greedy.’ Playing poker becomes an intricate and collective tap dance between players alternating between predictability and deceit.

Mix it up and start again

As players must reconcile the duelling needs of cooperative and deceitful play, optimal strategies are rarely absolute, for obvious reasons. If I always zig when you zag, I become predictable. And in a game of incomplete information such as poker, predictability is a definite downfall. When your play becomes predictable, you allow your opponents to eliminate risk in devising their optimal plays when responding to your informational cues.

So, in poker and other games alike, optimal strategies are often what game theorists refer to as ‘mixed strategies’. A mixed strategy reflects the need for each player to blend cooperative and defecting styles of play rather than commit to either one.
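To see why predictability is so costly, consider a toy guessing game. It is a hypothetical illustration, not a model of poker: each round I either zig or zag, and an opponent who guesses my move correctly wins one unit from me, while a wrong guess costs him one unit. The sketch below assumes the opponent knows how often I zig and always makes the more profitable guess.

```python
def opponent_edge(p_zig):
    """Best expected win per round for an opponent who knows my zig frequency.

    ASSUMPTION: a correct guess wins him 1 unit, a wrong guess loses him 1 unit.
    """
    guess_zig = p_zig * 1 + (1 - p_zig) * -1  # he always guesses zig
    guess_zag = (1 - p_zig) * 1 + p_zig * -1  # he always guesses zag
    return max(guess_zig, guess_zag)


for p in (1.0, 0.8, 0.5):
    edge = opponent_edge(p)
    print(f"zig {p:.0%} of the time -> opponent can win {edge:.2f} per round")
```

The more lopsided my choices, the more the opponent can take from me; only an even mix leaves him nothing to exploit in this particular game. The right mix in a real poker situation will depend on the payoffs involved.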

Indeed, the question of whether a particular situation warrants cooperation or defection is at the heart of how to play poker optimally. Next month I will explain mixed strategies in depth so that you’re equipped to grapple with the issue of when to alter your mix of strategies.