The Theory of Everything

Friday, 22 May 2015

Dara O'Kearney looks at game theory.

A few years ago at the WSOP in Vegas, I ran into Neil Channing. Neil occasionally gets some bad press about being, if not a grumpy old man, then a moany one, but truthfully there are few people I enjoy running into more. Neil's always good for a natter and a yarn, and on this occasion he told me about another English player who had a bit of a side hustle. This involved taking a cash chip, putting it behind his back, and wagering that the mark couldn't guess the correct hand. Once the mark had accepted the wager, he would bring both his hands to the front for the mark to guess left or right. The twist was in the presentation: he would thrust one arm forward so the closed hand was literally right under the nose of the mark, as if daring him to pick it. The other hand was kept well back.

The trick was that the chip was always in the hand held back, rather than the hand under the nose of the mark. 9 out of 10 people guessed wrong. Presumably their thought process ran to "It seems too obvious that he'd pretend the chip is not in the hand right under my nose, so he must put it there thinking I'd guess the other hand", maybe with a side of "If it is in the hand that's right under my nose and I choose the other one, I'll look really stupid".

The most interesting part of the story (for me, at least, as a maths nerd) was that the one guy who guessed right did so not because he outleveled his opponent, but because he used game theory. Game theory sounds like some sort of theory of games, but is actually the area of maths that covers strategic decision making (the name derives from the fact that games are the most obvious source of real life examples of such strategic decisions). One of the goals of game theory is to "solve" any game, in the form of an optimal strategy: one that means that no matter what the opponent does, they cannot beat you long term, and unless they are also following optimal strategy, they will often lose to you long term.

This optimal strategy solution is referred to as "Nash equilibrium". The Nash part derives from mathematician John Forbes Nash (he of A Beautiful Mind fame), who first proposed the concept. The equilibrium part refers to the fact that when two opponents have converged to playing optimally against each other, they have reached an equilibrium such that neither can do any better by diverging from the (Nash) equilibrium strategy on his own. Like two marbles at the top of a hill, there they will remain until one moves to the left or right, and slides down the hill.

Adhering to a Nash equilibrium solution in a game is referred to as unexploitable play, because there is nothing the opponent can do to exploit your strategy. It does not provide a guarantee of winning no matter what; merely a guarantee that you can't lose. Like an Italian football team, the aim here is to draw at worst.

A can't-lose strategy? Great, I hear you say, where do I sign up? But wait, there's a catch (actually two catches). The Nash equilibrium concept works better for simple games like rock paper scissors or "guess the hand" than it does for games as complex as poker.
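To see what "can't lose" means in the simplest of those games, here is a minimal sketch for rock paper scissors. It assumes the usual rules and a payoff of +1 for a win, 0 for a tie and -1 for a loss; the function and strategy names are made up for illustration. It checks that throwing each option one third of the time averages out to exactly zero against whatever the opponent does.

```python
from fractions import Fraction

MOVES = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(mine, theirs):
    """+1 if my move wins, -1 if it loses, 0 for a tie."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def expected_value(my_strategy, their_strategy):
    """Average payoff per game when both players pick moves at random
    according to the given probabilities."""
    return sum(
        my_strategy[m] * their_strategy[t] * payoff(m, t)
        for m in MOVES
        for t in MOVES
    )


# The equilibrium strategy: each move one third of the time.
UNIFORM = {move: Fraction(1, 3) for move in MOVES}

# A few hypothetical opponents to test it against.
OPPONENTS = {
    "always rock": {"rock": Fraction(1), "paper": Fraction(0), "scissors": Fraction(0)},
    "rock heavy": {"rock": Fraction(1, 2), "paper": Fraction(1, 4), "scissors": Fraction(1, 4)},
    "also uniform": UNIFORM,
}

for name, strategy in OPPONENTS.items():
    print(name, expected_value(UNIFORM, strategy))
```

Every line prints 0: the uniform strategy never loses in the long run, but it never wins either, even against someone who throws rock every single time.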

The first problem is that - even if a Nash equilibrium exists for more complex games - we may never know it. In poker, the Nash equilibrium has been solved for a number of very specific, very simple situations, such as: if I'm so short stacked that my only options are shove or fold preflop, which hands should I shove? Nash equilibria can also be computed relatively easily for situations of 20 big blinds or less, to determine the correct hands to fold, raise fold, raise call, and reshove. But that's about as far as it goes for No Limit Hold'em. In the limit arena, it took a team of artificial intelligence experts in Alberta over a decade and several hundred powerful computers working in tandem to compute an equilibrium for heads-up Limit Hold'em. They have gone on record saying they deliberately stuck to heads-up Limit because they believe that once the number of players exceeds two, it becomes impossible to compute a Nash equilibrium (one may not even exist), and once you move from Limit to No Limit and have to allow for multiple bet sizes, even heads-up becomes too difficult to solve.

The second problem is that even where we can find a Nash equilibrium, it's all well and good in a zero-sum game, but poker is rarely a zero-sum game. There are rake and registration fees to be paid, and mouths to feed. If we all stuck to unexploitable Nash equilibria, the only winner long term would be the house (in the form of casinos, live tournament organisers and online sites).

Let's look at a concrete example of this: Player A and Player B get to the river, there's 1000 in the pot, and Player A moves all in for 1000. Player B has to call 1000 to win a pot of 2000, so a call only needs to be right one time in three to break even. If he can beat a bluff, he should call if he believes Player A is bluffing more than a third of the time, but he should fold if Player A bluffs less often than that. Let's say he has no idea how often Player A bluffs. In this case, he reverts to game theory, and calls exactly half of the time when he can beat a bluff. Half is the right number because Player A is risking 1000 to win the 1000 already in the pot, so his bluffs break even when they get called half the time. Assuming Player A is also sticking to Nash equilibrium, exactly one in three of his all-ins will be bluffs.

What if one of them diverges from Nash equilibrium? Let's say Player A decides never to bluff. So now he only (value) bets: he always has it. Since he's never bluffing, neither he nor his opponent profits from bluffs.

Let's go the other way and say Player A decides to always bluff. They still both break even when Player A is bluffing, since he gets called half the time, and wins the pot the other half.

So what's the big deal? Whether he bluffs with the unexploitable frequency of one in three, the scared nit frequency of 0%, or the maniac frequency of 100%, it seems Player A always breaks even on his bluffs, and Player B also breaks even on them by calling precisely half the time. This will remain the case so long as Player B sticks rigidly to the Nash equilibrium of calling half the time. But let's say he eventually notices that Player A is either never bluffing, or always bluffing, and decides to adjust his calling frequency.
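The break-even claim about Player A's bluffs is easy to check with a few lines of arithmetic. This is only a sketch of the river spot described above (1000 in the pot, 1000 all-in); the function name and the idea of measuring a bluff against simply giving up the pot are my own framing, not anything official.

```python
from fractions import Fraction

POT = 1000  # chips in the middle before the all-in
BET = 1000  # size of Player A's all-in


def bluff_ev(call_frequency, pot=POT, bet=BET):
    """Player A's average result per bluff, compared with giving up.

    When Player B folds, the bluff wins the pot; when Player B calls,
    the bluff loses the bet.
    """
    return (1 - call_frequency) * pot - call_frequency * bet


for call_frequency in (Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(3, 4), Fraction(1)):
    print(call_frequency, bluff_ev(call_frequency))
```

Against a Player B who calls half the time the result is 0 per bluff, and since that figure does not depend on how many bluffs Player A fires, it holds for the nit and the maniac alike. Any other calling frequency tips the balance one way or the other.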

If he notices Player A never bluffs, then he simply never calls when all he is beating is a bluff. Now Player A isn't losing or winning money bluffing (because he's simply not doing it), but he is also no longer winning value bets when he has the best hand because Player B is folding all the time now. By diverging from Nash, he has allowed his opponent to diverge to an exploitative strategy, one that exploits the fact that he never bluffs.

Now let's consider when Player A goes the maniac route. Once he realises this, Player B simply calls every time he beats a bluff. Now every time Player A bluffs, he loses. This is partially compensated for by the fact that his value bets are always getting called now too, but only partially since he's bluffing far more often than he should be.
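The two adjustments can be put into numbers from Player B's side of the table. The sketch below uses the same river spot (1000 in the pot, 1000 all-in) and measures Player B's result against folding every time, which is my choice of baseline for illustration; the function name is hypothetical.

```python
from fractions import Fraction

POT = 1000  # chips in the middle before the all-in
BET = 1000  # size of Player A's all-in


def bluff_catcher_ev(bluff_share, call_frequency, pot=POT, bet=BET):
    """Player B's average result per spot where he only beats a bluff,
    compared with folding every time.

    bluff_share: fraction of Player A's all-ins that are bluffs.
    call_frequency: how often Player B calls in this spot.
    """
    # A call wins the pot plus the bet against a bluff,
    # and loses the call amount against a value bet.
    ev_of_one_call = bluff_share * (pot + bet) - (1 - bluff_share) * bet
    return call_frequency * ev_of_one_call


OPPONENTS = (("never bluffs", Fraction(0)), ("always bluffs", Fraction(1)))
for label, bluff_share in OPPONENTS:
    for call_frequency in (Fraction(0), Fraction(1, 2), Fraction(1)):
        print(label, call_frequency, bluff_catcher_ev(bluff_share, call_frequency))
```

Against the player who never bluffs, calling half the time costs 500 per spot and folding costs nothing. Against the maniac, calling half the time earns 1000 per spot and always calling earns 2000. In both cases the exploitative adjustment beats standing still at half.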

In both these cases, Player A switches from Nash equilibrium to an exploitable strategy, and Player B adjusts by switching from Nash to an exploitative strategy to exploit this. Profit here derives not from sticking rigidly to Nash (which merely guarantees you win or lose exactly the same amount regardless of your opponent's strategy), but from diverging from it to punish an opponent's mistakes. But, and this is quite a big but, when Player B diverges from Nash to exploit Player A, he opens himself up to being exploited. Exploitative strategies are also exploitable ones.

Imagine Player A somehow tricks Player B into thinking he's always bluffing, when actually he never bluffs. Player B responds by calling every time he beats a bluff. Player A is now not losing any money on bluffs, but all his value bets are getting called. The would-be exploiter has become the exploited.

But enough about poker, back to the far more interesting "guess the hand" game. In this case, the guy who guessed correctly deduced that since his opponent was a smart man, he wouldn't be offering this bet if there wasn't some sort of trick involved designed to influence his guess. Rather than backing himself to figure this out on the fly against an opponent who had a lot more experience in this particular spot, he simply decided to stick to a Nash equilibrium that meant no matter what the trick was, it couldn't put him at a disadvantage. In this case, the Nash equilibrium solution to the "guess the hand" game is to simply decide in advance, randomly, left or right, and stick rigidly to that regardless of what your opponent might do. 
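For the sceptics, the coin-flip guarantee can be checked directly. This is a minimal sketch that assumes the game is a plain two-way guess, with the chip either in the hand held back or in the hand thrust forward; the function and variable names are made up for illustration.

```python
from fractions import Fraction


def win_probability(guess_back, chip_back):
    """Chance the guesser is right.

    guess_back: probability the guesser picks the hand held back.
    chip_back: probability the chip is in the hand held back.
    """
    both_back = guess_back * chip_back
    both_front = (1 - guess_back) * (1 - chip_back)
    return both_back + both_front


COIN_FLIP = Fraction(1, 2)

# Whatever the hustler does with the chip, the coin-flipper wins half the time.
for chip_back in (Fraction(0), Fraction(1, 4), Fraction(1)):
    print(chip_back, win_probability(COIN_FLIP, chip_back))

# A mark who gets talked into the front hand nine times out of ten,
# against a hustler who always keeps the chip in the back hand.
print(win_probability(Fraction(1, 10), Fraction(1)))
```

The coin-flipper's line comes out at 1/2 every time, while the talked-into-it mark wins just one time in ten, which matches the hit rate in the story.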

Tags: Dara O'Kearney, strategy