Answer to Puzzle #14: 52 Cards Win a Dollar
The solution to this problem is, in my opinion, the most difficult to understand of all the puzzles. Indeed, I was unable to solve it myself and didn't receive a complete solution until two years after originally posting it. The final solution, in the form of a spreadsheet, was sent to me by Han Zheng.
For this reason I have left on the page the thoughts I had before I received the final solution, as they represent a simpler, easier-to-understand approach. The reasoning may also help you arrive at the full solution yourself, or help you understand it, so I would recommend reading that answer before you dive into the full one. One important thing to note: as the player we can't lose this game, because we can keep gambling until all the cards are drawn, at which point our net position is zero.
From our earlier analysis it is clear we need a dynamic quit rule. A single value (we will quit when we are x dollars ahead) is not sufficient. We must, at each stage, consider what cards remain, and therefore the probability of a positive or negative outcome from drawing again.
What we are about to do is compare, at any point in the game, the cash we are holding against the mathematically expected return of gambling. Simply put, if we are holding more cash than the expected return of gambling, we stop. The expected return is the probability-weighted average of the two outcomes of the gamble, valued at those positions' own expected returns, which in turn are calculated from their possible outcomes and probabilities, and so on.
For the explanation I will ask you first to consider a deck containing only 6 cards, 3 +ve and 3 -ve. (Note I'm no longer calling the cards black and red; it confuses me.)
Columns show the Negative Cards Remaining (N); rows show the Positive Cards Remaining (P):

|P \ N|3|2|1|0|
|---|---|---|---|---|
|3|.5 0 .5|.4 -1 .6|.25 -2 .75|0 -3 1|
|2|.6 1 .4|.5 0 .5|1/3 -1 2/3|0 -2 1|
|1|.75 2 .25|2/3 1 1/3|.5 0 .5|0 -1 1|
|0|1 3 0|1 2 0|1 1 0|0|
Things to note about the table:
- We start the game with 3 of each card remaining, i.e. the top left cell
- The middle number in each cell represents our cash position at that point in the game
- The first number represents the fractional chance of us moving right, i.e. drawing a -ve card
- The last number represents the fractional chance of us moving down, i.e. drawing a +ve card
The probability of moving down is given by P/(N+P), that is to say the positive cards remaining divided by (N+P), the total number of cards remaining. The probability of moving right is N/(N+P). The cash value of a cell is N-P: the number of positive cards drawn is 3-P and the number of negative cards drawn is 3-N, so the cash value is (3-P)-(3-N) = N-P.
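As a quick check, the three numbers in any cell of the 6-card table can be computed directly from the formulas above. This is my own sketch of that calculation, not anything from the original page:

```python
def cell(n, p):
    """Contents of a table cell for the 6-card example,
    given n negative and p positive cards remaining."""
    total = n + p
    p_right = n / total          # chance of drawing a -ve card (moving right)
    p_down = p / total           # chance of drawing a +ve card (moving down)
    cash = (3 - p) - (3 - n)     # positives drawn minus negatives drawn, i.e. n - p
    return p_right, cash, p_down

print(cell(2, 3))   # the cell in row P=3, column N=2: (0.4, -1, 0.6)
```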
There are some cells where we can intuit our course of action without too much trouble:
If we are in cell 3,0 (i.e. 3 -ve cards remaining and 0 +ve remaining; the cell looks like 1 3 0) we have a 100% chance of harming our position, so we "stick".
In cell 0,1 (no negative cards remaining, 1 positive; the cell looks like 0 -1 1) we have a 100% chance of improving our position. Indeed this is true of all the cells in the 0 column.
Now consider the cell 1,1 as an example. From here we have a 50% chance of improving our position to +1 (moving down) and a 50% chance of harming it to -1, so the weighted outcome of a gamble might appear to be even. However, if we won this gamble we would quit on +1 in cell 1,0, but if we lost it we would gamble again from cell 0,1 back up to a cash position of zero.
This means we can take the expected return from cell 0,1 to be 0 and from cell 1,0 to be 1. More formally, for those interested in a more mathematical approach, the expected return for N=0 is zero and for P=0 is N.
To calculate the expected return from 1,1 we then have a 50% chance of 0 plus a 50% chance of 1, i.e. 0.5.
That is to say the expected return with one of each card remaining is 0.5, which is an improvement on our cash position of zero.
We can now calculate the expected return from 2,1 (the cell looks like 2/3 1 1/3): it's 2/3 moving right (harming) plus 1/3 moving down (improving), or 2/3 of cell 1,1 plus 1/3 of cell 2,0, which calculates as (2/3)*0.5 + (1/3)*2 = 1
(note we used the expected return for (1,1) that we calculated earlier.)
Using the same procedure and our expected return from 2,1 (of value 1), we can do the same for cell 3,1 (the cell looks like .75 2 .25)
The expected return is given by .25 * cell(3,0) + .75 * cell(2,1) = .25 * 3 + .75 * 1 = 1.5
So we have that the expected return of gambling from cell 3,1 is 1.5. But in cell 3,1 we are holding 2 dollars, so gambling would not be in our interest: it would harm our expected return.
This is our quit rule!!!!
We quit when the mathematical expected return of gambling is less than the cash we are currently holding.
We calculate the expected return of gambling as being the probability weighted outcome of the winning and losing cells, where the value assigned to these cells is in-turn their expected return.
This approach can be extended to a 52 card deck.
The spreadsheet below does this calculation; the value in each cell is the maximum of either what we are currently holding or the expected return of a gamble. If the former is the maximum then we quit.....
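For those who prefer code to a spreadsheet, the same recursion can be sketched in a few lines of Python. This is my own sketch of the calculation described above, not Han Zheng's spreadsheet: the value of a state is the larger of the cash we hold there and the probability-weighted value of the two states a gamble can lead to.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def value(n, p):
    """Expected return with n negative and p positive cards remaining,
    for a deck that started with equal numbers of each."""
    if n == 0 and p == 0:
        return 0
    cash = n - p                  # positives drawn minus negatives drawn
    total = n + p
    gamble = 0.0
    if n:                         # drawing a -ve card: move to (n - 1, p)
        gamble += n / total * value(n - 1, p)
    if p:                         # drawing a +ve card: move to (n, p - 1)
        gamble += p / total * value(n, p - 1)
    return max(cash, gamble)      # quit rule: keep the better of the two

print(value(1, 1))              # 0.5, as in the walkthrough above
print(value(3, 1))              # 2: we hold 2 dollars, gambling is worth only 1.5
print(round(value(26, 26), 4))  # the full 52-card game
```

Running this reproduces the small-deck values worked out above; for the full 52-card deck it comes out at about $2.62.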
Further Reading... Ben Howe has written a paper on this problem, "Red Black Card Game and Generalized Catalan Numbers", a rigorous statistical approach beyond the scope of this site.
My original Solution...
Firstly I must concede I have nothing approaching a complete solution for this; I've consulted many mathematicians and am still working on one. However, I have been getting e-mails about it, so I thought I'd give you what I have so far...
What do we mean by a stopping rule????
In this section we will consider the stopping rule:
I will quit when I get 'x' dollars ahead, where we will have to determine x.
The first thing to note: We can't lose!!!!! Since there are 26 black cards and 26 red, if we are behind we can just keep playing until the end and we will be even; not up, not down.
My first thought was that a good quit strategy given this fact would be to simply quit when we get ahead. This however does not yield the maximum profit.
That is to say x is not equal to 1. Why? Well, it is true that whenever we are $1 up the cards are necessarily stacked against us: we are more likely to lose the next gamble. But the most we can lose is $1, since we can play to zero, whereas if we continue we could go on to win $3, $4, $5 etc. So while the odds are against us winning, the upside is much greater than the downside.
The tool below will run a Monte Carlo simulation in your browser to help determine the value of x with the highest expected return.
Note: no information is sent to me, the calculation is done entirely locally, on your computer.
Enter a quit value.
Number of times to run simulation.
This shows us the best return is when x or our quit value is 4. However there may be more we can do:
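If you'd like to reproduce the experiment outside the browser, here is a minimal Monte Carlo sketch in Python. The trial count and seeding are my own choices, not part of the page's tool:

```python
import random

def average_return(x, trials=20_000, seed=0):
    """Estimate the expected return of the static rule
    'quit when x dollars ahead'.  A +1 card pays us a dollar,
    a -1 card costs us one; if we never get x ahead we play
    out the whole deck and finish level on zero."""
    rng = random.Random(seed)
    deck = [1] * 26 + [-1] * 26
    total = 0
    for _ in range(trials):
        rng.shuffle(deck)
        cash = 0
        for card in deck:
            cash += card
            if cash == x:      # quit rule triggered
                break
        total += cash          # a full deck always ends on 0
    return total / trials

for x in range(1, 7):
    print(x, round(average_return(x), 3))
```

With enough trials the peak shows up around a quit value of 4, consistent with the result quoted above.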
Further thinking... A clear limitation of our model is that, what I call the quit value, is static.
Suppose we are using a value for x of 4, we have just drawn the 51st card, and we are one dollar ahead.
In this case we know that the next card must lose us our one dollar, so in this instance we should change our quit value to 1. Clearly this is the beginning of a model that allows us to change the value of x depending on how the deck is loaded....
I will continue to play with this idea and would welcome any suggestions for a model along with their expected returns. I'll happily post the results here.
I think Bard had the best stab at this. But it did include the phrase 'the optimal stopping rule can be calculated using a computer program' which is odd given what it is.
If you're curious what Bard made of this puzzle...
The expected value of the game can be calculated as follows:
E = (P(red) * $1) + (P(black) * -$1), where P(red) is the probability of drawing a red card and P(black) is the probability of drawing a black card.
The probability of drawing a red card is equal to the number of red cards remaining in the deck divided by the total number of cards remaining in the deck.
The probability of drawing a black card is equal to the number of black cards remaining in the deck divided by the total number of cards remaining in the deck.
For example, if there are 20 red cards and 32 black cards remaining in the deck, the probability of drawing a red card is 20/52 = 0.385.
The expected value of the game can be calculated for each possible number of cards remaining in the deck. The optimal stopping rule is to stop when the expected value of the game is zero.
The expected payoff following the optimal stopping rule is the amount of money that the player will win on average.
The expected payoff can be calculated by multiplying the expected value of the game by the number of cards remaining in the deck.
For example, if there are 20 red cards and 32 black cards remaining in the deck, the expected payoff is $0.385 * 52 = $19.82.
The optimal stopping rule and the expected payoff following the optimal stopping rule can be calculated using a computer program.
If you're curious what ChatGPT made of this puzzle...
This problem can be solved using the theory of optimal stopping, also known as the secretary problem or the marriage problem.
The optimal stopping rule is to stop drawing cards once the number of black cards drawn is equal to or greater than the number of red cards drawn. This is known as the "majority rule".
To see why this is the optimal rule, suppose you stop after drawing k cards and you have drawn r red cards and b black cards, with r+b=k. If you stop, your payoff is r-b. If you continue, your expected payoff is:
(r+1)/(k+1) - (b+1)/(k+1) = (r-b)/(k+1) + 1/(k+1)
Note that the first term is just your current payoff if you stop, and the second term is the expected value of drawing one more card.
If you continue, the expected value of drawing one more card is always positive (since there are more red cards than black cards in the deck), but it decreases as k increases. Therefore, there exists some k at which the expected payoff of continuing is less than your current payoff, and you should stop.
It turns out that the majority rule achieves the best expected payoff. To see why, note that if you stop before the majority point, your expected payoff is negative (since there are more black cards than red cards in the remaining deck), and if you stop after the majority point, your expected payoff is positive (since there are more red cards than black cards in the remaining deck). Therefore, stopping at the majority point achieves the best expected payoff.
The expected payoff following the optimal rule is (26/51) - (25/51) = 1/51.
© Nigel Coldwell 2004 – The questions on this site may be reproduced without further permission, I do not claim copyright over them. The answers are mine and may not be reproduced without my expressed prior consent. Please inquire using the link at the top of the page.