E-mail Me, feed back appreciated.
Home

Answer to question #14

 

You have 52 playing cards (26 red, 26 black). You draw cards one by one. A red card pays you a dollar. A black one fines you a dollar. You can stop any time you want. Cards are not returned to the deck after being drawn. What is the optimal stopping rule in terms of maximizing expected payoff?

Also, what is the expected payoff following this optimal rule?

The solution to this problem is, in my opinion the most difficult to understand of all the puzzles. Indeed I was unable to solve it and didn't receive a complete solution until two years after originally posting it. The final solution, in the form of the spreadsheet was sent to me by Han Zheng.

For this reason I have left on the page the thoughts i had before I had the final solution as they represent an easier to understand and more simplistic approach. Also the reasoning may help you arrive at the final solution by yourself or help you understand it. I would recommend reading that answer before you dive into the full answer. But an important thing to note are that as the player we can't lose this game as we can gamble till all the cards are drawn and our net position is zero.

From our earlier analysis it is clear we need a dynamic quit rule. A singal value is not sufficent. We must, at each stage consider what cards are remaining, and therefor the probability of a positive or negative outcome from drawing again.

For the explanation i will ask you first to consider a deck containing only 6 cards, 3 +ve & 3 -ve (note i'm no longer calling the cards black and red, it confuses me.)





Negative Cards Remaining (N)
3
2
1
0
+ve
Cards
remaining
(P)
3
.5 0 .5 .4 -1 .6
.25 -2 .75 0 -3 1
2
.6 1 .4
.5 0 .5
1/3 -1 2/3
0 -2 1
1
.75 2 .25
2/3 1 1/3
.5 0 .5
0 -1 1
0
1 3 0
1 2 0 1 1 0
0


Things to note about the table:

  • We start the game with 3 of each card remaining ie the top left
  • The number in black represents our cash position at that point in the game
  • the red number represents the fractional chance of us moving right ie drawing a -ve card
  • the number in green represents the fractional chance of us moving down ie drawing a +ve card
The probability of moving down is given by P/(N+P), where N+P gives the total number of remaining cards, the probability of moving right is N/(N+P) and the cash value of a cell is N-P.

There are some cells we can intuitively speculate as to our course of action without too much trouble:

If we are in cell 3,0 (ie 3 -ve remaining and 0 +ve remaining) we have 100% chance of negatively affecting our position so we "stick"

In cell 0,1 we have 100% chance of improving our position, indeed this is true of all the cells in the 0 column

Now consider the cell 1,1 (as an example,) from here we have a 50% chance of improving our position to +1 and a 50% chance of harming it to -1. So the weighted outcome of a gamble might appear to be even. However, if we were to win from this gamble we would quit on +1 in cell 1,0, but if we were to lose this gamble we would "gamble" again from cell 0,1.

This means we can consider the expected return from cell 0,1 to be 0 and from 1,0 to be 1

To calculate the expected return from 1,1 we then have 50% of 0 and 50% of 1 ie 0.5

We can now calculate the expected return from 2,1 it's 2/3 moving lright plus 1/3 moving down or 2/3 of cell 1,1 plus 1/3 of cell 2,0 which calculates as (2/3)*.5 + (1/3)*2 = 1

(note we used the expected return for (1,1) that we calculated earlier.)

using the same procedure and our expected return from 2,1 we can do the same for cell 3,1

expected return is given by .25 * cell(3,0) + .75  * cell(2,1) = .25 * 3 + .75 * 1 = 1.5

So we have that the expected return from cell 3,1 is 1.5. But in cell 3,1 we are holding 2 dollars in our hand, so gambling would not be in our interest, as it would harm our expected return.

This is our quit rule!!!!

We quit when the mathematical expected return of gambling is less than the cash we are currently holding.

We calculate the expected return of gambling as being the probability weighted outcome of the winning and losing cells, where the value assigned to these cells is inturn their expected return.

This approach can be extented to a 52 card deck.

The spread sheet below does this calculation, the value in each cell is the maximum of either what we are currently holding or the expected return of a gamble. If the former is the maximum then we quit.....





My original Solution.....

Firstly I must concede I have nothing approaching a complete solution for this, I've consulted many Mathematicians and am still working on a complete solution. However, i have been getting e-mails about this, so i thought I'd give you what i have so far....

What do we mean by a stopping rule????

A Stopping rule could be anything, it could be 'I'll stop after just one game', 'I'll wont play the game...' etc...

In this section we will consider the stopping rule:

I will quit when I get 'x' Dollars ahead.

Where we will have to determine x.

The first thing to note:

We can't lose!!!!! Since there are 26 black cards and 26 red if we are behind we can just keep playing until the end and we will be even, not up, not down.

My first thought was that a good quit stratergy given this fact would be to simply quit when we get ahead. This however does not yield the maximum profit.

That is to say x is not equal to 1.

To determine that value of  x i have written a simple computer program. It will allow you to enter various values for x then use this value  to play the game 1,000,000 times, (takes < a minute on a newish computer.) You will be able to see what your profit is as the game progresses....

Screen Shot
(Click to download  - 33k)

The source code for this program is available here, it's written in QBasic.

So we now have a value for x, and and expected return. Incase you didn't do it yourself, you can highlight the line below for the answer. which for a large part is the solution to this problem.
x = 4, expected return ~= 1.0917

However there may be more we can do:

Further thinking...

A clear limitation of our model is that, what i call the quit value, is static.

consider this:

We are using a value of x of 4, we have just drawn the 51st card, and we are one dollar ahead.

In this case we know  that the next card  will be such that we will lose our one dollar. so in this instance we should change our  quit value to 1. Clearly this is the begining  of a model that allows us to change the value of depending on how the deck is loaded....

I will continue to play with this idea and would welcome any suggestions  for a model along with their expected returns. I'll happily post the results here.

BACK

 












































This web site is written by nigel coldwell, you can see my main site at free nokia ringtones if you want to contact me, or have a puzzle for me then feel free to E - Mail me.