Probabilities in the game Camel Up (or Camel Cup)

July 2015

Without Stacking
Stacking
- Approach 1: Simulation
- Results
Leg Simulations
Code, etc.

Lately we've been playing a lot of the game Camel Up (or Camel Cup?), and it leads to some interesting math problems.

First let's start with the basics. There are five camels, and each one starts on a random square from 1-3. Every leg, each camel moves either 1, 2, or 3 squares, and the game is over when a camel reaches square 16. (there are stacking rules, but let's ignore those for now)

So the first question is: How many legs does a game take on average?

For one camel this doesn't seem to hard - on average they start on square 2, and on average they move 2 squares per turn. But with five camels, the game is over as soon as the fastest camel gets to square 16 - how can we count this?

Without Stacking

Approach 1: Simulation

This seems like a tricky problem, so my first approach was just to write up some Python to simulate 20000 races and try it out. (for all of these approaches I just let all the camels start at square -1, then subtract one from the final number of legs) Here's what simulateRace() looks like in Python (see the bottom of the post for the actual code):

def simulateRace(numCamels, numSquares):
    totalSteps = 0
    ITERATIONS = 20000
    for i in range(ITERATIONS):
        camelPositions = [-1] * numCamels
        while (max(camelPositions) < numSquares):
            for j in range(numCamels):
                camelPositions[j] += random.randint(MIN_CAMELSTEP, MAX_CAMELSTEP)
            totalSteps += 1
    return float(totalSteps) / ITERATIONS

This is pretty straightforward. There are a few problems, though:

How many iterations are enough? Good question! I don't realy know, although I guess you could try a number, run it a few times and see if you get consistent results. For these purposes 20000 does seem to be enough.
It's kind of slow. If \(c\) is the number of camels and \(s\) is the number of steps they have to go, then this takes \(O(cs)\) time and \(O(c)\) space. (and there's a pretty large constant factor for the time!
It's only an approximation. Which is fine for all practical purposes, but I like exact results!

Approach 2: Dynamic programming (by camel state)

My next thought was to use a dynamic programming approach where the subproblems are the positions of all the camels. This was the best I could think of - the positions of all the camels matters, because the state where one camel is at 16 and the rest are at 8 is very different than if they're all at 16 (for example). Here's what the code looks like:

cache = {}

def getAdder(length):
    return itertools.product(range(MIN_CAMELSTEP, MAX_CAMELSTEP+1), repeat=length)

def cycleThroughStates(camelState, adder):
    l = len(camelState)
    for a in adder:
        newState = list(camelState)
        for i in range(l):
            newState[i] += a[i]
        yield newState

# takes in a camel state (list of camel positions) and returns
# the expected value of how many moves are left
def calculateRaceDynamicImpl(camelState, adder):
    if (max(camelState) >= numSquares):
        return 0
    camelState.sort()
    if (tuple(camelState) in cache):
        return cache[tuple(camelState)]
    bigTotal = 0
    count = 0
    for newState in cycleThroughStates(camelState, adder):
        bigTotal += calculateRaceDynamicImpl(newState, adder, numSquares)
        count += 1
    answer = bigTotal/float(count) + 1.0
    cache[tuple(camelState)] = answer
    return answer

def calculateRaceDynamic(numCamels, numSquares):
    adder = list(getAdder(numCamels))
    cache = {}
    return calculateRaceDynamicImpl([-1] * numCamels, adder, numSquares)

This also works, but is time and memory-intensive - since there are \(s^c\) states, it takes \(O(s^c)\) space and time! (not all of the states are reachable, but I think enough of them are that it's still \(O(s^c)\).)

So this was disappointing. I thought there must be a better way, maybe involving standard deviation or something. But after brunch and a conversation with David, inspiration struck!

Approach 3: Dynamic programming (by die roll)

This uses dynamic programming to calculate the probability that one camel will finish in less than a given number of legs. It does this by using a state of [number of squares left, die rolls left], which is pretty straightforward. Then it uses this result to figure out the same thing for \(c\) camels with some math. Here's how it looks:

def probFinishLessThanEqualDieRollsIter(curState, cached):
    if curState in cached:
        return cached[curState]
    numSquares = curState[0]
    dieRollsLeft = curState[1]
    if (numSquares <= 0):
        return 1.0
    if (dieRollsLeft <= 0):
        return 0.0
    totalProb = 0.0
    for step in range(MIN_CAMELSTEP, MAX_CAMELSTEP+1):
        totalProb += probFinishLessThanEqualDieRollsIter((numSquares - step, dieRollsLeft - 1), cached)
    totalProb *= 1.0 / (MAX_CAMELSTEP + 1 - MIN_CAMELSTEP)
    cached[curState] = totalProb
    return totalProb

# Calculates the probability that one camel will finish in
# less than or equal to this many die rolls.
def probFinishLessThanEqualDieRolls(numSquares, dieRolls, cached):
    state = (numSquares, dieRolls)
    return probFinishLessThanEqualDieRollsIter(state, cached)

def calculateRaceByDieRolls(numCamels, numSquares, returnDistribution=False):
    # the camels effectively start at square -1, so they actually have to go one extra square
    numSquares = numSquares + 1
    # figure out the range of die rolls that are possible
    minDieRolls = math.ceil(numSquares/MAX_CAMELSTEP)
    maxDieRolls = math.ceil(numSquares/MIN_CAMELSTEP)
    expected = 0.0
    lastProb = 0.0
    cached = {}
    listOfProbs = []
    for d in range(minDieRolls, maxDieRolls+1):
        prob = probFinishLessThanEqualDieRolls(numSquares, d, cached)
        probLEQThisWithAnyCamel = 1.0 - math.pow(1.0-prob, numCamels)
        assert probLEQThisWithAnyCamel <= 1.0
        probExactlyThisManyWithAnyCamel = probLEQThisWithAnyCamel - lastProb
        lastProb = probLEQThisWithAnyCamel
        expected += d * probExactlyThisManyWithAnyCamel
    return expected

This works well. There are \(O(s^2)\) states so it takes \(O(s^2)\) space and \(O(s^2)\) time, which is pretty good for an exact solution!

Here's a summary of the performance of the different solutions - remember, \(s\) is the number of squares and \(c\) is the number of camels:

	s=16,c=5	s=16,c=100	s=50,c=5
Approach 1	1.5 sec	28.2 sec	6.1 sec
Approach 2	4.6 sec	(runs out of memory)	598.9 secs
Approach 3	0.0 sec	0.0 sec	0.0 sec

Behold the power of Approach 3! I'm actually a little surprised it's so much faster than Approach 1 - I guess there really is a big constant factor hidden in that \(O()\) notation!

Results

With s=16,c=5 it takes an average of 6.53 legs.
Here's a graph of the probability of a race with s=16,c=5 ending after the given number of legs:

And here's a graph of the expected number of legs to finish a s=16 race versus number of camels:

Stacking

As I mentioned above, one of the interesting aspects of the game that we've ignored until now is stacking. When there's more than one camel on a square, they stack, and when a camel moves, it brings all the camels above it along with it. When a camel lands on a square with an existing camel, it lands on top of that camel.

The stacking rules make the game much more exciting and unpredictable, so let's look at what they do to the average length of a game!

Approach 1: Simulation

Yeah, I have no idea how to do this other than simulation. (if anyone has any mathy ideas, let me know!) Here's what the code looks like:

def simulateRaceWithStacking(numCamels, numSquares):
    totalSteps = 0
    ITERATIONS = 20000
    for i in range(ITERATIONS):
        # extra slack for the last leg
        boardPosition = [[] for i in range(numSquares + MAX_CAMELSTEP * numCamels)]
        camelPositions = [-1] * numCamels
        # first initialize, they don't move with stacking on the first turn
        totalSteps += 1
        for i in range(numCamels):
            camelPositions[i] += random.randint(MIN_CAMELSTEP, MAX_CAMELSTEP)
            boardPosition[camelPositions[i]].append(i)
        while (max(camelPositions) < numSquares):
            camelOrder = list(range(numCamels))
            random.shuffle(camelOrder)
            # first camel in list is on the bottom
            for camelNum in camelOrder:
                curPosition = camelPositions[camelNum]
                newPosition = curPosition + random.randint(MIN_CAMELSTEP, MAX_CAMELSTEP)
                positionInStack = boardPosition[curPosition].index(camelNum)
                camelsToMove = boardPosition[curPosition][positionInStack:]
                boardPosition[curPosition] = boardPosition[curPosition][:positionInStack]
                assert len(camelsToMove) >= 1
                for c in camelsToMove:
                    camelPositions[c] = newPosition
                    boardPosition[newPosition].append(c)
            totalSteps += 1
    return float(totalSteps) / ITERATIONS

Pretty straightforward, although we keep track of both the board state and which square each camel is in (for efficiency).

Results

With s=16,c=5 it takes an average of 4.73 legs, which is almost 2 whole legs less than without stacking! Here's the distribution graph - notice that it's possible for the game to end in 2 legs!

And here's a graph of the expected number of legs to finish a s=16 race versus number of camels:

Leg Simulations

Now that I had the ability to simulate a leg, I thought I'd set up some common scenarios and figure out the probabilities for the first and second place camels.

The most likely winner of the leg is bolded.

Scenario: (this means that camel 1 is on top of camel 0)

1
0

	Camel 0	Camel 1
First place	33%	67%
Second place	67%	33%

Scenario: (this means that camel 1 is one square ahead of camel 0)

	Camel 0	Camel 1
First place	39%	61%
Second place	61%	39%

Scenario: (this means that camel 1 is on top of camel 0, and camel 2 is one square ahead)

1
0

	Camel 0	Camel 1	Camel 2
First place	21%	49%	30%
Second place	32%	25%	43%

Scenario:

1
0

	Camel 0	Camel 1	Camel 2
First place	13%	39%	48%
Second place	30%	30%	40%

Scenario:

2
1
0

	Camel 0	Camel 1	Camel 2	Camel 3
First place	13%	28%	43%	16%
Second place	18%	30%	24%	28%

Scenario:

2
1
0

	Camel 0	Camel 1	Camel 2	Camel 3
First place	10%	24%	38%	28%
Second place	15%	26%	24%	35%

Scenario:

1
0

	Camel 0	Camel 1	Camel 2	Camel 3
First place	13%	37%	18%	32%
Second place	19%	22%	27%	32%

Scenario:

1
0

3
2

	Camel 0	Camel 1	Camel 2	Camel 3	Camel 4
First place	12%	29%	13%	30%	16%
Second place	13%	19%	19%	24%	25%

Scenario:

2
1
0

	Camel 0	Camel 1	Camel 2	Camel 3	Camel 4
First place	12%	24%	35%	11%	18%
Second place	12%	23%	21%	18%	26%

Interesting stuff! As I suspected, being on top of a stack is very beneficial for a camel. Even being the second camel in a three-high stack is pretty good! But being at the bottom of a stack hurts a lot.

Code, etc.

Here's the Python 3.x file I used to do all these calculations. Just uncomment what you want in the main method:

camelcup.py

Fancy math equations by MathJax. Fancy graphs by Flot. Camel Up (or Camel Cup?) designed by Steffen Bogen.