*This post follows on (loosely) from a previous discussion on **maths and computing** and asks what it really means to ‘prove’ something in each discipline.*

An apocryphal story has an Oxbridge maths don lecturing to a group of undergraduates … After some time completely filling a huge blackboard with heavy calculus – with accompanying commentary – he turns to the class and casually notes, “So then, it’s *clear* that …” (the exact claim isn’t important). As he turns to resume his chalk-work, a particularly bold student enquires, “Excuse me, Professor; but is that really *‘clear’*?” The don steps back and surveys his work, studying the entire board from top-left to bottom-right, with numerous head and eye movements to-and-fro – even some pointing – to cross-check various parts with each other. After a full five minutes of silent contemplation, he turns back to the students, smiles, announces, *“Yes!”*, and carries on as before.

So who’s defining ‘clear’ here?

A less striking, but better-documented version of this, occurred when Andrew Wiles was trying to finish his proof of Fermat’s Last Theorem (FLT). While *he* considered the case well-made, there was a particular flaw in the version he presented for publication, which left the paper’s referees unconvinced. Although certain that the underlying principle was sound, he found he couldn’t express it in a way that satisfied external scrutiny. It *was* clear to him … then it *wasn’t* when he understood why it wasn’t clear to everyone else. Not until he changed his model of the essential process was he able to produce a version that convinced everyone. ‘*Clear*’ isn’t always easy.

Of course, it might seem obvious, in the FLT case, that this is precisely the way peer review in research *should* work, but that *clearly* (see what we did there?) isn’t the metric being applied by the Oxbridge don in the previous example. However, Wiles’s proof of FLT is over 100 pages long, so it’s not a great example for us to work with here. So let’s try something simpler: an exercise …

**Plane Seats**

Suppose a number of passengers are queuing to board a plane, each with an allocated seat (and no spare seats). Suppose further that the first passenger has lost their boarding pass so is allowed by the cabin crew (who are clearly not sticklers for security) to sit in a random seat. Obviously, this may only be a short-term solution since the problem will resurface when the passenger allocated to *that* seat boards. So each subsequent passenger then sits in their allocated seat if it’s free or takes a random seat if it isn’t, and so on … Now, here’s the question … *What’s the probability that the very last passenger sits in their allocated (correct) seat?*

We’re going to ruin the suspense in this discussion by starting with the answer because there are bigger points to be made. You can be forgiven for thinking (and most *do*) that the probability depends on the number of passengers (seats) and might well get smaller as the number increases. However, this turns out not to be the case: in fact, the probability for *any* number of passengers/seats above one is always *½* (*0.5*). Why? And what would it take to *convince* us of that?

Well, to save space in this post, we could take this, this or this as starting points. There are various arguments there, of varying degrees of clarity and persuasion. How many of them might constitute a formal (mathematical) proof? Probably none of them, really, although the ones mentioning ‘reduction’, ‘recursion’ and ‘iteration’ might be on the right path. There’s yet another way we might attack the problem, though, and it’s a *very* informal one …

Start with just two passengers. (It’s trivial with just one.) Either the first passenger sits in their own seat or they don’t, leaving the second (last) passenger with a one-in-two chance of sitting in *their* own seat. Easy. What about three passengers? Well, we might need some notation …

Suppose we write *n n-1 … 3 2 1 > 1 2 3 … n-1 n* to represent the passengers (on the left of the *>*) queuing to get on the plane (with *1* first, then *2*, etc.) and the *allocated* seats (on the right of the *>*); the *>* is the plane door. (If you think about it, the order doesn’t really matter.) We’ll put square brackets around a seat once it’s been taken. Then (for example) *n n-1 … 3 > 1 [2] 3 … [n-1] n* shows the situation after some passengers (in this case two of them) have boarded and taken certain seats (in this case *2* and *n-1*). The two-passenger case is then modelled as follows …

We start with *2 1 > 1 2*. After the first passenger boards, we have either *2 > [1] 2* or *2 > 1 [2]*. So clearly there is a one-in-two chance of passenger *2* sitting in the correct (allocated) seat *2*. We should be able to manage three now …

We start with *3 2 1 > 1 2 3*. After the first passenger boards, we have *3 2 > [1] 2 3* or *3 2 > 1 [2] 3* or *3 2 > 1 2 [3]*. When the second passenger boards, the first option leads to *3 > [1] [2] 3* (the second passenger *has* to sit in their allocated seat if it’s free), the second leads to *3 > [1] [2] 3* or *3 > 1 [2] [3]*, and the last to *3 > 1 [2] [3]*. Two of these four outcomes leave seat *3* free for the third and final passenger. (The four outcomes aren’t all equally likely – the middle branch splits its third of the probability in two – but weighting them properly gives *1/3 + 1/6 = 1/2*, so the answer is once more *½* (*0.5*).)

And we could continue this analysis with more passengers … but we won’t. Instead, we’ll try a bit of common-sense, but entirely informal, reasoning …

What do we know? Well, we know from simple observation that the probability isn’t *1* (certain) or *0* (impossible), so it’s somewhere in between. We know that for two and three passengers it’s *0.5*, and further long-hand working would show the same is true for four, five, etc. That makes it pretty unlikely that the probability’s a function of *n*, the number of passengers. It’s not *1/n*, for example – although that might have been a decent intuitive guess at the outset. So, if it’s *0.5* for the first few values, it’s very likely indeed that it stays that way for any others. (There are mathematical sequences that deviate from an apparently obvious pattern after the first few terms – the ‘circle regions’ sequence *1, 2, 4, 8, 16, 31, …* is a famous example – but they’re warnings, not the norm.) Common sense definitely suggests that the answer’s *0.5*.

However, to *prove* this formally, we’d probably have to use a proper *inductive approach* (rather than just hint at one, which is what we’ve just done). We start by assuming (as a sort of ‘experiment’) that the probability *is* *0.5* for some arbitrary number of passengers, *n*, then prove by direct reasoning (formal logic) that it’s also *0.5* for *n+1* passengers. This means that, as it’s true for *2*, it must be true for *3*, which makes it true for *4*, then *5*, and so on … This isn’t meant to be an overtly mathematical blog, so that exercise is left to the interested reader (and there’s some help in the links above).
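For anyone who does want the gist, the inductive step might run something like this (strictly, a *strong* induction – we assume the result for all smaller cases – and it leans on one observation: once the first passenger takes seat *k*, passengers *2* to *k−1* sit normally, and passenger *k* restarts the identical problem with *n−k+1* people):

```latex
% P(n): probability that the last of n passengers gets their own seat.
% The first passenger takes seat k with probability 1/n for each k:
%   k = 1     -> everyone else, including the last passenger, sits correctly;
%   k = n     -> the last passenger has certainly lost their seat;
%   1 < k < n -> the identical problem restarts with n-k+1 passengers.
\begin{align*}
P(n) &= \frac{1}{n}\cdot 1 \;+\; \frac{1}{n}\cdot 0
        \;+\; \sum_{k=2}^{n-1}\frac{1}{n}\,P(n-k+1) \\
     &= \frac{1}{n} + \frac{n-2}{n}\cdot\frac{1}{2}
        \qquad\text{(inductive hypothesis: } P(m)=\tfrac{1}{2}
        \text{ for } 2 \le m < n\text{)} \\
     &= \frac{1}{2}.
\end{align*}
```

With the base case *P(2) = ½* checked by hand above, the induction goes through.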

As computer scientists, however, there’s another way … We could try to write a *program* to work it all out … better still, to *show what’s happening*. So that’s exactly what we’ll do. The following C# code will do the trick …
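A minimal sketch along these lines might look as follows – the names ‘s’ and ‘rightSeat’ match the discussion further down, but the overall shape (the prompts, the helper method) is just one way of doing it:

```csharp
using System;

class PlaneSeats
{
    // One boarding of n passengers; returns true if the LAST passenger
    // ends up in their own (allocated) seat.
    public static bool LastSitsRight(int n, Random r, bool verbose)
    {
        bool[] taken = new bool[n + 1];   // taken[k]: seat k is occupied
        for (int p = 1; p < n; p++)
        {
            int seat;
            if (p > 1 && !taken[p])
                seat = p;                 // own seat free: take it
            else                          // passenger 1, or displaced: random free seat
                do seat = r.Next(1, n + 1); while (taken[seat]);
            taken[seat] = true;
            if (verbose) Console.WriteLine($"  Passenger {p} sits in seat {seat}");
        }
        return !taken[n];                 // is the last passenger's seat still free?
    }

    static void Main()
    {
        Console.Write("Passengers: ");
        int n = int.Parse(Console.ReadLine() ?? "6");
        Console.Write("Simulations: ");
        int s = int.Parse(Console.ReadLine() ?? "20");

        var r = new Random();
        int rightSeat = 0;                // times the last passenger sits correctly
        for (int sim = 1; sim <= s; sim++)
        {
            Console.WriteLine($"Simulation {sim}:");
            if (LastSitsRight(n, r, verbose: true)) rightSeat++;
        }
        Console.WriteLine($"rightSeat = {rightSeat} out of s = {s}");
        Console.WriteLine($"wrongSeat = {s - rightSeat}   (calculated, not counted!)");
    }
}
```

Compile it, run it, enter the two numbers and watch each simulation print its seating as it goes.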

As this is a statistical (probabilistic) exercise, we need a few runs to let the chances even out a bit – hence the option for the number of simulations as well as the number of passengers. Start with small numbers for both to see the scenarios work themselves through.

If you look at a few runs carefully, you’ll probably see the essential pattern, which leads to possibly the best of the intuitive proofs, which runs like this …

*There are really only TWO important seats … the seat the FIRST passenger SHOULD have sat in and the seat the LAST passenger SHOULD sit in. Consequently, there are only TWO key events … As soon as ANYONE sits in the FIRST passenger’s seat, the last passenger WILL eventually get to sit in their seat (we have to remember that passengers in between always sit in their allocated seats if they can); but as soon as ANYONE sits in the LAST passenger’s seat, the last passenger CAN’T sit in their seat. It’s just a question of which of these happens first. Assuming everything’s completely random, these two are equally likely – both with probability 0.5.*
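If you want to see that argument in action, here’s a variant sketch (same caveats as before – the shape is our own) that ignores everything except the two key seats and simply records which one is claimed first:

```csharp
using System;

class TwoKeySeats
{
    // Board n passengers, but watch only the TWO key seats:
    // returns true if seat 1 (the first passenger's own seat) is claimed
    // before seat n (the last passenger's seat).
    public static bool SeatOneTakenFirst(int n, Random r)
    {
        bool[] taken = new bool[n + 1];
        for (int p = 1; p < n; p++)
        {
            int seat;
            if (p > 1 && !taken[p])
                seat = p;                 // in-between passengers sit correctly if they can
            else
                do seat = r.Next(1, n + 1); while (taken[seat]);
            taken[seat] = true;
            if (seat == 1) return true;   // last passenger is now safe
            if (seat == n) return false;  // last passenger has now lost their seat
        }
        return !taken[n];                 // not reached for n >= 2: a key seat is always hit
    }

    static void Main()
    {
        int n = 10, s = 10000;
        var r = new Random();
        int seatOneFirst = 0;
        for (int i = 0; i < s; i++)
            if (SeatOneTakenFirst(n, r)) seatOneFirst++;
        Console.WriteLine($"Seat 1 claimed first in {seatOneFirst} of {s} runs");
    }
}
```

Over many runs, seat *1* is claimed first about half the time – exactly the last passenger’s success rate.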

Once you’re happy with the process, you can try larger values but you might want to remove, or comment out, the two output lines inside the centre loop and just concentrate on the overall results. Otherwise this could slow it down unnecessarily.

And you’ll get the idea … As the number of simulations increases, the law of large numbers means that the final passenger sits in their allocated seat about half the time. So that seems pretty conclusive. Or is it?

Well, we’re forced to return to the question of the validity of a computer-generated proof. Is this *really* an appropriate way to formally *prove* something? Only now, there are a few added issues – the first one’s simple … *Are you confident you can follow what the program’s doing?* Don’t confuse operation with output. The output *suggests* the program’s simulating sensibly but that could be just smoke-and-mirrors. Does the program *itself* make sense? Unfortunately, this can only really be answered by a programmer – so we’ll leave it because it’s not as significant as the next question … This is, assuming you *can* see what the program’s trying to do, is it *really* doing it?

What we’re really asking here is are there any *assumptions* in the coding? We might accept that the code was written by someone who understands the *question* but is there any chance that they’re expecting a particular *answer*? Is there any bias in the way the code’s been written?

A simple example of this – maybe too simple – is in the last few lines of output. ‘s’ is the number of simulations. ‘rightSeat’ is the number of times the last passenger sits in the correct seat: the program counts this as it goes. So where’s ‘wrongSeat’, which presumably should be counted in the same way? There *is* no ‘wrongSeat’. In fact, it’s calculated as ‘s – rightSeat’. Is this a justified assumption?

Well, it may seem obvious that it is, but just consider carefully for a moment. It’s justified *if* we’ve understood the problem correctly *and* we’ve coded it correctly. If we haven’t, or if we’ve introduced our own expectations into the program, then there are a number of reasons why we might be better off counting ‘wrongSeat’ independently. Two obvious ones are:

- There might be a third (or more) outcome to the sequence of passengers, one we haven’t considered. (In which case, we would have introduced our misunderstanding of the problem into our coding.)
- We might not be adding ‘rightSeat’ up properly. (We’ve coded incorrectly.) Having an independent count for ‘wrongSeat’ would have a good chance of exposing the error if ‘rightSeat + wrongSeat’ doesn’t equal ‘s’.
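To make that concrete, a more defensive version of the counting might look something like this (again, only a sketch – the tuple-returning helper is our own invention):

```csharp
using System;

class CountBoth
{
    // Run s boardings of n passengers, counting BOTH outcomes independently.
    public static (int right, int wrong) Run(int n, int s, Random r)
    {
        int rightSeat = 0, wrongSeat = 0;
        for (int sim = 0; sim < s; sim++)
        {
            bool[] taken = new bool[n + 1];
            for (int p = 1; p < n; p++)
            {
                int seat;
                if (p > 1 && !taken[p]) seat = p;   // own seat if it's free
                else do seat = r.Next(1, n + 1); while (taken[seat]);
                taken[seat] = true;
            }
            if (!taken[n]) rightSeat++; else wrongSeat++;   // two separate counts
        }
        return (rightSeat, wrongSeat);
    }

    static void Main()
    {
        int s = 1000;
        var (rightSeat, wrongSeat) = Run(8, s, new Random());
        // The cross-check: if this ever fails, our model (or our coding) is wrong
        if (rightSeat + wrongSeat != s)
            Console.WriteLine("MISMATCH: some outcome went uncounted!");
        Console.WriteLine($"rightSeat = {rightSeat}, wrongSeat = {wrongSeat}, s = {s}");
    }
}
```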

This is a trivial example but there may be more serious ones in the above code. There are nested loops and arrays galore. Can we really be sure that the code is both *accurate* and *unbiased*? *How many opportunities for going astray can you list?*

**Monty Hall**

We’ll switch to a different example to continue this discussion. Most will have heard of the Monty Hall problem by now. It describes a game-show in which a host randomly hides a prize (conventionally, a car) behind one of three closed doors. Either nothing or something rubbish (often goats – no offence intended to goats) is placed behind the other two.

A contestant, hoping to win the car, starts by choosing one of the three doors. This door remains closed for now but the host then opens one of the other doors, always exposing a goat. The contestant is then given a simple choice … Do they want to stick with their initial choice of door or choose the remaining (closed) door? Whichever they now go with is their final choice and the door is opened.

It’s generally known today, although possibly not quite so widely believed, that the contestant is actually better off changing their choice of closed door. Again, this may not seem obvious – and indeed it has confused some eminent thinkers in the past – so how would we go about proving this?

Well, again, simple reasoning (short of a mathematical proof) works pretty well up to a point. We have to assume that neither the contestant nor the host is entirely stupid in their choices so it’s not that hard to make general observations …

In fact, in the critical sense, there are only two possibilities for the first choice: the contestant is either right (with probability 1/3) or wrong (with probability 2/3) …

If they’re right on the first guess, then the host opens either of the wrong doors. If they ‘stick’, they stay right but if they ‘switch’ (to the other wrong door), they become wrong.

If they’re wrong on the first guess, the host opens the other wrong door. If they stick to the wrong door, they stay wrong but if they switch to the remaining door, they become right.

So sticking with their first choice keeps the original probabilities of right: 1/3; wrong: 2/3, whereas switching reverses the probabilities to right: 2/3; wrong: 1/3. Clearly switching is the best strategy. (What you’re really gambling on by switching is the better probability that you were wrong in the first place.)

Convinced? Well, many haven’t been … and still aren’t. So let’s try the other approach, shall we? Here’s the C# program …
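As with the earlier program, what follows is a sketch of one way it might be written, reconstructed around the fragments quoted below (‘stickWin’, ‘open = 6 – car – guess’, ‘open = (car + r.Next(1, 3)) % 3’). We’ve numbered the doors 1 to 3, under which the first quoted line behaves as advertised, and picked the host’s door at random in the other case – the quoted ‘% 3’ line would want doors numbered 0 to 2, so make of that what you will …

```csharp
using System;

class MontyHall
{
    // One game, doors numbered 1, 2 and 3 (an assumption on our part --
    // the quoted '(car + r.Next(1, 3)) % 3' would instead need doors 0-2).
    // Returns true if STICKING with the first guess wins the car.
    public static bool StickWins(Random r)
    {
        int car = r.Next(1, 4);                  // host hides the car
        int guess = r.Next(1, 4);                // contestant's first choice
        int open;
        if (guess != car)
            open = 6 - car - guess;              // the one door hiding neither car nor guess
        else                                     // guessed right: either goat door at random
            do open = r.Next(1, 4); while (open == car);
        // Note: 'open' is never used again -- opening a goat door can't change
        // who holds the car, so stick wins exactly when the first guess was right
        return guess == car;
    }

    static void Main()
    {
        int s = 1000;                            // number of simulated games
        var r = new Random();
        int stickWin = 0;
        for (int game = 0; game < s; game++)
            if (StickWins(r)) stickWin++;
        Console.WriteLine($"stick wins:  {stickWin} / {s}");
        Console.WriteLine($"switch wins: {s - stickWin} / {s}   (calculated, not counted!)");
    }
}
```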

Try it … See what it does … That seems to do the trick, doesn’t it? So, the point is well made, yes?

Well, possibly not. This time there are *definitely* short-cuts in the way this code is written that assume we know how the maths actually works. As this is what we’re trying to prove, this is the coding equivalent of circular reasoning. How many can you see this time? One, at least, is obvious: the calculation of right and wrong totals is pretty much the same as before only, this time, it’s considerably *more* of an assumption. Can you see why?

And there are a number of other areas where the code is either unclear, unproven or makes assumptions in the way it’s written, based on some believed properties of the problem. For example:

- The line ‘open = 6 – car – guess’ is supposed to open the door that’s not the car *or* the contestant’s wrong guess. *Does it?*
- The line ‘open = (car + r.Next(1, 3)) % 3’ is supposed to randomly open a wrong door when the contestant has guessed right. *Really?*
- We have a variable ‘stickWin’ but where are ‘stickLose’, ‘changeWin’ and ‘changeLose’? We seem to be assuming that these variables are complementary (they combine, in different dimensions, to sum to the number of simulations ‘s’). *But this ‘inverse’ argument is very much the central point of the solution: we can hardly build it into the simulation.*
- … and a good few more …

**In particular, in this case, the observation that changing your guess completely reverses the outcome (losing becomes winning and winning becomes losing) is central to the argument that changing is a good idea because it reverses the relative probabilities. But this is essentially what we’re trying to prove. We really can’t start with it as an assumption when we write the code.**

These are huge assumptions if our aim is to convince an independent observer. So, considering both problems together, is there a simple solution to this? Can we make our code both unambiguous *and* remove assumptions?

Well, taken separately, we often can … *just about*. Writing our code to be as simple as possible – breaking down every line to its simplest components – can both make it clear to anyone we’re trying to convince *and* help *us* to work one step at a time through the problem … *not* mistakenly start with the solution (in other words, start with an assumption of what we’re trying to prove). It gets harder when we try for both together. How *can* we be sure we’re coding the *problem*, not the *solution*? *It ain’t easy!*
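By way of example, here’s what a more laboriously spelled-out Monty Hall simulation might look like – doors found by explicit search rather than arithmetic shortcuts, both strategies played out in full, and all four counts kept independently. (The shape is ours; treat it as a sketch, not gospel.)

```csharp
using System;

class MontyHallExplicit
{
    // Play s games; return { stickWin, stickLose, changeWin, changeLose },
    // each counted independently -- no totals assumed.
    public static int[] Play(int s, Random r)
    {
        int stickWin = 0, stickLose = 0, changeWin = 0, changeLose = 0;
        for (int game = 0; game < s; game++)
        {
            int car = r.Next(1, 4);              // doors 1-3
            int guess = r.Next(1, 4);
            int open;                            // host: neither the car nor the guess,
            do open = r.Next(1, 4);              // found by trial rather than arithmetic
            while (open == car || open == guess);
            int switched = 0;                    // the switch door: neither guessed nor opened
            for (int d = 1; d <= 3; d++)
                if (d != guess && d != open) switched = d;
            if (guess == car) stickWin++; else stickLose++;
            if (switched == car) changeWin++; else changeLose++;
        }
        return new[] { stickWin, stickLose, changeWin, changeLose };
    }

    static void Main()
    {
        var c = Play(3000, new Random());
        Console.WriteLine($"stick:  {c[0]} wins, {c[1]} losses (sum {c[0] + c[1]})");
        Console.WriteLine($"change: {c[2]} wins, {c[3]} losses (sum {c[2] + c[3]})");
    }
}
```

Now ‘stickWin + stickLose’ and ‘changeWin + changeLose’ can each be checked against ‘s’, and the reversal of fortunes is *observed* rather than assumed.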

This has been a fairly lengthy post so that’s probably enough for now, but we clearly (*whose* ‘clearly’?) haven’t finished …

*Which means, at some point in the future, we’re going to have to discuss program verification*, aren’t we?

## So what do you think?