# A Christmas Cracker Algorithm!

T’is the season to be jolly … and silly.  So here’s a seasonal and jolly silly example of why it’s hard to implement high-level languages efficiently.  Liberties are taken with the hardware/software relationship in some parts of the analogy but, hey, it’s Christmas!

Let’s write an algorithm for pulling a Christmas cracker …

• Buy a box of crackers and bring home
• Take a cracker out of the box
• Get two people to hold an end each
• Pull in opposite directions
• Have fun with what’s inside
• Clear up the mess

That’s probably going to be enough detail for most people (and more than enough for some).  However, if you’re the one that’s been entrusted with the initial purchase or the child told to do the clearing up, you might want a bit more to go on; what’s actually involved in that bit?  And who are the ‘two people’ anyway?  OK then, if needed, we can easily expand this a touch …

• Buy a box of crackers and bring home
• Find a box that’s big enough for everyone
• Pay for the crackers
• Put in a bag
• Take home
• Take a cracker out of the box
• Find a cracker in the box
• Place between person X and person Y
• Get two people (X and Y) to hold an end each
• X grasps one end
• Y grasps the other end
• X and Y pull in opposite directions
• Have fun with what’s inside
• X puts on paper hat
• Y plays with plastic toy
• X asks Y the awful joke
• Clear up the mess
• Pick up the remains of the cracker
• Pick up the toy and joke
• Throw in the bin with the paper hat
• Wipe the table

Here, each black bullet is expanded – explained further – by the indented white bullets.  This is one common way to design algorithms.  But, OK, this is hardly perfect.  There’s still a bit left to the imagination and different homes have different variants.  We’re not trying to impose anything here – and it’s certainly not necessary to throw everything away immediately after pulling the cracker.  This is only a silly example, remember, although, in fact, it is a pretty decent expansion of precisely how X and Y would pull one cracker.  If the steps above are the essential ‘Christmas code’ for the operation then we can define a simplified ‘Xmas code’ operation, PullCracker(X, Y), as follows:

PullCracker(X, Y) =

1. Find a box that’s big enough for everyone
2. Pay for the crackers
3. Put in a bag
4. Take home
5. Find a cracker in the box
6. Place between person X and person Y
7. X grasps one end
8. Y grasps the other end
9. X and Y pull in opposite directions
10. X puts on paper hat
11. Y plays with plastic toy
12. X asks Y the awful joke
13. Pick up the remains of the cracker
14. Pick up the toy and joke
15. Throw in the bin with the paper hat
16. Wipe the table

Now, the great thing about Xmas code, rather than Christmas code, is that we can issue simplified Xmas code commands like PullCracker(Sue, Ken) without having to worry about the detail of the underlying Christmas code.  Christmas code is really a ‘low-level’ description of the algorithm and Xmas code the ‘high-level’ equivalent.  Sue and Ken, in a high-level Xmas code program simply become X and Y in the low-level Christmas code program that’s actually carried out for real.  We say PullCracker(Sue, Ken) for convenience, knowing that, ‘on the ground’, it’s steps 1-16 with X=Sue, Y=Ken that actually get the job done.  Great; good system.  What could go wrong?

Well, things start to look a bit iffy when we remember that there are a number of crackers in the box and we’re probably going to want to pull several of them.  Just extending one cracker (for two people) to two (for four), for the time being, gives a simple enough Xmas code program of

• PullCracker(Sue, Ken);
• PullCracker(Dave, Jo);

but how does this now translate to the underlying Christmas code?  Well, if we carry out the algorithm exactly then we run steps 1-16 twice in two separate blocks.  The ‘whole of’ PullCracker(Sue, Ken) comes before the ‘whole of’ PullCracker(Dave, Jo) so steps 1-16 are completed once for X=Sue, Y=Ken then completed again for X=Dave, Y=Jo.  That’s every step twice; and that’s pretty daft …

Why?  Well, for a start, we only need to run steps 1-4 once, at the beginning, for the whole box of crackers – we don’t want to go to the shop again between PullCracker(Sue, Ken) and PullCracker(Dave, Jo).  There’s no need because we already have the crackers.  We’d also only do step 16 once at the end as well; the poor put-upon child would feel even more put-upon if we made them wipe the table twelve times after each cracker from a box of twelve.  Also, in practice, steps 5 & 6 and 13-15 would probably be combined into a more efficient ‘before’ and ‘after’ operation – while laying the table and the big clean-up afterwards.  It’s really only steps 7-12, that’s six of the sixteen steps, that actually have to be executed sequentially in this manner for each cracker.  The implementation of the Xmas code for one cracker in Christmas code has become dreadfully inefficient for two or more but there’s not much we can do about it at this level because it has to be that way to work for just the one.  We can’t not buy the crackers in the first place or tidy up afterwards.  Our lovely simplicity has cost us in efficiency.  And it obviously just gets worse for more crackers.

And – and here’s the lesson – that’s how it is with real computer code too.  The extraordinary power of high-level languages is the result of combining many low-level, or machine code, instructions into each high-level operation.  The apparently simple operation, C = A + B, hides a complex sequence of low-level instructions in which system states have to be saved, registers initialised, data retrieved and loaded from memory, fed through arithmetic circuits, results stored back to memory and systems restabilised.  It’s all entirely necessary when we consider these operations in isolation precisely so we can run more than one operation without them disrupting each other’s data.  However, when we run, say, C = A + B followed by D = A – B, many of the low-level instructions seem to be unnecessary because the data to be worked on (A and B) is (or was) already where it needed to be and much of the system state could have been left alone in between.  In ‘cracker’ terms, we’re fetching data we already had and clearing up only to make the mess again.

And this, in a nutshell, is why high-level languages are generally less efficient than low-level ones.  The power and ease-of-use of high-level languages is paid for by inefficient implementation at the lower level.  High-level instructions have to be complete; they can’t make assumptions about the states of registers or leave loose ends after they’ve finished but often, in practice, that means doing things at the lower-level that don’t need to be done in many – even most – cases.  It’s generally accepted that, if you want to write truly efficient code – if you really want to wring the very last drop of performance from a computer, then you’re better off writing low-level code … if you can.  In the cracker example, it’s pretty straightforward to rewrite the Christmas code to pull several crackers efficiently – the result is just long, and inelegant.

The problem is, there are a lot more high-level language programmers than low-level ones because high-level programming is, well, sort of easy.  More precisely, high-level programming gets a lot done – quickly – without worrying about too much fine detail or the need to understand the underlying machine architecture.  Nearly all code these days is written in high-level languages, which means nearly all of it isn’t as efficient as it could be.  Exactly, how bad it is depends on a number of features of the problem being solved (see How to Write a Really Bad Program) but there’s always room for improvement somewhere.

And, yes, this is a real problem.  So, are there solutions?  Well, yes, there are many, but none are perfect.  Going back to the cracker analogy, there are various ways we might try to speed up the process.  The obvious way would be to pull crackers simultaneously, parallel processing in a computational sense; but this would mean having to have duplicates of key resources.  This means multiple registers and arithmetic units – or complete processors – in the computer while, with the crackers, it gets people pulling crackers two or more pairs at a time.  That might seem OK on the surface but it doesn’t work if we try to execute the Xmas code operations PullCracker(Sue, Ken) and PullCracker(Sue, Jo) at the same time.  (Well, it might at a push, if Sue has two free hands, but it would involve restructuring the underlying Christmas code and that would defeat the object.  Anyway three simultaneous operations involving the same person would definitely mess things up.)

Another approach (pipelining, to give it its posh term) could involve queuing up Christmas code steps to use key resources.  This is a bit like singing a song ‘in a round‘.  Various Xmas code operations would be at different steps of Christmas code at any one time.  The result would be that crackers could be pulled, not exactly simultaneously but, in quick succession if all the necessary preparation was carried out in advance.  This is also good in theory but can fail in practice if the execution of one Xmas code operation depends on the result of previous operations.  For example, John might only want to pull another cracker if he didn’t like the hat in the first.  We’d have to get a fair way through the previous pipeline before we knew what to do.  How would we cope with this efficiently?  Guess?  Wait?  Do it anyway just in case?  This is precisely the decisions that have to be made when designing computer architectures and their low-level instruction sets.

There are numerous other partial solutions too.  And, in fact, modern processor design is likely to be a complex mixture of all of them; in many ways mirroring the near anarchy of a typical Christmas dinner – apparent chaos but somehow getting the job done.  In the real world, crackers do get pulled and, in the computer world, high-level code does get implemented, often within a few percent of the efficiency of the equivalent (i.e. doing the same job – not the direct translation) low-level code.  But it’s messy; blimey is it messy?

Here endeth the Christmas lesson.  A final note though: this example also works pretty well – possibly better really – with ‘feeding the cat’; but never mind – it’s Christmas.  The following exercise is ‘left to the reader’: Write a ‘kitchen code’ algorithm for the ‘house code’ instruction, FeedCat(X), which finds a tin of cat food, prepares the necessary hardware, gives X his/her dinner and cleans up afterwards.  Then consider the efficiency of the two house code instructions

• FeedCat(Tom);
• FeedCat(Sam);

compared to the underlying kitchen code.

Merry Christmas and a Happy New Year!