In this episode we explore how one misunderstood small change caused a very big problem…
What do you do if your shiny new nationally-important product starts going wrong in a completely-inexplicable way just before its very-public launch?
That’s what happened back in 1936 with the M1 Garand, the first semi-automatic rifle adopted for general issue by the US Army:
Everything had been fine up until the last moment. The prototypes had worked perfectly; everything had gone right at the field-trials; no problems at all with the first small in-house pre-production run, built on the officially-approved production tooling. Everything set up and ready to go, to produce that first proper production-run of a million or more: a serious contract for serious money, with a lot of pride and prestige on the line.
But with the first few hundred items off the new production-line, supposedly using the exact same tooling, things suddenly started going weirdly wrong…
It just didn’t make sense. As Ian McCollum describes in his video ‘The M1 Garand's Mysterious 7th Round Stoppage’, on the ‘Forgotten Weapons’ YouTube channel, the self-loading mechanism would work perfectly for the first six rounds in the eight-round magazine, but would jam on the seventh round. Not always, and perhaps not every gun - but often enough that it was a huge headache. After all, having one’s weapon suddenly stop working on the battlefield would not be fun for anyone. So yes, something that needed an urgent, urgent fix…
Experiments, experiments; check all the fine details. Why fail on the seventh round, rather than the first or last? And then later, after yet more tests, why was it that this only happened if the magazine had been loaded on one side first, rather than neither or both? More tests, more tests, more and more small changes.
One discrepancy: the two steel guides for the magazine were supposed to be the same height, but they were subtly different in the new production models - one was now slightly shorter than the other. But the guides were the same size on the pre-production models. A difference. A difference that yes, was indeed the cause of the jam.
But why the difference? The production-line used exactly the same tooling as used for the pre-production ones. Or supposed to be the same, anyway. And the documentation from the manufacturers said it was still the same…
It wasn’t.
Someone at the manufacturers had made a small change in their own tooling. Just enough change to nick the top off that magazine-guide on the side that tended to jam. To make things slightly easier for their existing machines, presumably. But they hadn’t told anyone that they’d made that change, hadn’t documented it anywhere: probably thought it didn’t matter. Unfortunately, it did.
Once they’d found all that out, it didn’t take much to fix it. Make another small change to the production-process, so that the jig didn’t trim away that all-important ridge. They were even able to go back to fix the defective items, with a small bit of spot-welding and some attention with a file or some such tool to get it back into the required shape. Reputations saved; trust restored; no harm done. Or not much harm, anyway…
Even so, it’s a good example of how huge challenges can be thrown up by even the smallest change. And this was a relatively simple case: one identifiable problem caused by one identifiable small change. By contrast, in today’s software, built up from layer upon layer of code-libraries, any one of which may be changed without warning, or interact with the others in unexpected ways many layers down in the depths of the code - well, many of those small changes are no fun at all. It may not matter much in a prototype, perhaps - but if it’s something that’s going into production, maybe on a massive scale, it can matter a lot.
Small changes matter: we need to keep track of them as best we can.
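For the software side of that, here is one minimal sketch of what keeping track might look like: compare the set of library versions that was approved against the set actually in use, and report every difference. It is an illustration only, written in Python; the function name `diff_manifests` and the library names and version numbers are all invented for the example, not taken from any real project.

```python
def diff_manifests(approved, current):
    """Return {name: (approved_version, current_version)} for every mismatch.

    A version of None means the library is missing from that manifest.
    """
    changes = {}
    for name in sorted(set(approved) | set(current)):
        before = approved.get(name)
        after = current.get(name)
        if before != after:
            changes[name] = (before, after)
    return changes


# Hypothetical manifests: what was signed off, and what production now runs.
approved = {"feedlib": "1.4.2", "guidelib": "2.0.0"}
current = {"feedlib": "1.4.2", "guidelib": "2.0.1", "trimlib": "0.3.0"}

for name, (before, after) in diff_manifests(approved, current).items():
    print(f"{name}: approved={before} current={after}")
```

The point of the sketch is the same as the point of the story: the change itself may be tiny, but it only becomes findable if something is comparing what was approved with what is actually running.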
A good case in point for DevOps discussions, to ensure that all aspects of a change are understood by operations.
Hopefully a good reminder to Testing teams that the devil is in the detail.