
An Origin Story: Guinness and Statistics

Statistics are all around us. Millions of us care about the outcome of elections, which are based on statistics relating to votes. However, few people think about statistics while supping on a pint of Guinness.

One of the most common statistical tests used is the Student’s t-test. You might think this test was developed to model the weather or monitor disease. But no. Its origins are far more vital to our daily lives than that. This test was developed to ensure the delicious taste of Guinness is maintained.

Guinness is a simple beverage made from four ingredients: water, barley, hops, and yeast. Of course, there are closely guarded secrets regarding the production process, but the key is in striking the perfect balance of these ingredients to recreate that distinctive, creamy stout flavour in mass production.

The quality control exerted by Guinness today is plain to see. They have dedicated testers whose job it is to make regular visits to every bar in Ireland that sells Guinness to ensure its high standards are being met.

However, to understand the origins of this level of quality control, we must go back to the early 1900s, when Guinness was already a world leader in the brewing industry and was keen to maintain that status.

They succeeded through sheer innovation, recruiting the brightest academic minds to apply the scientific method to the brewing process, including one of the most underrated and greatest minds of the twentieth century: William Sealy Gosset (1876-1937).

Gosset graduated from the University of Oxford in 1899 with first-class degrees in mathematics and chemistry and went on to work for Guinness as a Master Brewer. Through his advancements in statistics, he revolutionised the process of quality control for mass production in industry, and his statistics remain amongst the most widely used today.

The first statistics test I was taught while studying Psychology in the 1990s at the University of Glasgow was the Student’s t-test. This is also the first statistics test I teach students at ATU Sligo nowadays.

It might be tempting to assume that the name of the test is derived from this fact: it is the first statistical test taught to students. This is not the case. ‘Student’ was a pseudonym used by Gosset because Guinness employees had to publish their work anonymously to prevent their competitors from getting a whiff of the highly innovative scientific methods they were implementing.

One of the early problems Gosset encountered at Guinness is at the very core of the development of the Student’s t-test. Guinness was already creating huge quantities of the black stuff that was being distributed worldwide and the bosses were concerned with ensuring that every batch they distributed complied with their lofty standards.

Back then it was typical for craftspeople to test every individual product before distribution. Even though there would have been a queue of Irish folk willing to do this job for Guinness, stretching from St. James’s Gate in Dublin to the Cork coastline, this was simply not tenable (or advisable).

There had to be some way to test the quality of Guinness that did not render half of Ireland drunk, and Gosset’s bosses turned to him and his mathematical mind.

Gosset recognised this as an academic problem. Even though he had sampling techniques to work with, there was a major issue with the statistical tools available.

Statistics were literally a century behind the statistics of today. By the turn of the century in 1900, statisticians understood the general phenomenon of the bell-shaped normal distribution. This distribution captures the order of most naturally occurring phenomena such as the height of people, hence the term ‘normal’. Most people’s heights are bundled around the middle peak of the distribution, which represents the mean or average height, with fewer smaller people to the left of the peak and fewer taller people to the right of the peak. This is the natural order of things.

Statisticians had developed mathematical proofs for the ‘standard normal distribution’ by 1900. However, these proofs were based on measures from whole populations and large sample sizes.

This was a major issue for Gosset and Guinness in an industry where the vast majority of product is to be sold for profit, with the smallest quantities possible to be used for testing purposes.

Gosset started with a rudimentary sampling procedure, using a large dataset of malt extract measurements. Because the dataset contained a very large number of measurements, he could be relatively confident about the true content of the extract. He first drew many two-observation samples from it to test how accurate such small samples were.

He noticed that samples of two measurements came within 0.5 units of the target measure of contents about 80% of the time; with three measurements the chance was 87.5%; and with four measurements, better than 92%. With eighty-two measurements, getting within 0.5 units was practically certain.
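For readers who like to tinker, the flavour of this experiment can be imitated with a short simulation. This is only a sketch, not a reconstruction of Gosset's actual data or method: it assumes the measurements are normally distributed, that the figures above refer to the sample average, and that the spread of a single measurement is 0.55 units (a hypothetical value chosen so that two-measurement samples land within 0.5 units roughly 80% of the time). The function name `share_within` is likewise made up for illustration.

```python
import random
import statistics


def share_within(n, tolerance=0.5, sigma=0.55, trials=10_000, seed=1):
    """Estimate how often the average of n measurements lands
    within `tolerance` units of the true value.

    sigma is an ASSUMED spread for a single measurement; it does not
    come from Gosset's records.
    """
    rng = random.Random(seed)
    true_value = 0.0  # the target measure, set to zero for convenience
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_value, sigma) for _ in range(n)]
        if abs(statistics.fmean(sample) - true_value) < tolerance:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for n in (2, 3, 4, 82):
        print(f"{n:>2} measurements: {share_within(n):.1%} within 0.5 units")
```

Under these assumptions the share climbs steadily as more measurements are averaged, which is the pattern Gosset reported, and by eighty-two measurements a miss is vanishingly rare.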

Gosset’s bosses at Guinness were ecstatic. This would allow them to make intelligent decisions with confidence to maintain quality control of their stout better than anyone else in the beer business.

However, Gosset was not satisfied with this approach, which he viewed as too much of an approximation, so he sought to decipher the exact mathematics underlying small sample sizes.

He convinced Guinness to allow him to take a year’s sabbatical, during which he joined the lab of Professor Karl Pearson at University College London. There, he worked out the numbers, showing mathematically that averages from small samples do not quite follow the normal distribution but a related one with heavier tails, now known as Student’s t-distribution, and formally developed the Student’s t-test that is still used throughout the world today.
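At its heart, the one-sample version of the test compares how far a sample average sits from a target value with how much that average would be expected to wobble by chance. A minimal sketch of the calculation follows. The four readings and the target of 12.0 are invented purely for illustration, and the function name `t_statistic` is hypothetical.

```python
import math
import statistics


def t_statistic(sample, target):
    """One-sample t statistic: (sample mean - target) / (s / sqrt(n)),
    where s is the sample standard deviation (n - 1 in the denominator)."""
    n = len(sample)
    mean = statistics.fmean(sample)
    s = statistics.stdev(sample)
    return (mean - target) / (s / math.sqrt(n))


if __name__ == "__main__":
    readings = [11.8, 12.3, 12.1, 11.9]  # made-up measurements from one batch
    print(t_statistic(readings, target=12.0))
```

The resulting number is then compared against the t-distribution with n - 1 degrees of freedom (three, for four readings). A value close to zero, as here, gives no reason to think the batch differs from the target.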

So, whether you are a student of statistics, a Guinness drinker, or someone who has benefitted from real-life application of his statistics (that is all of you), Gosset deserves your gratitude. The world we live in today would be a very different place without Gosset or Guinness.

by Martin O’Neill