This is only marginally about beer, but since I often read through data, statistics and scientific reports, causation and correlation have become subjects of great interest to me. This is a Slideshare by Mark Madsen, a research analyst with Third Nature in Portland, Oregon. Apparently, instructors in business and marketing courses often tell a tale about a correlation between the sales of beer and diapers to illustrate thinking in new ways and how seemingly unrelated items might be connected, or could be connected by a savvy company. Having worked retail for many years at various stages of my life, I find the science of getting a customer’s attention through shelf placement, cross-merchandising and other strategies fascinating, in part because it’s a window into human nature itself. In his presentation, Beer, Diapers, and Correlation: A Tale of Ambiguity, Madsen examines the oft-told story of a correlation between beer and diapers and tries to find out its origin and whether or not it’s actually true.
The story of the correlation between beer and diaper sales is commonly used to explain product affinities in introductory data mining courses. Rarely does anyone ask about the origin of this story. Is it true? Why is it true? What does true mean anyway?
The latter question is the most interesting because it challenges the ideas of accuracy in data and analytic models.
This is the real history of the beer and diapers story, explaining its origins and truth, based on repeated analyses of retail data over two decades. It will show that one can have multiple contradictory results from analytic models, and how they can all be true.