Introduction: I wanted to write a few blog posts on simple statistics that can be used in the food industry. I do not mean for this to be an exhaustive paper on the various methods and techniques, but to give a general idea of what can be used and how. Over the course of the next few months, I will be putting together about 5 posts that will outline run charts, histograms, confidence levels, sample sizes, and other items. This is the first of these posts.
Quick reference: For many years, I have used a reference guide by GOAL/QPC called the Memory Jogger (http://www.goalqpc.com/). GOAL/QPC is a non-profit organization, but you still have to pay for their material. There are several free sample guides (on a number of different topics) that have a lot of great information.
There are many sites that will sell you programs to create charts and diagrams; you can Google them at will. However, here is a link to a set of YouTube videos on how to use Excel 2007 to construct your own charts: http://www.youtube.com/watch?v=03-8vtwCW9c . The link will take you through a four-minute session on creating a run chart. At the end of the video are other topics that I hope you will find interesting.
Histograms:
Histograms are data plots that can give you a lot of information. They can be complex and a bit scary at first, so let Excel help you do the heavy lifting. The videos mentioned above will show you how to create a chart in Excel, and that is the program I will use for this demonstration (personally, I am using Excel 2010 Pro).
In essence, a histogram is a plot of a collection of data points from a set of samples - let's say, for example, the weights of jelly-filled donuts. The measurements are grouped into ranges, and the height of each bar shows how many data points fall in that range. It takes about 50 data points to get a reasonably accurate histogram. These data points will outline the variability (measurement distribution) in the process - the variation of donut weights in our example. For most processes, we will see a "normal" or bell-shaped curve, as shown below. If you see multiple peaks, there is something affecting your variation that needs investigating.
You can see that there looks to be a second, smaller peak just to the right of the main peak. This is an indication that you have an issue that needs to be investigated. Personnel training, set-up, or raw material variation are all potential causes of this type of variation. Doing a fishbone diagram and a 5 Whys analysis will help find the root cause.
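If you would rather see what Excel is doing behind the scenes, the grouping step can be sketched in a few lines of Python. This is only a minimal illustration: the donut weights are simulated, and the 85 g target, the 2 g spread, the 1 g range width and the function name histogram_counts are all assumptions made up for the example, not real plant data.

```python
import random
from collections import Counter


def histogram_counts(values, bin_width):
    """Group values into ranges of bin_width and count how many fall in each."""
    # Each value is assigned to the range that starts at a multiple of bin_width.
    counts = Counter(int(v // bin_width) for v in values)
    first, last = min(counts), max(counts)
    # Keep empty ranges too, so gaps between peaks stay visible.
    return [(i * bin_width, counts[i]) for i in range(first, last + 1)]


# Simulated data: 50 hypothetical donut weights in grams (assumed values).
rng = random.Random(42)
weights = [rng.gauss(85, 2) for _ in range(50)]

# Print a simple text histogram, one row per 1 g range.
for start, count in histogram_counts(weights, 1):
    print(f"{start:>3} to {start + 1:<3} g  {'#' * count}")
```

Each row is one bar of the histogram turned on its side. A single hump suggests one consistent process; two separate humps are the kind of second peak described above.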
One more thought: Excel will also calculate the standard deviation of the histogram data. The standard deviation tells you where the upper and lower control limits can be set, provided you have a clean data set from a process that is running effectively. You can set the upper and lower control limits where you wish; however, limits at 3 standard deviations (three-sigma) on either side of the average will contain about 99.7% of the product on a normal curve.
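The three-sigma arithmetic can also be sketched in Python, for anyone who wants to check the numbers outside Excel. This is a rough sketch under stated assumptions: the ten weights are hypothetical values chosen for illustration (a real calculation would use the full data set of about 50 points), and three_sigma_limits is a made-up helper name. The sample standard deviation used here corresponds to Excel's STDEV.S function.

```python
import statistics


def three_sigma_limits(values):
    """Return (lower limit, average, upper limit) at 3 standard deviations."""
    average = statistics.mean(values)
    # Sample standard deviation (n - 1 in the denominator), like Excel's STDEV.S.
    sigma = statistics.stdev(values)
    return average - 3 * sigma, average, average + 3 * sigma


# Hypothetical donut weights in grams (illustrative values, not real measurements).
weights = [84.1, 85.3, 86.0, 83.7, 85.9, 84.8, 85.2, 86.4, 84.5, 85.1]

lower, average, upper = three_sigma_limits(weights)
print(f"Average: {average:.2f} g")
print(f"Control limits: {lower:.2f} g to {upper:.2f} g")

# Share of a normal curve that lies within 3 standard deviations of the average.
normal = statistics.NormalDist()
coverage = normal.cdf(3) - normal.cdf(-3)
print(f"Coverage within three-sigma: {coverage:.1%}")
```

The last line is where the 99.7% figure comes from. It only holds if the data really follow a bell-shaped curve, which is why the histogram is worth checking first.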