Friday, June 21, 2013

Quantitative vs Qualitative Data

Back on March 29, 2013, I blogged about how to use histograms as part of Statistical Process Control.  The histogram I used dealt with numerical (quantitative) data.  However, there is another type of data: qualitative.  How do we plot this qualitative information?

Let's first review the previous post, which covered histograms of quantitative data.

In essence, a histogram is a collection of data points from a set of samples - let's say, for example, the weights of jelly filled donuts.  A reasonably accurate histogram takes about 50 data points.  These data points outline the variability (measurement distribution) in the process - the variation of donut weights in our example.  For most processes, we will see a "normal" or bell shaped curve, as shown below.  If you see multiple peaks, there is something affecting your variation that needs investigating.


 
Constructing the above histogram takes a few steps.  Step one is to gather your data.  Steps two and three are to determine the data range and the number of classes (columns in the graph).  Step four is to collect the data into a frequency table (the counts behind the columns that will be used in the graph).  The last step is to plot the columns as your histogram.  I do have to say that Excel does a great job of taking your data and quickly sorting it into a histogram-ready form.  So here is our example below.
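For anyone who would rather script these steps than use Excel, here is a minimal Python sketch of the same procedure.  The donut weights are randomly generated, hypothetical numbers (not real measurements), the function name is made up for illustration, and the square-root rule for choosing the number of classes is an assumed rule of thumb rather than something the steps above prescribe.

```python
import math
import random

def frequency_table(values, num_classes=None):
    """Return (class_start, class_end, count) rows for equal-width classes."""
    if num_classes is None:
        # Assumed rule of thumb: roughly sqrt(n) classes.
        num_classes = math.ceil(math.sqrt(len(values)))
    low, high = min(values), max(values)       # the data range
    width = (high - low) / num_classes         # width of each class
    if width == 0:
        # Every measurement is identical, so one class holds everything.
        return [(low, high, len(values))]
    counts = [0] * num_classes
    for v in values:                           # tally into the frequency table
        # The largest value belongs in the last class, not one past it.
        index = min(int((v - low) / width), num_classes - 1)
        counts[index] += 1
    return [(low + i * width, low + (i + 1) * width, counts[i])
            for i in range(num_classes)]

# Gather the data: 50 hypothetical donut weights in grams (made-up values).
rng = random.Random(0)
weights = [rng.gauss(85.0, 2.0) for _ in range(50)]

# Plot the columns, here as a simple text histogram.
for start, end, count in frequency_table(weights):
    print(f"{start:6.1f} to {end:6.1f} g | {'#' * count}")
```

Each printed row is one column of the histogram, so a bell shape (or a suspicious second peak) shows up directly in the lengths of the `#` bars.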

OK - the above covers quantitative histograms, i.e. the use of measurement data.  A histogram is great for this type of information.  Qualitative information is a bit different.  With qualitative data we talk in terms of categories such as hair colour, gender, or type of donut purchased.  The information does not lend itself to sorting into a chart such as a histogram.  In real life, an example of this type of data is customer complaints received by customer service.  Yes, there is numerical information (# of jelly donuts with no jelly), but the type of information is qualitative.  This type of data lends itself to Pareto charts, general bar graphs, and pie graphs.  An explanation of a Pareto chart is below.

A Pareto chart is a very simple method of data analysis that, at its heart, is the 80 / 20 rule: roughly 80% of the problems come from 20% of the causes.  Pareto charts can be used in charting events like customer complaints, where there are several types of issues.  The best way to proceed is to first collect the complaints and then organize them from most to least frequent.  Excel can do this from a complaint list that is summarized in a pivot table.  This pivot table can be charted, and you can also add a line that tallies the cumulative frequency of the complaints.  The issues that together make up the first 80% of the complaints are the ones to focus on.

The above is a quick example of jelly donut issues at a retail store.  The left axis is the number of complaints and the right axis is the cumulative frequency of complaints, as a percentage of the total.  Ordered in this way, it is easy to see that "filling squirting out" is the largest complaint, but is this the only complaint to focus on?  The cumulative frequency reaches 80% at "donut crumbled".  So, to correct 80% of the problems we need to solve both the jelly squirting and donut crumbling issues.
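The same sort-and-tally arithmetic can be sketched in a few lines of Python.  The complaint counts below are hypothetical, chosen only so that the cumulative line reaches 80% at "donut crumbled" as in the example; the categories beyond the two named above, and the function names, are made up for illustration.

```python
from collections import Counter

def pareto_table(complaints):
    """Return (category, count, cumulative_percent) rows, most frequent first."""
    counts = Counter(complaints).most_common()  # plays the role of the pivot table
    total = sum(n for _, n in counts)
    rows, running = [], 0
    for category, n in counts:
        running += n
        rows.append((category, n, 100.0 * running / total))
    return rows

def vital_few(rows, threshold=80.0):
    """Return the categories needed to reach the cumulative threshold."""
    selected = []
    for category, _, cumulative in rows:
        selected.append(category)
        if cumulative >= threshold:
            break
    return selected

# Hypothetical complaint log (made-up counts, one entry per complaint).
hypothetical_counts = {
    "filling squirting out": 45,
    "donut crumbled": 35,
    "no jelly": 10,
    "stale": 6,
    "other": 4,
}
complaint_log = [c for c, n in hypothetical_counts.items() for _ in range(n)]

rows = pareto_table(complaint_log)
for category, n, cumulative in rows:
    print(f"{category:22s} {n:3d} {cumulative:6.1f}%")
print("Focus on:", vital_few(rows))
```

The bars of the chart are the counts in the second column, and the line on the right axis is the cumulative percentage in the third.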
