3. Production Process Characterization
3.4. Data Analysis for PPC

3.4.7. What do I do if my assumptions are not true?

Check the normality of the data. Many of the techniques discussed in this chapter, such as hypothesis tests, control charts and capability indices, assume that the underlying structure of the data can be adequately modeled by a normal distribution. Many times we encounter data where this is not the case.
There are several things that could cause the data to appear non-normal. Some causes might be:
  • The data comes from two or more different sources. This type of data will often have a multi-modal distribution. This can be solved by identifying the reason for the multiple sets of data and analyzing the data separately.
  • The data comes from an unstable process. This type of data is nearly impossible to analyze because the results of the analysis will have no credibility due to the changing nature of the process.
  • The data was generated by a stable, yet fundamentally non-normal mechanism. For example, particle counts are non-normal by the very nature of the particle generation process. Data of this type can be handled using transformations.
We can sometimes transform the data to make it look normal. For this last case, we generally have two types of transformations to try. The first one is known as standardizing the data. Here we calculate the mean and standard deviation of the data and then, for each data value, subtract the mean and divide by the standard deviation. This produces a standardized data set on which we can continue our analysis.
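As a minimal sketch of the standardizing step just described, the Python below subtracts the mean and divides by the standard deviation. The function name `standardize`, the choice of the sample standard deviation (n - 1 denominator), and the five measurements are assumptions made for illustration; they are not taken from the handbook.

```python
import statistics


def standardize(values):
    """Return (value - mean) / standard deviation for each value."""
    mean = statistics.mean(values)
    # Assumption: sample standard deviation (n - 1 denominator).
    sd = statistics.stdev(values)
    if sd == 0:
        raise ValueError("standard deviation is zero; cannot standardize")
    return [(v - mean) / sd for v in values]


# Hypothetical measurements, for illustration only.
data = [12.0, 15.0, 11.0, 18.0, 14.0]
z = standardize(data)
```

The standardized values have mean 0 and standard deviation 1. Note that standardizing is a linear rescaling, so it changes the location and scale of the data but not the shape of its histogram.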
The other option is to transform the data using what is known as a power transformation. The power transformation is given by the equation:

$$Y' = \begin{cases} Y^{\lambda}, & \lambda \neq 0 \\ \ln Y, & \lambda = 0 \end{cases}$$

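where Y is the data and lambda is the transformation value. Lambda typically takes a value between -2 and 2. Some of the more common values for lambda are 0, 1/2, and -1, which give the following transformations:
  • lambda = 0: the natural log, $Y' = \ln Y$
  • lambda = 1/2: the square root, $Y' = \sqrt{Y}$
  • lambda = -1: the inverse, $Y' = 1/Y$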
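A short Python sketch of the power transformation may make the role of lambda concrete. The function name `power_transform` and the sample values are hypothetical, and the sketch assumes that lambda = 0 denotes the natural log and that all data values are strictly positive.

```python
import math


def power_transform(values, lam):
    """Apply Y**lam to each value; lam == 0 is taken to mean ln(Y)."""
    # Assumption: reject non-positive values, since ln and the inverse
    # are undefined at zero.
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive; add a constant first")
    if lam == 0:
        return [math.log(v) for v in values]
    return [v ** lam for v in values]


# Hypothetical data, for illustration only.
counts = [1.0, 4.0, 9.0, 100.0]
log_counts = power_transform(counts, 0)      # natural log
sqrt_counts = power_transform(counts, 0.5)   # square root
inv_counts = power_transform(counts, -1)     # inverse
```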
The general algorithm for making non-normal data appear to be normal is to:
  1. Determine if the data is non-normal (use a normal probability plot and a histogram).
  2. Find a transformation that makes the data look approximately normal. Some data sets may include zeros (e.g., particle data). Because the log and inverse transformations are undefined at zero, if the data set does include zeros you must first add a constant value to the data and then transform the result.
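The two steps above can be sketched in Python. The handbook judges normality visually from plots; as a numerical stand-in, this sketch computes the correlation between the sorted data and approximate normal quantiles, which is close to 1 when a normal probability plot would look straight. The function names, the Blom plotting positions, the default shift of 1, and the count values are all assumptions made for illustration.

```python
import math
from statistics import NormalDist, mean


def probability_plot_correlation(values):
    """Correlation between sorted data and approximate normal quantiles."""
    n = len(values)
    ordered = sorted(values)
    # Assumption: Blom plotting positions (i - 0.375) / (n + 0.25).
    quantiles = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25))
                 for i in range(1, n + 1)]
    mq, mv = mean(quantiles), mean(ordered)
    sqv = sum((q - mq) * (v - mv) for q, v in zip(quantiles, ordered))
    sqq = sum((q - mq) ** 2 for q in quantiles)
    svv = sum((v - mv) ** 2 for v in ordered)
    return sqv / math.sqrt(sqq * svv)


def compare_transformations(values, shift=1.0):
    """Score the raw data and its ln, sqrt and inverse transforms."""
    positive = list(values)
    if min(values) <= 0:
        # Add a constant before transforming when zeros are present.
        positive = [v + shift for v in values]
        if min(positive) <= 0:
            raise ValueError("shift is too small to make all values positive")
    candidates = {
        "raw": list(values),
        "ln": [math.log(v) for v in positive],
        "sqrt": [math.sqrt(v) for v in positive],
        "inverse": [1.0 / v for v in positive],
    }
    return {name: probability_plot_correlation(vals)
            for name, vals in candidates.items()}


# Hypothetical particle counts (note the zero), for illustration only.
counts = [0, 3, 5, 8, 12, 20, 35, 60, 110, 250]
scores = compare_transformations(counts)
best = max(scores, key=scores.get)  # transformation with the straightest plot
```

A score like this is only a rough screening aid; the histogram and normal probability plot should still be inspected before settling on a transformation.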
Example: particle count data. As an example, let's look at some particle count data from a semiconductor processing step. Count data is inherently non-normal. Below are histograms and normal probability plots for the original data and for the natural log (ln), square root and inverse of the data. You can see that the log transform does the best job of making the data appear normal. All analysis can then be performed on the log-transformed data, for which the normality assumption is approximately satisfied.
The original data is non-normal; the log transform looks fairly normal.
[Figure: histogram and normal probability plot of the particle data; histogram and normal probability plot of the log of the particle data]
Neither the square root nor the inverse transformation looks normal.
[Figure: histograms and normal probability plots of the square root and inverse transformations of the particle data]