3. Production Process Characterization
3.4. Data Analysis for PPC
3.4.7. What do I do if my assumptions are not true?

Check the normality of the data.
Many of the techniques discussed in this chapter,
such as hypothesis tests, control charts and capability indices, assume
that the underlying structure of the data can be adequately modeled by
a normal distribution. In practice, we often encounter data for which this
is not the case.

There are several things that could cause the data
to appear non-normal. Some causes might be:
- The data comes from two or more different sources. This type of data will
  often have a multi-modal distribution. This can be solved by identifying
  the reason for the multiple sets of data and analyzing the data separately.
- The data comes from an unstable process. This type of data is nearly impossible
  to analyze because the results of the analysis will have no credibility
  due to the changing nature of the process.
- The data was generated by a stable, yet fundamentally non-normal mechanism.
  For example, particle counts are non-normal by the very nature of the particle
  generation process. Data of this type can be handled using transformations.

We can sometimes transform the data to make it look normal.

For this last case, we generally have two types of
transformations to try. The first is known as standardizing the data.
We calculate the mean and standard deviation of the data
and then, for each data value, subtract the mean and divide by the standard
deviation. This produces a standardized data set, with mean 0 and standard
deviation 1, on which we can continue our analysis. Note that standardizing
only shifts and rescales the data; it does not change the shape of the
distribution.

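The standardizing step described above can be sketched in a few lines. This is a minimal illustration, assuming the sample standard deviation (n - 1 divisor) is the one intended; the function name `standardize` and the input values are hypothetical and do not come from the handbook.

```python
import statistics


def standardize(values):
    """Subtract the mean from each value and divide by the standard deviation."""
    mean = statistics.fmean(values)
    # Assumption: the sample standard deviation (n - 1 divisor) is intended.
    sd = statistics.stdev(values)
    if sd == 0:
        raise ValueError("standard deviation is zero; data cannot be standardized")
    return [(x - mean) / sd for x in values]


# Illustrative values only (not taken from the handbook).
z = standardize([2.0, 4.0, 4.0, 5.0, 7.0, 9.0])
```

The result `z` has mean 0 and standard deviation 1 (up to rounding), while the relative spacing of the points is unchanged.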
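The other option is to transform the data using
what is known as a power transformation. The power transformation
is given by the equation:

$$Y^{(\lambda)} = Y^{\lambda}$$

where Y is the data and lambda is the transformation
value. Lambda is typically a value between -2 and 2. Some of the more
common values for lambda are 0, 1/2, and -1, which give the following
transformations (by convention, lambda = 0 denotes the natural log, since
Y raised to the power 0 would be constant):

$$\lambda = 0:\ \ln(Y), \qquad \lambda = 1/2:\ \sqrt{Y}, \qquad \lambda = -1:\ 1/Y$$
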
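As a rough sketch of the power transformation, the function below applies Y raised to the power lambda and treats lambda = 0 as the natural log. The name `power_transform` and the requirement that all values be strictly positive are assumptions made for this illustration, not part of the handbook text.

```python
import math


def power_transform(values, lam):
    """Return each value raised to the power lam; lam == 0 means natural log."""
    # Assumption: restrict to strictly positive data so that the log,
    # fractional powers and negative powers are all defined.
    if any(y <= 0 for y in values):
        raise ValueError("values must be strictly positive")
    if lam == 0:
        return [math.log(y) for y in values]
    return [y ** lam for y in values]


# The three common choices of lambda, applied to illustrative values.
data = [1.0, 4.0, 16.0]
logged = power_transform(data, 0)       # ln(Y)
rooted = power_transform(data, 0.5)     # sqrt(Y)
inverted = power_transform(data, -1)    # 1/Y
```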
The general algorithm for making non-normal data
appear to be normal is to:
- Determine if the data is non-normal (use a normal probability plot and
  a histogram).
- Find a transformation that makes the data look approximately normal. Some
  data sets may include zeros (e.g., particle data). Because the log and
  inverse transformations are undefined at zero, you must first add a
  positive constant to the data and then transform the result.

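The two steps above are normally carried out by eye, using plots. As a hedged numerical stand-in, the sketch below scores each candidate transformation by the correlation between the sorted data and the corresponding normal quantiles, which measures how straight the normal probability plot is. The plotting positions (i - 0.5)/n, the shift constant of 1, the function names and the sample counts are all assumptions made for illustration; they are not the handbook's procedure or data.

```python
import math
from statistics import NormalDist, fmean


def normal_plot_correlation(values):
    """Correlation between sorted data and normal quantiles.

    Values near 1 indicate a nearly straight normal probability plot.
    """
    n = len(values)
    ordered = sorted(values)
    # Assumption: plotting positions (i - 0.5) / n, one common choice.
    quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    mq, my = fmean(quantiles), fmean(ordered)
    sxy = sum((q - mq) * (y - my) for q, y in zip(quantiles, ordered))
    sxx = sum((q - mq) ** 2 for q in quantiles)
    syy = sum((y - my) ** 2 for y in ordered)
    return sxy / math.sqrt(sxx * syy)


def compare_transformations(counts, shift=1.0):
    """Score the original data and its ln, sqrt and inverse transforms."""
    # The shift is a hypothetical constant added so that zeros can be transformed.
    shifted = [c + shift for c in counts]
    candidates = {
        "original": list(counts),
        "ln": [math.log(y) for y in shifted],
        "sqrt": [math.sqrt(y) for y in shifted],
        "inverse": [1.0 / y for y in shifted],
    }
    return {name: normal_plot_correlation(v) for name, v in candidates.items()}


# Made-up counts for illustration only; these are not the handbook's data.
scores = compare_transformations([0, 1, 1, 2, 3, 3, 5, 8, 13, 40])
best = max(scores, key=scores.get)
```

A score like this is only a screening aid. The histogram and normal probability plot should still be inspected before settling on a transformation.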
Example: particle count data.

As an example, let's look at some particle count
data from a semiconductor processing step. Count data is inherently non-normal.
Below are histograms and normal probability plots for the original data and the
ln, sqrt and inverse of the data. You can see that the log transform does
the best job of making the data appear normal. All analyses
can be performed on the log-transformed data, and the normality assumption
will be reasonably well satisfied.

[Figure: histograms and normal probability plots of the original, ln, sqrt and inverse data. The original data is non-normal; the log transform looks fairly normal. Neither the square root nor the inverse transformation looks normal.]