7. Product and Process Comparisons
7.4. Comparisons based on data from more than two processes
Contingency table approach
Industrial example
Contingency table classifying defects in wafers according to type
and production shift
Column probabilities
Row probabilities
Expected cell frequencies
Estimated expected cell frequency when H0 is true.
df = (r-1)(c-1)
Testing the null hypothesis
When items are classified according
to two or more criteria, it is often of interest to decide whether these
criteria act independently of one another.
For example, suppose we wish to classify defects found in wafers produced in a manufacturing plant, first according to the type of defect and, second, according to the production shift during which the wafers were produced. If the proportions of the various types of defects are constant from shift to shift, then classification by defect type is independent of classification by production shift. On the other hand, if the proportions of the various defects vary from shift to shift, then the classification by defect type depends upon, or is contingent upon, the shift classification, and the classifications are dependent.
In investigating whether one method of classification is contingent upon another, it is customary to display the data by using a cross-classification in an array consisting of r rows and c columns called a contingency table. A contingency table consists of r x c cells representing the r x c possible outcomes in the classification process.
Let us construct an industrial case: A total of 309 wafer defects were recorded, and the defects were classified as being one of four types, A, B, C, or D. At the same time, each wafer was identified according to the production shift in which it was manufactured, 1, 2, or 3. These counts are presented in the following table.
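Let pA be the probability that a defect will be of type A. Likewise, define pB, pC, and pD as the probabilities of observing the other three types of defects. These probabilities, which are called the column probabilities, will satisfy the requirement
$$p_A + p_B + p_C + p_D = 1.$$
By the same token, let pi (i = 1, 2, or 3) be the row probability that a defect will have occurred during shift i, where
$$p_1 + p_2 + p_3 = 1.$$
If the two classifications are independent of each other, each cell probability equals the product of its respective row and column probabilities. For example, the probability that a particular defect will occur in shift 1 and be of type A is (p1)(pA). While the numerical values of the cell probabilities are unspecified, the null hypothesis states that each cell probability will equal the product of its respective row and column probabilities. This condition implies independence of the two classifications. The alternative hypothesis is that this equality does not hold for at least one cell.
In other words, we state the null hypothesis as H0: the two classifications are independent, while the alternative hypothesis is Ha: the classifications are dependent.
To obtain the observed column probability, divide the column total by the grand total, n. Denoting the total of column j as cj, we get
$$\hat{p}_j = \frac{c_j}{n}, \qquad j = A, B, C, D.$$
Similarly, the row probabilities p1, p2, and p3 are estimated by dividing the row totals r1, r2, and r3 by the grand total n:
$$\hat{p}_i = \frac{r_i}{n}, \qquad i = 1, 2, 3.$$
Denote the observed frequency of the cell in row i and column j of the contingency table by nij. Then, when H0 is true, the estimated expected cell frequency is
$$\hat{E}(n_{ij}) = n \, \hat{p}_i \, \hat{p}_j = \frac{r_i \, c_j}{n}.$$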
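In other words, when the row and column classifications are independent, the estimated expected value of the observed cell frequency nij in an r x c contingency table is equal to the product of its respective row and column totals divided by the total frequency.
From here we use the expected and observed frequencies shown in the table to calculate the value of the test statistic
$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{\left(n_{ij} - \hat{E}(n_{ij})\right)^2}{\hat{E}(n_{ij})} = 19.18,$$
where $\hat{E}(n_{ij})$ denotes the estimated expected frequency of the cell in row i and column j.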
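The expected-frequency and test-statistic calculations described above can be sketched in Python as follows. This is a minimal illustration, not the handbook's own procedure: the counts are hypothetical values for a small 2 x 3 table (they are not the wafer data of this section), and the function names `expected_frequencies`, `chi_square_statistic`, and `degrees_of_freedom` are made up for the example.

```python
def expected_frequencies(observed):
    """Estimated expected cell counts under H0: (row total * column total) / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_statistic(observed):
    """Sum over all cells of (observed - expected)^2 / expected."""
    expected = expected_frequencies(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


def degrees_of_freedom(observed):
    """(r - 1)(c - 1) for a table with r rows and c columns."""
    return (len(observed) - 1) * (len(observed[0]) - 1)


# Hypothetical counts (an assumption for illustration, not the section's data).
observed = [
    [10, 20, 30],
    [20, 20, 20],
]

expected = expected_frequencies(observed)
statistic = chi_square_statistic(observed)
df = degrees_of_freedom(observed)

print(expected)
print(round(statistic, 4), df)
```

Passing the table of observed wafer-defect counts (3 rows for shifts, 4 columns for defect types) to the same functions would give the expected frequencies and the test statistic for the industrial example.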
The number of degrees of freedom associated with a contingency table consisting of r rows and c columns is (r-1)(c-1). So for our example we have (3-1)(4-1) = 6 d.f.
In order to test the null hypothesis, we compare the test statistic with the critical value of $\chi^2$ at a selected value of $\alpha$. Let us use $\alpha = 0.05$. Then the critical value is $\chi^2_{0.05;\,6}$, which is 12.5916 (see the chi-square distribution critical value table in Chapter 1). Since the test statistic of 19.18 exceeds the critical value, we reject the null hypothesis and conclude that there is significant evidence that the proportions of the different defect types vary from shift to shift. In this case, the p-value of the test statistic is 0.00387.
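As a rough check on the decision step, the upper-tail probability of the chi-square distribution can be computed directly. The sketch below uses only the Python standard library and relies on a closed form that holds only when the degrees of freedom are a positive even integer, which is the case here (6 d.f.). The function name `chi2_upper_tail_even_df` is hypothetical. The inputs 19.18, 12.5916, and the 3 x 4 table shape are the values quoted in the text.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable with an even number of d.f.

    Uses the closed form exp(-x/2) * sum_{i < df/2} (x/2)^i / i!,
    which is valid only when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    terms = (half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * sum(terms)


statistic = 19.18            # test statistic quoted in the text
df = (3 - 1) * (4 - 1)       # 3 shifts, 4 defect types
critical_value = 12.5916     # tabulated critical value for alpha = 0.05, 6 d.f.

p_value = chi2_upper_tail_even_df(statistic, df)
tail_at_critical = chi2_upper_tail_even_df(critical_value, df)
reject_h0 = statistic > critical_value

print(p_value, tail_at_critical, reject_h0)
```

For odd degrees of freedom this closed form does not apply, and a statistical library or a chi-square table would be needed instead.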