|
Purpose: Find transformation to maximize correlation between two variables |
When performing a linear fit of Y against X, an appropriate
transformation of X can often significantly improve the fit.
The Box-Cox transformation
(Box and Cox, 1964)
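is a particularly useful family of transformations. It is defined as:
\[ T(X) = \frac{X^{\lambda} - 1}{\lambda} \]
where X is the variable being transformed and $\lambda$ is the transformation parameter.
For $\lambda$ = 0, the natural log of the data is taken
instead of using the above formula.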
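The log case is not an arbitrary convention: it is the limiting value of the power formula $(X^{\lambda} - 1)/\lambda$ as $\lambda$ approaches zero. A short derivation, assuming X > 0 and writing $X^{\lambda} = e^{\lambda \ln X}$:

```latex
\[
\lim_{\lambda \to 0} \frac{X^{\lambda} - 1}{\lambda}
  = \lim_{\lambda \to 0} \frac{e^{\lambda \ln X} - 1}{\lambda}
  = \lim_{\lambda \to 0}
    \frac{\lambda \ln X + \tfrac{1}{2}(\lambda \ln X)^{2} + \cdots}{\lambda}
  = \ln X .
\]
```

This is why the family varies smoothly through $\lambda$ = 0 rather than jumping there.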
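The Box-Cox linearity plot is a plot of the correlation between
Y and the transformed X for given values of $\lambda$.
The value of $\lambda$ at which the correlation is largest in magnitude is the preferred choice.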
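The computation behind the plot can be sketched in a few lines of plain Python. This is a minimal illustration, not the handbook's or Dataplot's implementation: the helper names `box_cox`, `pearson`, and `linearity_curve` are hypothetical, the data are synthetic (Y exactly quadratic in X), and choosing the largest correlation in absolute value is an assumption made so that negatively correlated data are handled too.

```python
import math

def box_cox(x, lam):
    """Box-Cox transform of one positive value; natural log when lam is 0."""
    if x <= 0:
        raise ValueError("Box-Cox transformation requires positive data")
    if abs(lam) < 1e-12:
        return math.log(x)
    return (x ** lam - 1.0) / lam

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)

def linearity_curve(x, y, lambdas):
    """Return (lam, r) pairs: the points of a Box-Cox linearity plot."""
    return [(lam, pearson([box_cox(xi, lam) for xi in x], y))
            for lam in lambdas]

# Synthetic illustration: y is an exact quadratic function of x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [3.0 + 0.5 * xi ** 2 for xi in x]

# Grid of transformation parameters from -2.0 to 2.0 in steps of 0.1.
lambdas = [round(-2.0 + 0.1 * k, 1) for k in range(41)]

curve = linearity_curve(x, y, lambdas)
best_lam, best_r = max(curve, key=lambda point: abs(point[1]))
print(best_lam, best_r)
```

Plotting the second element of each pair in `curve` against the first gives the linearity plot itself. For these synthetic data the peak falls at a parameter value of 2, as expected for a quadratic relationship.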
|
||
| Sample Plot |
The plot of the original data with the predicted values from a linear
fit indicates that a quadratic fit might be preferable.
The Box-Cox linearity plot shows a value of
|
||
| Definition |
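Box-Cox linearity plots are formed by:
Vertical axis: Correlation coefficient between Y and the transformed X
Horizontal axis: Value of $\lambda$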
|
||
| Questions |
The Box-Cox linearity plot can provide answers to the following
questions:
|
||
|
Importance: Find a suitable transformation |
Transformations can often significantly improve a fit. The Box-Cox linearity plot provides a convenient way to find a suitable transformation without engaging in a lot of trial-and-error fitting. | ||
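One way to check whether a transformation has improved a fit is to compare the residual standard deviation of a straight-line fit before and after transforming X. The sketch below assumes hypothetical data (a quadratic trend with small fixed perturbations) and a transformation parameter of 2 chosen for that data; `fit_line` is an illustrative helper, not part of any particular package.

```python
import math

def fit_line(u, y):
    """Ordinary least squares for y = a + b*u.

    Returns (a, b, residual standard deviation with n - 2 degrees of freedom).
    """
    n = len(u)
    mean_u, mean_y = sum(u) / n, sum(y) / n
    b = (sum((ui - mean_u) * (yi - mean_y) for ui, yi in zip(u, y))
         / sum((ui - mean_u) ** 2 for ui in u))
    a = mean_y - b * mean_u
    sse = sum((yi - (a + b * ui)) ** 2 for ui, yi in zip(u, y))
    return a, b, math.sqrt(sse / (n - 2))

# Hypothetical data: quadratic trend plus small fixed perturbations.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.4, -0.3, -0.1]
y = [3.0 + 0.5 * xi ** 2 + e for xi, e in zip(x, noise)]

# Box-Cox transformation of X with an assumed parameter of 2.
lam = 2.0
x_t = [(xi ** lam - 1.0) / lam for xi in x]

_, _, sd_raw = fit_line(x, y)    # straight-line fit on the original X
_, _, sd_t = fit_line(x_t, y)    # straight-line fit on the transformed X
print(sd_raw, sd_t)
```

A markedly smaller residual standard deviation for the transformed fit indicates that the transformation has captured curvature that the straight line in the original X could not.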
| Related Techniques |
Linear Regression
Box-Cox Normality Plot |
||
| Case Study | The Box-Cox linearity plot is demonstrated in the Alaska pipeline data case study. | ||
| Software | Box-Cox linearity plots are not a standard part of most general-purpose statistical software programs. However, the underlying technique consists of applying a transformation and computing a correlation coefficient, so if a statistical program supports these capabilities, writing a macro for a Box-Cox linearity plot should be feasible. Dataplot supports a Box-Cox linearity plot directly. | ||