1. Exploratory Data Analysis
1.3. EDA Techniques
1.3.3. Graphical Techniques: Alphabetic

1.3.3.5.

Box-Cox Linearity Plot

Purpose:
Find transformation to maximize correlation between two variables
When performing a linear fit of Y against X, an appropriate transformation of X can often significantly improve the fit. The Box-Cox transformation (Box and Cox, 1964) is a particularly useful family of transformations. It is defined as:

$$T(X) = \frac{X^{\lambda} - 1}{\lambda}$$

where X is the variable being transformed and λ is the transformation parameter. For λ = 0, the natural log of the data is taken instead of using the above formula.
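
As a minimal sketch of the transformation defined above, the following Python function applies it to an array with NumPy. The function name `box_cox` and the check for strictly positive data are assumptions added here for illustration; they are not part of the handbook text.

```python
import numpy as np

def box_cox(x, lam):
    """Box-Cox transform of x for transformation parameter lam."""
    x = np.asarray(x, dtype=float)
    # Assumption: restrict to positive data so that the log and
    # fractional powers are defined.
    if np.any(x <= 0):
        raise ValueError("Box-Cox transform requires strictly positive data")
    if lam == 0:
        return np.log(x)          # special case: natural log
    return (x**lam - 1.0) / lam   # general case

x = np.array([1.0, 2.0, 4.0])
print(box_cox(x, 2.0))  # (x**2 - 1) / 2
print(box_cox(x, 0.0))  # log(x)
```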

The Box-Cox linearity plot is a plot of the correlation between Y and the transformed X for given values of λ. The value of λ corresponding to the maximum correlation (or minimum for negative correlation) on the plot is then the optimal choice for λ.
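
The search described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the handbook's or Dataplot's implementation: the helper names (`box_cox`, `linearity_curve`), the grid of λ values from -2 to 2, and the synthetic noise-free data are all hypothetical choices. Picking the largest absolute correlation is one way to cover both the maximum and the negative-correlation case.

```python
import numpy as np

def box_cox(x, lam):
    """Box-Cox transform of positive data x (natural log when lam == 0)."""
    x = np.asarray(x, dtype=float)
    return np.log(x) if lam == 0 else (x**lam - 1.0) / lam

def linearity_curve(x, y, lambdas):
    """Correlation between y and the transformed x for each lambda."""
    return np.array([np.corrcoef(box_cox(x, lam), y)[0, 1] for lam in lambdas])

# Synthetic data for illustration only: y is exactly quadratic in x.
x = np.linspace(1.0, 10.0, 50)
y = 3.0 + 2.0 * x**2

# Hypothetical grid; rounding keeps the middle value exactly 0.
lambdas = np.round(np.linspace(-2.0, 2.0, 41), 1)
r = linearity_curve(x, y, lambdas)

# Largest |r| handles both positive and negative correlation.
best_lambda = lambdas[np.argmax(np.abs(r))]
print(best_lambda)
```

Plotting `r` against `lambdas` gives the Box-Cox linearity plot itself; the plotting call is omitted here to keep the sketch free of extra dependencies.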

Sample Plot
[Figure: sample Box-Cox linearity plot]

The plot of the original data with the predicted values from a linear fit indicates that a quadratic fit might be preferable. The Box-Cox linearity plot shows an optimal value of λ = 2.0. The plot of the transformed data with the predicted values from a linear fit to the transformed data shows a better fit (verified by the significant reduction in the residual standard deviation).
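
The residual standard deviation comparison mentioned above can be checked numerically. The sketch below uses synthetic data generated for illustration (it is not the data behind the sample plot), a hypothetical helper `residual_sd`, and a fixed random seed; λ = 2.0 is taken from the text.

```python
import numpy as np

def residual_sd(u, y):
    """Residual standard deviation of the least-squares line y ~ a + b*u."""
    slope, intercept = np.polyfit(u, y, 1)
    resid = y - (intercept + slope * u)
    return np.sqrt(np.sum(resid**2) / (len(y) - 2))

# Synthetic data for illustration only: quadratic trend plus noise.
rng = np.random.default_rng(0)
x = np.linspace(1.0, 10.0, 50)
y = 1.0 + 0.5 * x**2 + rng.normal(scale=1.0, size=x.size)

lam = 2.0
x_t = (x**lam - 1.0) / lam  # Box-Cox transform of x with lambda = 2

sd_raw = residual_sd(x, y)            # linear fit on the original X
sd_transformed = residual_sd(x_t, y)  # linear fit on the transformed X
print(sd_raw, sd_transformed)
```

A smaller residual standard deviation for the transformed fit is the numerical counterpart of the visual improvement described in the text.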

Definition
Box-Cox linearity plots are formed by
  • Vertical axis: Correlation coefficient between the transformed X and Y
  • Horizontal axis: Value of λ
Questions
The Box-Cox linearity plot can provide answers to the following questions:
  1. Would a suitable transformation improve my fit?
  2. What is the optimal value of the transformation parameter?
Importance:
Find a suitable transformation
Transformations can often significantly improve a fit. The Box-Cox linearity plot provides a convenient way to find a suitable transformation without engaging in a lot of trial-and-error fitting.
Related Techniques
  • Linear Regression
  • Box-Cox Normality Plot
Case Study
The Box-Cox linearity plot is demonstrated in the Alaska pipeline data case study.
Software
Box-Cox linearity plots are not a standard part of most general-purpose statistical software programs. However, the underlying technique requires only a transformation and the computation of a correlation coefficient, so if a statistical program supports these capabilities, writing a macro for a Box-Cox linearity plot should be feasible. Dataplot supports a Box-Cox linearity plot directly.