| Scatter plot showing heteroscedastic variability |
| Discussion |
This scatter plot reveals an approximate linear relationship
between X and Y, but more importantly, it reveals a statistical
condition referred to as heteroscedasticity (that is,
nonconstant variation). For a heteroscedastic data set, the
vertical variation in Y differs depending on the value of X. In
this example, small values of X yield small scatter in Y while
large values of X result in large scatter in Y.
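
The pattern described above can be reproduced with simulated data. The following is a minimal sketch, not the data behind the plot: it assumes a straight-line relationship whose noise standard deviation is proportional to X. The function names, the intercept, slope, and noise scale are all hypothetical choices made for illustration.

```python
import random
import statistics

def simulate_heteroscedastic(n=200, intercept=1.0, slope=2.0, noise_scale=0.5, seed=0):
    """Return (xs, ys) where the standard deviation of Y grows in proportion to X.

    Assumed model (illustrative only): Y = intercept + slope*X + e,
    with e drawn from a normal distribution with SD = noise_scale * X.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(1.0, 10.0) for _ in range(n)]
    ys = [intercept + slope * x + rng.gauss(0.0, noise_scale * x) for x in xs]
    return xs, ys

def residual_spread_by_half(xs, ys, intercept, slope):
    """Standard deviation of the residuals about the given line,
    computed separately for the low-X half and the high-X half."""
    pairs = sorted(zip(xs, ys))  # sort points by X
    residuals = [y - (intercept + slope * x) for x, y in pairs]
    mid = len(residuals) // 2
    return statistics.stdev(residuals[:mid]), statistics.stdev(residuals[mid:])

xs, ys = simulate_heteroscedastic()
low_sd, high_sd = residual_spread_by_half(xs, ys, intercept=1.0, slope=2.0)
print(f"residual SD for small X: {low_sd:.2f}")
print(f"residual SD for large X: {high_sd:.2f}")
```

Under this assumed model the vertical scatter for large X should come out clearly larger than for small X, which is the fan shape a scatter plot of heteroscedastic data displays.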
Heteroscedasticity complicates the analysis somewhat, but its effects can be overcome, for example by proper weighting of the data.
| Impact of ignoring unequal variability in the data | Fortunately, unweighted regression analyses on heteroscedastic data produce unbiased estimates of the coefficients. However, the estimates will not be as precise as they would be with proper weighting. It is worth noting that weighting is only recommended if the weights are known or if there is sufficient reason for assuming that they are of a certain form; for example, it may be known that a process varies proportionately or inversely with X. | ||
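
The claims about bias and precision can be checked with a small simulation. This is a sketch under stated assumptions rather than a definitive procedure: the noise standard deviation is taken to be proportional to X, so the weights are assumed known to have the form 1/X². The helper names and all numeric settings are hypothetical.

```python
import random
import statistics

def weighted_line_fit(xs, ys, ws):
    """Weighted least-squares fit of y = a + b*x; returns (a, b).

    Passing equal weights gives the ordinary (unweighted) fit.
    """
    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw  # weighted mean of x
    ybar = sum(w * y for w, y in zip(ws, ys)) / sw  # weighted mean of y
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def compare_slopes(n_trials=1000, n_points=40, true_slope=2.0, seed=1):
    """Fit many simulated data sets with and without weights.

    Assumed model (illustrative only): noise SD = 0.5 * x,
    so the variance is proportional to x**2 and the weights are 1 / x**2.
    """
    rng = random.Random(seed)
    xs = [1.0 + 9.0 * i / (n_points - 1) for i in range(n_points)]
    equal_weights = [1.0] * n_points
    inverse_variance_weights = [1.0 / x ** 2 for x in xs]
    unweighted, weighted = [], []
    for _ in range(n_trials):
        ys = [1.0 + true_slope * x + rng.gauss(0.0, 0.5 * x) for x in xs]
        unweighted.append(weighted_line_fit(xs, ys, equal_weights)[1])
        weighted.append(weighted_line_fit(xs, ys, inverse_variance_weights)[1])
    return unweighted, weighted

unweighted, weighted = compare_slopes()
print(f"unweighted slope: mean {statistics.mean(unweighted):.3f}, "
      f"SD {statistics.stdev(unweighted):.3f}")
print(f"weighted slope:   mean {statistics.mean(weighted):.3f}, "
      f"SD {statistics.stdev(weighted):.3f}")
```

Both sets of slope estimates should average close to the true slope, while the weighted estimates should show the smaller spread across trials. The benefit depends on the assumed weights matching the actual variance structure, which is why weighting is recommended only when the weights are known or their form can be justified.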