1. Exploratory Data Analysis
1.3. EDA Techniques
1.3.3. Graphical Techniques: Alphabetic
1.3.3.26. Scatter Plot

1.3.3.26.10. Scatter Plot: Outlier

[Figure: Scatter plot showing outliers]
Discussion
The scatter plot here reveals
  1. a basic linear relationship between X and Y for most of the data, and
  2. a single outlier (at X = 375).
An outlier is defined as a data point that comes from a different model than the rest of the data. The data here appear to come from a linear model with a given slope and variation, except for the outlier, which appears to have been generated from some other model.

Outlier detection is important for effective modeling, and outliers should be excluded from the model fit. If all the data here are included in a linear regression, the resulting fitted model will be poor virtually everywhere. If the outlier is omitted from the fitting process, the resulting fit will be excellent almost everywhere (that is, for all points except the outlying point).
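The effect described above can be illustrated with a minimal sketch. The data below are synthetic and purely hypothetical (they are not the data set behind the plot): a linear trend with small noise, plus one point near X = 375 shifted far off the line. The slope, intercept, noise level, and size of the shift are all assumptions chosen for illustration, and ordinary least squares via `numpy.polyfit` stands in for whatever regression procedure is actually used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: linear model y = 2 + 0.5 * x with small Gaussian noise.
TRUE_SLOPE, TRUE_INTERCEPT = 0.5, 2.0
x = np.arange(0, 501, 25, dtype=float)
y = TRUE_INTERCEPT + TRUE_SLOPE * x + rng.normal(0.0, 5.0, x.size)

# Move the point at X = 375 far off the line to act as the single outlier.
outlier_idx = int(np.argmin(np.abs(x - 375)))
y[outlier_idx] += 400.0

# Boolean mask selecting every point except the outlier.
keep = np.ones(x.size, dtype=bool)
keep[outlier_idx] = False


def fit_line(xs, ys):
    """Ordinary least-squares straight-line fit; returns (slope, intercept)."""
    slope, intercept = np.polyfit(xs, ys, 1)
    return slope, intercept


def rms_residual(xs, ys, slope, intercept):
    """Root-mean-square residual of the points about the given line."""
    return float(np.sqrt(np.mean((ys - (slope * xs + intercept)) ** 2)))


# Fit once with all points, once with the outlier omitted.
slope_all, intercept_all = fit_line(x, y)
slope_clean, intercept_clean = fit_line(x[keep], y[keep])

# Judge both fits on the non-outlying points only.
rms_all = rms_residual(x[keep], y[keep], slope_all, intercept_all)
rms_clean = rms_residual(x[keep], y[keep], slope_clean, intercept_clean)

print(f"all points:      slope={slope_all:.3f}, RMS residual={rms_all:.2f}")
print(f"outlier omitted: slope={slope_clean:.3f}, RMS residual={rms_clean:.2f}")
```

Both fits are scored on the non-outlying points, which is the sense in which the text calls the first fit "poor virtually everywhere": a single distant point pulls the whole line away from the bulk of the data.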
