1. Exploratory Data Analysis
1.3. EDA Techniques
1.3.3. Graphical Techniques: Alphabetic
1.3.3.14. Histogram

1.3.3.14.5. Histogram Interpretation: Bimodal Mixture of 2 Normals

Histogram from Mixture of 2 Normal Distributions

[Figure: histogram from mixture of 2 normal distributions]
Discussion of Unimodal and Bimodal

The histogram shown above illustrates data from a bimodal (2 peak) distribution.

In contrast to the previous example, the bimodality here is due not to an underlying deterministic model but to a mixture of probability models. In this case, each of the modes appears to have a roughly bell-shaped component. One could easily imagine the above histogram being generated by a process consisting of two normal distributions with the same standard deviation but with two different locations (one centered at approximately 9.17 and the other centered at approximately 9.26). If this is the case, then the research challenge is to determine physically why there are two similar but separate sub-processes.
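As a rough illustration of such a process, the following Python sketch draws a sample from two normal components centered at 9.17 and 9.26. The common standard deviation of 0.02, the equal mixing proportion, the sample size, and the names simulate_mixture and count_near are assumptions chosen for illustration; none of them is taken from the data behind the histogram.

```python
import random

def simulate_mixture(n, u1=9.17, u2=9.26, sd=0.02, p=0.5, seed=0):
    """Draw n values from p*N(u1, sd) + (1 - p)*N(u2, sd).

    sd, p, and seed are illustrative assumptions, not handbook values.
    """
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        # Pick the sub-process first, then draw from its normal distribution.
        center = u1 if rng.random() < p else u2
        sample.append(rng.gauss(center, sd))
    return sample

def count_near(sample, center, half_width=0.005):
    """Count values within half_width of center (a one-bin histogram)."""
    return sum(1 for v in sample if abs(v - center) < half_width)

sample = simulate_mixture(2000)
# Counts near each assumed mode and near the midpoint between them.
for center in (9.17, 9.215, 9.26):
    print(center, count_near(sample, center))
```

With components this well separated relative to the assumed scale, the counts near the two centers should clearly exceed the count near the midpoint, which is the two-peak pattern seen in the histogram.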

Recommended Next Steps

If the histogram indicates that the data might be appropriately fit with the mixture of two normal distributions, the recommended next step is:

Fit the normal mixture model using either least squares or maximum likelihood. The general normal mixing model is

$$M = p\,\phi_1 + (1 - p)\,\phi_2$$

where $p$ is the mixing proportion (between 0 and 1) and $\phi_1$ and $\phi_2$ are normal probability density functions with location and scale parameters $\mu_1$, $\sigma_1$, $\mu_2$, and $\sigma_2$, respectively. That is, there are 5 parameters to estimate in the fit.
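The mixing model can be written as a short Python function for readers who want to evaluate it numerically. This is a minimal sketch; the names normal_pdf and mixture_pdf are illustrative, and the argument order simply mirrors the NORMXPDF call used later in this section.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal probability density with location mu and scale sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, u1, sd1, u2, sd2, p):
    """Mixture density p*phi1 + (1 - p)*phi2, with p between 0 and 1."""
    return p * normal_pdf(x, u1, sd1) + (1.0 - p) * normal_pdf(x, u2, sd2)

# Evaluate the density at the midpoint of two assumed components.
# The scale 0.02 is an illustrative assumption, not a handbook value.
print(mixture_pdf(9.215, 9.17, 0.02, 9.26, 0.02, 0.5))
```

Setting p to 1 or 0 reduces the mixture to a single normal density, which is a quick sanity check on any implementation.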

Whether maximum likelihood or least squares is used, the quality of the fit is sensitive to the choice of starting values. For the mixture of two normals, the histogram can be used to provide initial estimates for the location and scale parameters of the two normal distributions.
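One informal way to turn the histogram into starting values is to pick a cut point in the valley between the two peaks and summarize the data on each side. The sketch below assumes that approach; the function name starting_values, the cut point, and the toy data are hypothetical, and reading values off the histogram by eye serves the same purpose.

```python
import statistics

def starting_values(data, cut):
    """Rough starting values from a cut point chosen in the histogram valley.

    Returns (u1, sd1, u2, sd2, p). Assumes at least two points on each side.
    """
    low = [v for v in data if v < cut]
    high = [v for v in data if v >= cut]
    u1, sd1 = statistics.mean(low), statistics.stdev(low)
    u2, sd2 = statistics.mean(high), statistics.stdev(high)
    p = len(low) / len(data)  # share of points attributed to component 1
    return u1, sd1, u2, sd2, p

# Tiny made-up data set, for illustration only.
toy = [9.15, 9.17, 9.19, 9.24, 9.26, 9.28]
print(starting_values(toy, cut=9.215))
```

Because each side is truncated at the cut, the scale estimates tend to be somewhat too small when the two components overlap. That is usually acceptable for starting values, since the fit itself refines them.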

Dataplot can generate a least squares fit of the mixture of two normals with the following sequence of commands:

    RELATIVE HISTOGRAM Y
    LET Y2 = YPLOT
    LET X2 = XPLOT
    RETAIN Y2 X2 SUBSET TAGPLOT = 1
    LET U1 = <estimated value from histogram>
    LET SD1 = <estimated value from histogram>
    LET U2 = <estimated value from histogram>
    LET SD2 = <estimated value from histogram>
    LET P = 0.5
    FIT Y2 = NORMXPDF(X2,U1,SD1,U2,SD2,P)
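For readers without Dataplot, the maximum likelihood option can be sketched in Python. Note that this swaps in the EM algorithm, which is a different technique from the least squares FIT above. The function name fit_mixture_em, the iteration count, the simulated data, and the scale of 0.02 are all assumptions made for illustration.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    """Normal probability density with location mu and scale sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def fit_mixture_em(data, u1, sd1, u2, sd2, p=0.5, n_iter=50):
    """Maximum likelihood fit of p*N(u1, sd1) + (1 - p)*N(u2, sd2) via EM.

    The arguments after data are starting values. No safeguards are included
    against a collapsing component or points far from both components.
    """
    n = len(data)
    for _ in range(n_iter):
        # E-step: probability that each point came from component 1.
        r = []
        for x in data:
            a = p * normal_pdf(x, u1, sd1)
            b = (1.0 - p) * normal_pdf(x, u2, sd2)
            r.append(a / (a + b))
        # M-step: weighted means, standard deviations, and mixing proportion.
        w1 = sum(r)
        w2 = n - w1
        u1 = sum(ri * x for ri, x in zip(r, data)) / w1
        u2 = sum((1.0 - ri) * x for ri, x in zip(r, data)) / w2
        sd1 = math.sqrt(sum(ri * (x - u1) ** 2 for ri, x in zip(r, data)) / w1)
        sd2 = math.sqrt(sum((1.0 - ri) * (x - u2) ** 2 for ri, x in zip(r, data)) / w2)
        p = w1 / n
    return u1, sd1, u2, sd2, p

# Simulated data; the scale 0.02 and the sample sizes are assumptions.
rng = random.Random(1)
y = [rng.gauss(9.17, 0.02) for _ in range(1000)]
y += [rng.gauss(9.26, 0.02) for _ in range(1000)]

# Starting values as they might be read off a histogram.
print(fit_mixture_em(y, u1=9.16, sd1=0.03, u2=9.27, sd2=0.03, p=0.5))
```

The function returns the five estimates in the order (u1, sd1, u2, sd2, p). As with the Dataplot fit, poor starting values can lead to a poor solution, so it is worth comparing the fitted density against the histogram.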