In descriptive statistics, the seven-number summary is a collection of seven summary statistics, and is a modification or extension of the five-number summary. There are two common forms.
As with the five-number summary, it can be represented by a modified box plot, adding hatch-marks for two of the additional numbers.
The parametric form consists of seven percentiles chosen with a normally distributed model in mind: the 2nd, 9th, 25th, 50th, 75th, 91st, and 98th percentiles.
The middle three values – the lower quartile, median, and upper quartile – are the usual statistics from the five-number summary and are the standard values for the box in a box plot.
The two unusual percentiles at each end are used because the locations of all seven values will be approximately equally spaced if the data is normally distributed. Some statistical tests require normally distributed data, so the plotted values provide a convenient visual check on the validity of later tests: one simply scans to see whether the locations of those seven percentiles appear to be equally spaced.
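The spacing claim can be checked numerically. The following is a minimal sketch, assuming Python 3.8 or later and its standard `statistics` module; the helper names `normal_positions` and `gaps` are illustrative and not taken from any source. It computes where each of the seven percentiles falls on a standard normal distribution and the distance between neighbouring values.

```python
from statistics import NormalDist

# Percentiles of the parametric seven-number summary, as given in the text.
PERCENTILES = (2, 9, 25, 50, 75, 91, 98)


def normal_positions(percentiles=PERCENTILES):
    """Return the standard-normal z-score at each percentile."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    return [std_normal.inv_cdf(p / 100) for p in percentiles]


def gaps(values):
    """Return the differences between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]


z = normal_positions()
spacing = gaps(z)
print([round(v, 3) for v in z])
print([round(g, 3) for g in spacing])
```

Under these assumptions every gap comes out at roughly 0.67 to 0.71 standard deviations, so the seven values are close to, though not exactly, evenly spaced for normal data.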
Notice that whereas the five-number summary makes no assumptions about the distribution of the data, the (parametric) seven-number summary is based on the normal distribution, and is not especially appropriate when normal data is not expected. However, the non-parametric seven-number summary, discussed below, makes no such assumption.
The values can be represented using a modified box plot. The 2nd and 98th percentiles are represented by the ends of the whiskers, and hatch-marks across the whiskers mark the 9th and 91st percentiles.
Arthur Bowley used a set of non-parametric statistics, called a "seven-figure summary", including the extremes, deciles and quartiles, along with the median[1].
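Thus the numbers are the sample minimum, the first decile, the lower quartile, the median, the upper quartile, the ninth decile, and the sample maximum.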
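As an illustration, a summary built from the extremes, deciles, quartiles and median might be computed as in the sketch below. It assumes Python 3.8 or later and uses `statistics.quantiles` with the `"inclusive"` interpolation method. That choice of quantile definition is an assumption made here, not something specified by Bowley, and the function name `seven_figure_summary` is hypothetical.

```python
from statistics import quantiles


def seven_figure_summary(data):
    """Return (min, 1st decile, lower quartile, median,
    upper quartile, 9th decile, max) for a sequence of numbers."""
    if len(data) < 2:
        raise ValueError("need at least two data points")
    # quantiles(..., n=100) returns 99 cut points;
    # cuts[p - 1] is the p-th percentile.
    cuts = quantiles(data, n=100, method="inclusive")
    return (
        min(data),
        cuts[9],   # 10th percentile (first decile)
        cuts[24],  # 25th percentile (lower quartile)
        cuts[49],  # 50th percentile (median)
        cuts[74],  # 75th percentile (upper quartile)
        cuts[89],  # 90th percentile (ninth decile)
        max(data),
    )


sample = list(range(101))  # the integers 0 through 100
print(seven_figure_summary(sample))
```

Other quantile conventions give slightly different values on small samples, so results from different software may not agree exactly.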