In the previous topic, we looked at some measures of central location of a data distribution. We will now look at how to measure the variability present in a data set.
An important part of the descriptive study of a data set is determining the variability, or dispersion, of the data relative to the sample's measure of central location.
Because the mean is the most important measure of location, it is the one used to define the main measure of dispersion: the variance, defined below.
Variance is defined as the measure obtained by summing the squares of the deviations of the sample observations from their mean and dividing by the number of observations in the sample minus one.
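The verbal definition above can be written as a formula. The notation is an assumption made here for illustration, not taken from the original text: x_1, ..., x_n are the n sample observations, \bar{x} is their mean, and s^2 is the sample variance.

```latex
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i ,
\qquad
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2
```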
Since the variance involves a sum of squares, it is expressed in the square of the data's unit rather than in the unit itself (for example, cm² for data measured in cm). Thus, to obtain a measure of variability or dispersion with the same units as the data, we take the square root of the variance and obtain the standard deviation.
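As a minimal sketch, the two definitions can be computed directly in Python. The function names `sample_variance` and `sample_std` and the data values are hypothetical, chosen only for illustration; the results are compared with the standard library's `statistics.variance` and `statistics.stdev`, which also divide by the number of observations minus one.

```python
import math
import statistics


def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


def sample_std(data):
    """Square root of the sample variance; same unit as the data."""
    return math.sqrt(sample_variance(data))


# Hypothetical sample, e.g. measurements in cm.
data = [2, 4, 4, 4, 5, 5, 7, 9]

variance = sample_variance(data)   # expressed in cm squared
std_dev = sample_std(data)         # expressed in cm

# The hand-written versions should agree with the standard library.
print(variance, statistics.variance(data))
print(std_dev, statistics.stdev(data))
```

For this sample the mean is 5 and the squared deviations sum to 32, so the variance is 32/7 and the standard deviation is its square root.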
The standard deviation can only assume nonnegative values, and it equals zero only when all observations are identical. The greater the variability among the data, the larger the standard deviation.
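To illustrate, the short sketch below compares three hypothetical samples: one with no spread and two that share the same mean but differ in spread. The data values and variable names are assumptions made for this example only.

```python
import statistics

# Hypothetical samples; tight and wide both have mean 10.
constant = [10, 10, 10, 10]   # no variability at all
tight = [9, 10, 10, 11]       # values close to the mean
wide = [2, 6, 14, 18]         # values far from the mean

sd_constant = statistics.stdev(constant)
sd_tight = statistics.stdev(tight)
sd_wide = statistics.stdev(wide)

# The standard deviation grows as the data spread out around the same mean.
print(statistics.mean(tight), statistics.mean(wide))
print(sd_constant, sd_tight, sd_wide)
```

The mean alone cannot distinguish `tight` from `wide`; the standard deviation can.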