Skewness and Kurtosis

Measures of central tendency and measures of dispersion tell us the average value of a dataset and how far its values typically deviate from that average. Our next task is to look at how the values are distributed and how symmetric or asymmetric that distribution is. Skewness and kurtosis are defined for quantitative variables and are usually not reported for categorical or qualitative variables.

Skewness

Skewness is a measure of how asymmetric the distribution of a dataset is. For example, a normal distribution has zero skewness because it is perfectly symmetric around its mean.

[Figure: normal distribution curve]

If the distribution is asymmetric, it is either positively skewed or negatively skewed, depending on how the values are distributed around the mean.

Positively skewed distribution

[Figure: positively skewed distribution]

As seen in the image above, a positively skewed distribution has most of its values concentrated on the left and a long tail towards the right.

Typically, Mean > Median > Mode for a positively skewed distribution

[Figure: mean, median, and mode in a positively skewed distribution]

Negatively skewed distribution

[Figure: negatively skewed distribution]

As seen in the image above, a negatively skewed distribution has most of its values concentrated on the right and a long tail towards the left.

Typically, Mean < Median < Mode for a negatively skewed distribution

[Figure: mean, median, and mode in a negatively skewed distribution]
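The ordering of mean, median, and mode can be checked on a small sample. The sketch below uses made-up numbers (not data from this article) and Python's standard `statistics` module; the helper name `center_summary` is only illustrative. The second sample is the mirror image of the first, so its tail points the other way.

```python
from statistics import mean, median, mode

def center_summary(values):
    """Return (mean, median, mode) for a list of numbers."""
    return mean(values), median(values), mode(values)

# Made-up sample with a long right tail (positively skewed).
right_tailed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]
# Mirror image of the same sample: long left tail (negatively skewed).
left_tailed = [15 - x for x in right_tailed]

pos = center_summary(right_tailed)
neg = center_summary(left_tailed)

print("positive skew (mean, median, mode):", pos)
print("negative skew (mean, median, mode):", neg)
```

In this sample the few large values pull the mean past the median, while the mode stays at the most common value. The ordering is a rule of thumb and can fail for some distributions, for example multimodal ones.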

Measures of skewness

A widely used formula for skewness is presented below.

[Figure: formula for skewness]
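One common choice for such a formula is the moment coefficient of skewness: the third central moment divided by the second central moment raised to the power 3/2. The sketch below assumes that population form (dividing by n), which may differ from the exact variant in the original figure. The function name `moment_skewness` and the sample values are illustrative only.

```python
from statistics import fmean

def moment_skewness(values):
    """Population moment coefficient of skewness: m3 / m2**1.5."""
    n = len(values)
    mu = fmean(values)
    m2 = sum((x - mu) ** 2 for x in values) / n  # second central moment (variance)
    m3 = sum((x - mu) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]
left_tailed = [15 - x for x in right_tailed]

print(moment_skewness(symmetric))     # zero for a symmetric sample
print(moment_skewness(right_tailed))  # positive: long right tail
print(moment_skewness(left_tailed))   # negative: long left tail
```

Statistical libraries often apply a small-sample bias correction to this quantity, so their results can differ slightly from the plain version above.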

There are several other measures of skewness, including the two below.

Pearson coefficient of skewness

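= (Mean – Mode) / Standard deviation ≈ 3(Mean – Median) / Standard deviation

The first form is Pearson's first (mode) coefficient and the second is Pearson's second (median) coefficient. The two are not identical; they are approximately equal for moderately skewed distributions, where the empirical relation Mean – Mode ≈ 3(Mean – Median) holds.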

Quartile coefficient of skewness

= (Q3 + Q1 – 2*median) / (Q3 – Q1)

Where Q1 is the 25th percentile and Q3 is the 75th percentile.
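Both coefficients are easy to compute with Python's standard `statistics` module. The sketch below uses Pearson's median-based form, 3(Mean – Median)/Standard deviation, and makes two assumptions that the text does not specify: the population standard deviation (`pstdev`) and the "inclusive" quartile method. Other conventions give slightly different numbers. The function names and sample values are illustrative.

```python
from statistics import fmean, median, pstdev, quantiles

def pearson_median_skewness(values):
    """3 * (mean - median) / standard deviation (population SD assumed)."""
    return 3 * (fmean(values) - median(values)) / pstdev(values)

def quartile_skewness(values):
    """(Q3 + Q1 - 2 * median) / (Q3 - Q1), using inclusive quartiles."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")  # q2 is the median
    return (q3 + q1 - 2 * q2) / (q3 - q1)

symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

print(pearson_median_skewness(symmetric), quartile_skewness(symmetric))
print(pearson_median_skewness(right_tailed), quartile_skewness(right_tailed))
```

The quartile coefficient always lies between -1 and 1 and depends only on the middle half of the data, so it is less sensitive to extreme values than the moment-based measure.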

Kurtosis

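Kurtosis describes how heavy the tails of a distribution are, and loosely how peaked it is, in comparison with the normal distribution. Flat distributions such as the uniform distribution have low kurtosis, while sharply peaked, heavy-tailed distributions such as the Laplace distribution have high kurtosis. The Cauchy distribution is often cited as an extreme heavy-tailed case, but its moments are not finite, so its kurtosis is strictly undefined.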

Calculation of kurtosis

[Figure: formula for kurtosis]
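A common way to calculate kurtosis is the fourth central moment divided by the square of the second central moment. The sketch below assumes that population form, which may not match the exact variant in the original figure. Under this definition a normal distribution has kurtosis 3, so "excess kurtosis" subtracts 3 to make the normal distribution the zero reference. The function names and sample values are illustrative.

```python
from statistics import fmean

def moment_kurtosis(values):
    """Population kurtosis m4 / m2**2 (3 for a normal distribution)."""
    n = len(values)
    mu = fmean(values)
    m2 = sum((x - mu) ** 2 for x in values) / n  # second central moment (variance)
    m4 = sum((x - mu) ** 4 for x in values) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("kurtosis is undefined when all values are equal")
    return m4 / m2 ** 2

def excess_kurtosis(values):
    """Kurtosis relative to the normal distribution."""
    return moment_kurtosis(values) - 3

flat = [1, 2, 3, 4, 5]                       # evenly spread values
peaked = [-10, 0, 0, 0, 0, 0, 0, 0, 10]      # tight centre, two extreme values

print(moment_kurtosis(flat), excess_kurtosis(flat))      # below 3, negative excess
print(moment_kurtosis(peaked), excess_kurtosis(peaked))  # above 3, positive excess
```

The two extreme values in the second sample dominate the fourth moment, which is why kurtosis is usually read as a measure of tail weight.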