Hi readers! Today we will learn about the Z score. How it is calculated in excel and its interpretation.
If we want to define Z score in simple word then we can say How far is the observation from mean, in terms of standard deviation. So in this case we need to calculate the standard deviation first based on sample or population basis then we will able to calculate the Z score.
What is the use of Z score?
- To identify the outliers.
Formula of Z score = (Observed Value – Mean)/Standard deviation
= Error/Standard deviation
Let’s understand with an example. Data set of five readings of 50, 55 ,45, 50, 80 (n=5)
Mean = (50+55+45+50+80)/5 = 56
Now we need to calculate the standard deviation(σ) of the given data set.
We have to calculate variance first then standard deviation.
Variance (σ2 ) =1/N * ∑ (xi-Mean)2 ( N= Nos of reading)
Standard deviation (σ) = square root of variance = √σ2
Sl. No (n) | Data Set(Xi) | Mean of data set (µ ) | (Xi-µ) | (Xi-µ)^2 | Sum of (Xi-µ)^2 /n = (Variance (σ2 ) | Square root variance = (Standard deviation σ) | Z score = Error/Standard deviation |
X1 | 50 | 56.0 | -6.00 | 36.00 | 154 | 12.4 | -0.48 |
X2 | 55 | 56.0 | -1.00 | 1.00 | -0.08 | ||
X3 | 45 | 56.0 | -11.00 | 121.00 | -0.89 | ||
X4 | 50 | 56.0 | -6.00 | 36.00 | -0.48 | ||
X5 | 80 | 56.0 | 24.00 | 576.00 | 1.93 | ||
Sum | 770.00 |
Z score is calculated in the above table. Suppose organization has declared z score is in between -1.75 to 1.75, so all points are within the defined range except reading point 80.
Let’s do calculation of Z score step by step.
Step 1– Calculate the mean of data set.
Sl No (A) | Data Set(Xi) (B) | Mean of data set =µ ( C) |
X1 | 50 | 56.0 |
X2 | 55 | 56.0 |
X3 | 45 | 56.0 |
X4 | 50 | 56.0 |
X5 | 80 | 56.0 |
Step -2 calculate the difference between Individual data set and mean.
Sl. No (A) | Data Set(Xi) (B) | Mean of data set =µ ( C) | (Xi – µ) D=(B-C) |
X1 | 50 | 56.0 | -6.0 |
X2 | 55 | 56.0 | -1.0 |
X3 | 45 | 56.0 | -11.0 |
X4 | 50 | 56.0 | -6.0 |
X5 | 80 | 56.0 | 24.0 |
Step – 3 Now square the difference (Xi -µ)^2 and do the sum.
Sl. No (A) | Data Set(Xi) (B) | Mean of data set =µ ( C) | (Xi – µ) D=(B-C) | (Xi – µ)^2 E= D^2 |
X1 | 50 | 56.0 | -6.0 | 36 |
X2 | 55 | 56.0 | -1.0 | 1 |
X3 | 45 | 56.0 | -11.0 | 121 |
X4 | 50 | 56.0 | -6.0 | 36 |
X5 | 80 | 56.0 | 24.0 | 576 |
Sum | 770 |
Step – 4 Calculate the variance
Sl No (A) | Data Set(Xi) (B) | Mean of data set =µ ( C) | (Xi – µ) D=(B-C) | (Xi – µ)^2 E= D^2 | Sum of (Xi-µ)^2 /n = (Variance (σ2 ) (F ) |
X1 | 50 | 56.0 | -6.0 | 36 | 154 |
X2 | 55 | 56.0 | -1.0 | 1 | |
X3 | 45 | 56.0 | -11.0 | 121 | |
X4 | 50 | 56.0 | -6.0 | 36 | |
X5 | 80 | 56.0 | 24.0 | 576 | |
Sum | 770 |
Step – 5 Calculate the standard deviation.
Sl. No (A) | Data Set(Xi) (B) | Mean of data set =µ ( C) | (Xi – µ) D=(B-C) | (Xi – µ)^2 E= D^2 | Sum of (Xi-µ)^2 /n = (Variance (σ2 ) (F ) | Square root variance = (Standard deviation σ) (G)= F^.5 |
X1 | 50 | 56.0 | -6.0 | 36 | 154 | 12.4 |
X2 | 55 | 56.0 | -1.0 | 1 | ||
X3 | 45 | 56.0 | -11.0 | 121 | ||
X4 | 50 | 56.0 | -6.0 | 36 | ||
X5 | 80 | 56.0 | 24.0 | 576 | ||
Sum | 770 |
Step – 6 then calculate the Z score.
Sl. No (A) | Data Set(Xi) (B) | Mean of data set =µ ( C) | (Xi – µ) D=(B-C) | (Xi – µ)^2 E= D^2 | Sum of (Xi-µ)^2 /n = (Variance (σ2 ) (F ) | Square root variance = (Standard deviation σ) (G)= F^.5 | Z score H = D/G |
X1 | 50 | 56.0 | -6.0 | 36 | 154 | 12.4 | -0.48349 |
X2 | 55 | 56.0 | -1.0 | 1 | -0.08058 | ||
X3 | 45 | 56.0 | -11.0 | 121 | -0.88641 | ||
X4 | 50 | 56.0 | -6.0 | 36 | -0.48349 | ||
X5 | 80 | 56.0 | 24.0 | 576 | 1.933975 | ||
Sum | 770 |
All data points are within the range of -1.75 to 1.75 except reading point 80 whose Z score is 1.93, so in this case, it is considered as an outlier.
That’s very good…. Very informative
Superb!!
Thanks for your blog, nice to read. Do not stop.