Mean value in Data Science


The mean of an attribute in a data set is the centre, or average, value of that attribute.

It is a popular measure of central tendency in data science and is commonly used for computing averages, in clustering and in data preprocessing.

The mean is a statistical operation that is performed on numeric attributes.

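Below is the formula to calculate the mean of N values x_1, x_2, ..., x_N:

$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i = \frac{x_1 + x_2 + \dots + x_N}{N}$$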



Example: 

For the data set below, calculate the mean Age of the employees in the organisation.


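| Employee ID | Age | Years of Experience | Department |
|-------------|-----|---------------------|------------|
| 1001        | 37  | 15                  | Research   |
| 1002        | 25  | 3                   | Research   |
| 1003        | 35  | 10                  | Payroll    |
| 1004        | 27  | 9                   | Research   |
| 1005        | 31  | 9                   | HR         |
| 1006        | 40  | 20                  | Research   |
| 1007        | 24  | 2                   | Research   |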


Mean of Age = (37 + 25 + 35 + 27 + 31 + 40 + 24) / 7 = 219 / 7 ≈ 31.29.
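
The same calculation can be sketched in a few lines of Python. The variable names are illustrative only; the ages are the seven Age values from the example data set.

```python
# Ages of the seven employees from the example data set.
ages = [37, 25, 35, 27, 31, 40, 24]

# Mean = sum of all values divided by the number of values.
total_age = sum(ages)
mean_age = total_age / len(ages)

print(f"Sum of ages: {total_age}")
print(f"Mean of Age: {mean_age:.2f}")
```

Rounding to two decimal places is a presentation choice here; the unrounded quotient is what later calculations would normally use.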


Weighted Mean:

Sometimes each value of an attribute in a set is associated with a weight.

The weights reflect the significance, importance or occurrence frequency attached to their respective values.

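Below is the formula to calculate the weighted mean of N values x_1, x_2, ..., x_N with weights w_1, w_2, ..., w_N:

$$\bar{x}_w = \frac{\sum_{i=1}^{N} w_i x_i}{\sum_{i=1}^{N} w_i} = \frac{w_1 x_1 + w_2 x_2 + \dots + w_N x_N}{w_1 + w_2 + \dots + w_N}$$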
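
As a minimal sketch in Python, the weighted mean divides the sum of each value multiplied by its weight by the sum of the weights. The function name `weighted_mean` and the sample values and weights are hypothetical and chosen only for illustration; here the weights are treated as occurrence frequencies.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    weighted_sum = sum(w * x for x, w in zip(values, weights))
    return weighted_sum / total_weight


# Hypothetical data: each age occurs `weight` times in some group.
sample_ages = [25, 30, 35]
sample_weights = [2, 3, 5]

print(weighted_mean(sample_ages, sample_weights))
```

When every weight is equal, this reduces to the ordinary mean.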



Trimmed Mean:

A major problem with the mean is its sensitivity to extreme (outlier) values.

Even a small number of extreme values can corrupt the mean.

For example, the high salaries of a few directors will increase the mean salary in the organisation, and the low marks of a few students will significantly reduce the mean marks of a class.
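
The trimmed mean is commonly defined as the mean computed after discarding a fraction of the lowest and highest values. The sketch below assumes that definition; the function name `trimmed_mean`, the `proportion` parameter and the salary figures are hypothetical and serve only to illustrate the effect of a single outlier.

```python
def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the values from each end.

    `proportion` is the fraction removed from the low end and again from
    the high end, so it must be at least 0 and below 0.5.
    """
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in the range [0, 0.5)")
    ordered = sorted(values)
    # Number of values to drop from each end, rounded down.
    drop = int(len(ordered) * proportion)
    kept = ordered[drop:len(ordered) - drop]
    return sum(kept) / len(kept)


# Hypothetical salaries in thousands; 250 is a director's salary.
salaries = [30, 32, 35, 36, 38, 40, 250]

plain_mean = sum(salaries) / len(salaries)
robust_mean = trimmed_mean(salaries, 0.2)

print(f"Mean: {plain_mean:.2f}")
print(f"Trimmed mean: {robust_mean:.2f}")
```

With these figures one value is dropped from each end, so the single large salary no longer pulls the result upward. Trimming too large a fraction discards useful data, so the proportion is a judgment call.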
