Population Variance

Population variance is a fundamental concept in statistics that quantifies the average squared deviation from the mean of a set of data points in a population. It is a measure of how spread out a group of data points is.

Data comes in two forms, ungrouped and grouped, and each has its own formula for the population variance. In this article, we will learn more about population variance, its formulas, and various associated examples.

Table of Contents

  • What is Population Variance?
  • Formula of Population Variance
    • Ungrouped Data
    • Grouped Data
  • Population Variance and Sample Variance
  • Population Variance and Standard Deviation
  • Solved Problems on Population Variance
  • Practice Questions on Population Variance
  • FAQs on Population Variance

What is Population Variance?

Population variance measures how far, on average, the data points in a population lie from the population mean. It is defined as the average of the squared deviations from the mean. If all data points are very close to the mean, the variance will be small; if data points are spread out over a wide range, the variance will be larger.

Formula of Population Variance

The formula for population variance takes a slightly different form depending on whether the data is ungrouped or grouped. In both cases it averages the squared deviations from the mean over the whole population.

Ungrouped Data

Ungrouped data, also known as raw data, consists of individual data points that are not categorized or grouped into intervals. Each data point in ungrouped data represents a distinct value or observation.

Formula of Population Variance in Ungrouped Data:

[Tex]\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2 [/Tex]

Where:

  • σ² is the population variance.
  • N is the total number of data points in the population.
  • xi represents each individual data point.
  • μ is the population mean (average of all data points).
  • ∑ denotes the summation of all terms from i = 1 to N.
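The ungrouped formula can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `population_variance` and the sample values are assumptions made for the example, and the result is compared against `statistics.pvariance` from the standard library.

```python
import statistics


def population_variance(values):
    """Average squared deviation from the mean, dividing by N."""
    n = len(values)
    if n == 0:
        raise ValueError("population must contain at least one value")
    mu = sum(values) / n  # population mean
    return sum((x - mu) ** 2 for x in values) / n


# Illustrative values, not taken from the article.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_variance(data))   # 4.0
print(statistics.pvariance(data))  # standard-library result for comparison
```

Here the mean is 5, the squared deviations sum to 32, and dividing by N = 8 gives 4.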

Grouped Data

Grouped data refers to a dataset where individual data points are grouped or categorized into intervals or classes. Each interval represents a range of values, and the frequency of data points falling within each interval is recorded.

Formula of Population Variance in Grouped Data:

[Tex]\sigma^2 = \frac{1}{N} \sum_{i=1}^{k} f_i (m_i - \bar{x})^2 [/Tex]

Where:

  • σ² is the population variance.
  • k is the number of intervals (classes).
  • N is the total number of data points in the population, i.e. the sum of all frequencies.
  • fi is the frequency of the ith interval.
  • mi is the midpoint of the ith interval.
  • [Tex]\bar{x}[/Tex] is the mean for grouped data, [Tex]\bar{x} = \frac{1}{N} \sum_{i=1}^{k} f_i m_i[/Tex].
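As a rough sketch of the grouped calculation, the Python below approximates every observation by the midpoint of its interval and weights each squared deviation by the interval's frequency. The function name `grouped_population_variance` and the frequency table are hypothetical, chosen only to illustrate the steps.

```python
def grouped_population_variance(intervals, frequencies):
    """Population variance from class intervals and their frequencies.

    Each observation is approximated by the midpoint of its interval.
    """
    if len(intervals) != len(frequencies):
        raise ValueError("need exactly one frequency per interval")
    midpoints = [(low + high) / 2 for low, high in intervals]
    total = sum(frequencies)  # N, the population size
    if total == 0:
        raise ValueError("frequencies must not sum to zero")
    # Mean for grouped data: frequency-weighted average of the midpoints.
    mean = sum(f * m for f, m in zip(frequencies, midpoints)) / total
    # Frequency-weighted average of squared deviations from that mean.
    return sum(f * (m - mean) ** 2 for f, m in zip(frequencies, midpoints)) / total


# Hypothetical frequency table: (lower bound, upper bound) per class.
intervals = [(0, 10), (10, 20), (20, 30)]
frequencies = [2, 5, 3]

print(grouped_population_variance(intervals, frequencies))  # 49.0
```

With midpoints 5, 15 and 25 the grouped mean is 16, the weighted squared deviations sum to 490, and dividing by N = 10 gives 49. Because midpoints stand in for the raw values, the result is an approximation of the variance of the underlying data.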

Population Variance and Sample Variance

The table below summarizes the differences between population variance and sample variance:

| Population Variance | Sample Variance |
| --- | --- |
| Calculated using the entire data set. | Calculated using only a sample of the data set. |
| Used when the dataset you are working with represents an entire population, i.e. every value that you are interested in. | Used when the dataset you are working with represents a sample taken from a larger population of interest. |
| [Tex]\sigma^2 = \dfrac{\sum (x_i - \mu)^2}{N}[/Tex] | [Tex]s^2 = \dfrac{\sum (x_i - \bar{x})^2}{n-1}[/Tex] |

In the population variance formula:

  • Σ: A symbol that means “sum”
  • μ: Population mean
  • [Tex]x_i[/Tex]: The [Tex]i^{th}[/Tex] element from the population
  • N: Population size

In the sample variance formula:

  • [Tex]\bar{x}[/Tex]: Sample mean
  • [Tex]x_i[/Tex]: The [Tex]i^{th}[/Tex] element from the sample
  • n: Sample size
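The only difference between the two calculations is the denominator, which a short Python comparison makes visible. The sketch below uses `statistics.pvariance` (divides by N) and `statistics.variance` (divides by n - 1) from the standard library; the data values are illustrative assumptions.

```python
import math
import statistics

# Illustrative values, not taken from the article.
values = [4, 8, 6, 5, 3]
n = len(values)

pop_var = statistics.pvariance(values)   # divides by N
samp_var = statistics.variance(values)   # divides by n - 1

print(pop_var)   # approximately 2.96
print(samp_var)  # approximately 3.7

# The two results differ only by the factor n / (n - 1).
print(math.isclose(samp_var, pop_var * n / (n - 1)))  # True
```

For the same data, the sample variance is always the larger of the two, and the gap shrinks as n grows.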

Population Variance and Standard Deviation

Key differences between population variance and standard deviation are:

| Aspect | Population Variance | Standard Deviation |
| --- | --- | --- |
| Definition | Average squared deviation of the data points from the population mean. | Square root of the population variance; the typical distance of the data points from the population mean. |
| Formula | [Tex]\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2[/Tex] | [Tex]\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}[/Tex] |
| Units | Squared units of the original data (e.g., square meters, square dollars). | Same units as the original data (e.g., meters, dollars). |
| Bias Correction | Uses N in the denominator; the N−1 correction applies only to sample variance. | Uses N in the denominator; the N−1 correction applies only to sample standard deviation. |
| Representation | σ² | σ |
| Sensitivity to Outliers | Highly sensitive, because deviations are squared, so large deviations dominate the sum. | Also sensitive, since it is derived from the variance, but it is expressed on the scale of the original data. |
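The relationship between the two measures can be checked directly: the population standard deviation is the square root of the population variance. The sketch below uses `statistics.pvariance` and `statistics.pstdev` from the standard library on hypothetical lengths in meters.

```python
import math
import statistics

# Hypothetical lengths in meters.
lengths_m = [1.0, 2.0, 3.0, 4.0, 5.0]

variance = statistics.pvariance(lengths_m)  # in square meters
std_dev = statistics.pstdev(lengths_m)      # in meters

print(variance)  # 2.0
print(math.isclose(std_dev, math.sqrt(variance)))  # True
```

Because the standard deviation is in the same units as the data, it is usually the easier of the two to interpret, while the variance is often more convenient in algebraic work.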

Solved Problems on Population Variance

Problem 1: Suppose we have the heights (in centimeters) of five students: 160, 165, 170, 175, and 180. The mean height is 170 cm. Calculate the population variance.

Solution:

Calculate the squared deviations from the mean:

[Tex](160 - 170)^2 = 100\\ (165 - 170)^2 = 25\\ (170 - 170)^2 = 0\\ (175 - 170)^2 = 25\\ (180 - 170)^2 = 100[/Tex]

Sum up the squared deviations = 100 + 25 + 0 + 25 + 100 = 250

Divide by the total number of observations (which is 5) = 250 / 5 = 50

Therefore, the population variance for this data set is 50 square centimeters.
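The steps of Problem 1 can be reproduced with a short Python sketch. The variable names are illustrative; the heights are the ones given in the problem.

```python
# Heights from Problem 1, in centimeters.
heights_cm = [160, 165, 170, 175, 180]

n = len(heights_cm)
mean = sum(heights_cm) / n  # step 0: population mean

# Step 1: squared deviation of each height from the mean.
squared_deviations = [(h - mean) ** 2 for h in heights_cm]

# Steps 2 and 3: sum the squared deviations and divide by N.
variance = sum(squared_deviations) / n

print(mean)                # 170.0
print(squared_deviations)  # [100.0, 25.0, 0.0, 25.0, 100.0]
print(variance)            # 50.0
```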

Problem 2: Suppose we have the following exam scores (out of 100) for a class of 10 students: 78, 85, 92, 70, 88, 95, 81, 79, 90, and 84. Calculate the population variance.

Solution:

Calculate the mean: (78 + 85 + 92 + 70 + 88 + 95 + 81 + 79 + 90 + 84) / 10 = 842 / 10 = 84.2

Calculate the squared deviations from the mean:

[Tex](78 - 84.2)^2 = 38.44\\ (85 - 84.2)^2 = 0.64\\ (92 - 84.2)^2 = 60.84\\ (70 - 84.2)^2 = 201.64\\ (88 - 84.2)^2 = 14.44\\ (95 - 84.2)^2 = 116.64\\ (81 - 84.2)^2 = 10.24\\ (79 - 84.2)^2 = 27.04\\ (90 - 84.2)^2 = 33.64\\ (84 - 84.2)^2 = 0.04[/Tex]

Sum up the squared deviations = 38.44 + 0.64 + 60.84 + 201.64 + 14.44 + 116.64 + 10.24 + 27.04 + 33.64 + 0.04 = 503.6

Divide by the total number of observations (which is 10) = 503.6 / 10 = 50.36

Therefore, the population variance for this data set is 50.36.

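Problem 3: Consider daily maximum temperatures (in degrees Celsius) recorded over a week: 28, 30, 29, 31, 27, 28, and 32. Calculate the population variance.

Solution:

Calculate the mean: (28 + 30 + 29 + 31 + 27 + 28 + 32) / 7 = 205 / 7 ≈ 29.29

Calculate the squared deviations from the mean, keeping the mean as the exact fraction 205/7 to avoid rounding errors:

[Tex]\left(28 - \tfrac{205}{7}\right)^2 = \tfrac{81}{49}\\ \left(30 - \tfrac{205}{7}\right)^2 = \tfrac{25}{49}\\ \left(29 - \tfrac{205}{7}\right)^2 = \tfrac{4}{49}\\ \left(31 - \tfrac{205}{7}\right)^2 = \tfrac{144}{49}\\ \left(27 - \tfrac{205}{7}\right)^2 = \tfrac{256}{49}\\ \left(28 - \tfrac{205}{7}\right)^2 = \tfrac{81}{49}\\ \left(32 - \tfrac{205}{7}\right)^2 = \tfrac{361}{49}[/Tex]

Sum up the squared deviations = (81 + 25 + 4 + 144 + 256 + 81 + 361) / 49 = 952 / 49 ≈ 19.43

Divide by the total number of observations (which is 7) = [Tex]\tfrac{952}{49} \div 7 = \tfrac{136}{49} \approx 2.78[/Tex]

Therefore, the population variance for this temperature data set is approximately 2.78 square degrees Celsius.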

Practice Questions on Population Variance

Q1. The population variance is also called:

a) Sigma squared

b) Sigma cubed

c) Sigma

d) None of the above

Q2. When a sample variance of 25 is obtained from a sample of 10 items from a normal population, the 80% confidence interval for a population variance is:

a) 12.3 to 57.1

b) 13.3 to 67.7

c) 14.1 to 46.25

d) 15.3 to 53.98

Q3. The sampling distribution of the ratio of independent sample variances from two normally distributed populations with equal variances is the:

a) Chi-square distribution

b) Normal distribution

c) F distribution

d) T distribution

Q4. These sample results were obtained for independent random samples from two normally distributed populations. Sample 1: Sample Size 10, Sample Variance 25. Sample 2: Sample Size 16, Sample Variance 20. Using a 0.05 level of significance, which conclusion would be reached for these data?

a) There is a statistically significant difference between the variances of the two populations.

b) There is no statistically significant difference between the variances of the two populations.

c) Insufficient data – can’t tell in this case.


FAQs on Population Variance

What’s the difference between population variance and sample variance?

Population variance considers the entire population, while sample variance is based on a subset (sample) of the population.

What does the square root of population variance give us?

The square root of population variance gives us the population standard deviation.

What is the significance of population variance in statistical analysis?

Population variance provides insights into the variability of data points around the mean. It helps us understand how spread out the data is within a population. This knowledge is crucial for decision-making, risk assessment, and quality control.

Can population variance be negative?

No, population variance cannot be negative. It is always a non-negative value because it represents the average of squared deviations from the mean.

How does population variance relate to the normal distribution?

In the context of the normal distribution (bell curve), the variance determines the width of the curve. Larger variance results in a wider curve, while smaller variance leads to a narrower curve.

Is there a shortcut formula for calculating population variance?

The shortcut formula for calculating population variance is:

[Tex]\sigma^2 = \frac{1}{N} \left(\sum_{i=1}^{N} x_i^2\right) - \mu^2 [/Tex]
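A quick way to convince yourself that the shortcut agrees with the definition is to compute both on the same data. The sketch below does this in Python with exact rational arithmetic from the `fractions` module; the function names and data are illustrative assumptions, and the code assumes integer inputs.

```python
from fractions import Fraction


def variance_by_definition(values):
    """Mean of squared deviations from the mean (integer inputs assumed)."""
    n = len(values)
    mu = Fraction(sum(values), n)
    return sum((x - mu) ** 2 for x in values) / n


def variance_by_shortcut(values):
    """Mean of the squares minus the square of the mean."""
    n = len(values)
    mu = Fraction(sum(values), n)
    return Fraction(sum(x * x for x in values), n) - mu ** 2


# Illustrative integer values, not taken from the article.
data = [3, 7, 7, 19]

print(variance_by_definition(data))  # 36
print(variance_by_shortcut(data))    # 36
```

Here the mean is 9, the mean of the squares is 117, and 117 - 81 = 36 matches the definition. With ordinary floating-point numbers the shortcut can lose precision when the mean is large compared with the spread, because it subtracts two nearly equal quantities, so the definitional form is often the safer choice in code.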