Statistics Interview Questions for Basic Level
1. What is the difference between Descriptive Statistics and Inferential Statistics?
Category | Descriptive Statistics | Inferential Statistics |
---|---|---|
Definition | These statistics are used to summarize the main features of a Data distribution | These statistics are used to draw conclusions about a larger population by using sample data |
Relies on | Descriptive Statistics relies on summary measures and graphical representations to extract meaningful information | Inferential Statistics relies on probability distributions and mathematical formulas to reach meaningful conclusions |
Techniques used | Mean, Median, Mode, Standard Deviation, Range, Histogram, Box Plot, etc. | Hypothesis Test (t-test, z-test, Chi-square test), ANOVA, confidence interval, etc. |
Assumptions | Descriptive Statistics does not involve any kind of assumptions about the population. | Inferential Statistics is often associated with assumptions like Normality, Independence and Random Sampling. |
Example Scenarios | Median salary in a university placement record | Estimating the average flipper length of all the penguins in the world from a measured sample |
2. Difference between Population and Sample
Category | Population | Sample |
---|---|---|
Definition | A population is the entire set of individuals or observations that we are interested in. | A sample is a subset of the population that is actually observed. |
Size | A population includes every member of every group of interest, so it is usually large. | A sample is smaller than the population it is drawn from. |
Representation | A population holds the complete data about the group we are interested in. | A good sample is representative, meaning it reflects the key features of the entire population. |
3. What is Random Sampling? What is its use?
Random Sampling is the process of selecting a subset from a population such that every member of the population has an equal chance of being selected. Random Sampling is used because:
- it helps in making generalizations about the population
- it helps in reducing selection bias
- it helps in drawing meaningful statistical inferences
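As a minimal sketch of the idea, the snippet below draws a simple random sample with Python's standard `random` module. The population of 100 member IDs and the helper name `simple_random_sample` are hypothetical and used only for illustration.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Draw k distinct members; each member has the same chance of selection."""
    rng = random.Random(seed)  # seeded generator so the draw is reproducible
    return rng.sample(population, k)

# Hypothetical population: member IDs 1 to 100
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=42)
print(sample)
```

Fixing the seed only makes the example repeatable; in practice the seed would normally be left unset.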
4. What is Qualitative Data and Quantitative Data?
- Qualitative Data: Qualitative data describes qualities or categories rather than measured quantities. It is also called Categorical Data and can be divided into groups and classes. Example: Gender, Color, Age category, etc.
- Quantitative Data: Quantitative data, on the other hand, is numerical data. It measures how much or how many of something there is and can be used in mathematical operations. Example: Sales of a car company, Bitcoin value, etc.
5. What is meant by Probability Distribution?
A Probability Distribution is a function that describes the likelihood of each possible outcome of a random event. In other words, it associates a probability with every outcome that can occur.
6. What are nominal data and ordinal data?
- Nominal Data: It is a type of qualitative data which has no inherent order or ranking. The categories are only labels and carry no numerical significance. Example: Types of Colors, Animal Species, etc.
- Ordinal Data: It is a type of qualitative data whose categories have a defined order or ranking, although the gaps between ranks are not necessarily equal. Example: Education Level, Likert Scale in Survey response, etc.
7. What is the Central Limit Theorem?
The Central Limit Theorem (CLT) states that:
"The sampling distribution of the sample mean approaches a normal distribution as the sample size increases, irrespective of the shape of the population distribution."
As a rule of thumb, the normal approximation is considered adequate when the sample size is at least 30, provided the population has a finite variance; heavily skewed populations may need larger samples. For a sampling distribution that follows the CLT:
- The mean of the sample means ( [Tex]\overline{x} [/Tex] ) is equal to the population mean ( [Tex]\mu [/Tex] )
- The standard deviation of the sampling distribution ( [Tex]\sigma_{s} [/Tex] ), also called the standard error, is equal to the standard deviation of the population distribution ( [Tex]\sigma_{p} [/Tex] ) divided by the square root of the sample size ( n ).
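The two properties above can be checked with a small simulation. This is only a sketch under stated assumptions: the population is taken to be exponential with rate 1 (so its mean and standard deviation are both 1), and the sample size of 40 and the 2000 repetitions are arbitrary illustrative choices.

```python
import math
import random
import statistics

def sample_means(sample_size, num_samples, seed=0):
    """Means of repeated samples drawn from a right-skewed (exponential) population."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(num_samples)
    ]

n = 40
means = sample_means(sample_size=n, num_samples=2000)

mean_of_means = statistics.fmean(means)   # should be close to the population mean, 1
std_of_means = statistics.stdev(means)    # should be close to 1 / sqrt(n)
expected_std = 1 / math.sqrt(n)

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"std of sample means:  {std_of_means:.3f} (theory: {expected_std:.3f})")
```

Even though the population is strongly skewed, a histogram of `means` would look roughly bell shaped.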
8. Explain Skewness in Distribution. Why does it happen?
Skewness refers to the asymmetry of a distribution around its center. There are two types of skewness:
- Left/Negative Skewness: The distribution has a longer tail on the left, and the mean is typically less than the median.
- Right/Positive Skewness: The distribution has a longer tail on the right, and the mean is typically greater than the median.
Skewness often arises from outliers or extreme values on one side of the distribution, and their location decides the direction of skewness (positive or negative). It can also be inherent in the data, for example when values are bounded on one side, as with incomes or waiting times.
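To make the direction of skewness concrete, here is a small sketch that computes the moment coefficient of skewness (third central moment divided by the 1.5 power of the second). This particular coefficient is one common choice, not something the text above prescribes, and the datasets are made up for illustration.

```python
import statistics

def skewness(data):
    """Moment coefficient of skewness: m3 / m2**1.5 (population form)."""
    mean = statistics.fmean(data)
    n = len(data)
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]       # one large value stretches the right tail
left_skewed = [-x for x in right_skewed]       # mirror image stretches the left tail

print(skewness(symmetric))      # zero for symmetric data
print(skewness(right_skewed))   # positive
print(skewness(left_skewed))    # negative
```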
9. What is Normal Distribution? How is it different from a Uniform Distribution in Terms of Measure of Central Tendency?
Category | Normal Distribution | Uniform Distribution |
---|---|---|
Definition | It is a continuous probability distribution that is symmetric about the mean, with values clustering most densely around the mean. | It is a continuous probability distribution where every value within a given range is equally likely to occur. |
Formula | [Tex]f(x) = \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(x-\mu)^2}{2\sigma^2}} [/Tex] where, [Tex]f(x) [/Tex] = Normal probability density function [Tex]x [/Tex] = value of the random variable [Tex]\mu [/Tex] = Mean of the Normal Distribution [Tex]\sigma [/Tex] = Standard Deviation of Normal Distribution | [Tex]f(x) = \frac{1}{b-a} , a\leq{x}\leq{b} [/Tex] where, a = minimum of the distribution b = maximum of the distribution x = value of the random variable |
Shape | It is a bell shaped curve | It is a rectangular shaped curve |
Measure of Central tendency | For Normal Distribution, mean = median = mode. | For Uniform Distribution, mean = median = average of maximum and minimum in the distribution, and there is no unique mode because every value in the range is equally likely. |
10. What is Binomial Distribution?
It is a discrete probability distribution that models the number of successes in a fixed number of independent Bernoulli trials, where each trial is either a success or a failure and has the same probability of success. The Binomial probability mass function is given as:
[Tex]P(X=k)=\binom{n}{k}p^k(1-p)^{(n-k)} [/Tex], where
n = number of trials conducted
k = number of successes
p = probability of success in a single trial
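The formula can be evaluated directly with the standard library. The sketch below assumes 10 fair coin flips as an illustrative scenario; the function name `binomial_pmf` is hypothetical.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k): probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Probability of exactly 3 heads in 10 fair coin flips
prob = binomial_pmf(3, 10, 0.5)
print(prob)  # 120 / 1024 = 0.1171875

# The probabilities over all possible k sum to 1
total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))
print(total)
```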
11. What is an Outlier?
An Outlier is a data point that is significantly different from the other data points. Outliers usually lie at the extremes of the distribution and stand out compared to the rest of the data.
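One common way to flag outliers in practice is the 1.5 × IQR rule. That rule is an assumption brought in here for illustration, not a definition given above, and the data values are made up.

```python
import statistics

def iqr_outliers(data, k=1.5):
    """Return points lying more than k * IQR outside the first or third quartile."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < low or x > high]

values = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]
outliers = iqr_outliers(values)
print(outliers)  # [102]
```

Other conventions exist, such as z-score thresholds, and the appropriate choice depends on the data.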
12. What is the Measure of Center/ Measure of Central Tendency? Explain in brief about it.
A Measure of Center/ Measure of Central Tendency describes the "center" of a probability distribution (PD) or dataset. Three measures are commonly used:
- Mean: The average of all the data points present in the dataset.
- Median: The middle data point of the sorted dataset/PD (the average of the two middle points when the number of points is even).
- Mode: The data point which occurs most frequently in a dataset/PD.
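A quick sketch of the three measures using Python's built-in `statistics` module, on a small made-up dataset:

```python
import statistics

data = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(data)      # (2 + 3 + 3 + 5 + 7 + 10) / 6 = 5
median = statistics.median(data)  # even count, so average of 3 and 5 = 4
mode = statistics.mode(data)      # 3 occurs most often

print(mean, median, mode)
```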
13. What is the Measure of Dispersion? Explain in brief about it.
A Measure of Dispersion/ Measure of Spread describes how widely the data points are distributed with respect to a single point. Usually, dispersion is examined around the mean of the dataset, so it explains how "spread out" the data points are around the mean. Several metrics describe the dispersion of a dataset, among which the most used ones are:
- Range: The difference between the maximum and minimum value in the dataset.
- Variance: The average of the squared differences of each data point from the mean.
- Standard Deviation: The square root of the variance.
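The three metrics can be computed as follows. This sketch uses the population forms (`pvariance`, `pstdev`), which match the "average of squared differences" definition; the sample variance, which divides by n - 1 instead of n, is shown for comparison. The dataset is made up.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)              # 9 - 2 = 7
pop_variance = statistics.pvariance(data)       # 32 / 8 = 4
pop_std = statistics.pstdev(data)               # sqrt(4) = 2
sample_variance = statistics.variance(data)     # 32 / 7, divides by n - 1

print(data_range, pop_variance, pop_std, sample_variance)
```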
14. What is complement rule in probability?
The Complement Rule in Probability states that:
"The probability that an event does not occur is one minus the probability that the event occurs", i.e. [Tex]P(A') = 1 - P(A) [/Tex]
(Note: The Complement Rule holds for any event; it does not require independence.)
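As an illustrative sketch, the rule is often used to compute "at least one" probabilities. The fair-die scenario below is an assumption chosen for the example; independence between rolls is used only when multiplying the per-roll probabilities, not by the complement step itself.

```python
from fractions import Fraction

p_six = Fraction(1, 6)

# Complement rule on a single roll: P(not six) = 1 - P(six)
p_not_six = 1 - p_six

# P(at least one six in 4 rolls) = 1 - P(no six in 4 rolls)
p_at_least_one_six = 1 - p_not_six**4

print(p_not_six)            # 5/6
print(p_at_least_one_six)   # 671/1296
```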
15. What are Non probability sampling methods? Name a few of them.
Non-probability sampling methods select members based on the judgment or convenience of the researcher rather than on random selection, so not every member of the population has a known chance of being chosen. Some of the methods are:
- Convenience sample: A non-probability sampling method where the sample members are chosen based on how easy they are to reach or contact.
- Snowball sample: A method where the initially approached participants are asked to recruit further participants, so the sample grows like a rolling snowball.
16. What is Dependent Event and Independent Event?
Category | Dependent Event | Independent Event |
---|---|---|
Definition | Two events are dependent when the outcome of one event is influenced by the outcome of the other event. | Two events are independent when the outcome of one event does not affect the outcome of the other event. |
Formula | [Tex]P(A\cap{B}) = P(A) \cdot P(B|A) [/Tex] | [Tex]P(A\cap{B}) = P(A) \cdot P(B) [/Tex] |
Example | Drawing two cards from a deck without replacement | Rolling a fair six-sided die twice |
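The two formulas can be worked through with exact fractions. The specific events below (two aces in a row, two sixes in a row) are illustrative choices.

```python
from fractions import Fraction

# Dependent events: drawing two aces from a 52-card deck without replacement
p_first_ace = Fraction(4, 52)
p_second_ace_given_first = Fraction(3, 51)   # one ace and one card are gone
p_two_aces = p_first_ace * p_second_ace_given_first   # P(A) * P(B|A)

# Independent events: rolling a six on each of two rolls of a fair die
p_six = Fraction(1, 6)
p_two_sixes = p_six * p_six                  # P(A) * P(B)

print(p_two_aces)    # 1/221
print(p_two_sixes)   # 1/36
```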
17. What is margin of error?
It is defined as the maximum expected difference between the population parameter and the sample estimate at a given confidence level. It is half the width of the corresponding confidence interval.
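A minimal sketch, assuming the common case of estimating a mean when the population standard deviation is known, where the margin of error is z × σ / √n. This formula and the numbers used (σ = 15, n = 100, 95% confidence) are assumptions for illustration and do not come from the text above.

```python
import math
from statistics import NormalDist

def margin_of_error(sigma, n, confidence=0.95):
    """Margin of error for a sample mean with known population std (z-based)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return z * sigma / math.sqrt(n)

moe = margin_of_error(sigma=15, n=100)
print(round(moe, 2))  # 2.94
```

Quadrupling the sample size halves the margin of error, because n appears under a square root.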
18. What is the difference between Poisson Distribution and Bernoulli Distribution?
Category | Poisson Distribution | Bernoulli Distribution |
---|---|---|
Definition | A discrete probability distribution used to model the number of events occurring within a given interval of time or space. | A discrete probability distribution used to model a single trial with two (binary) outcomes, success and failure. |
Probability Mass Function | [Tex]P(X=x) = \frac{e^{-\lambda}\lambda^{x}}{x!} [/Tex] where, X = random variable counting the events x = number of times the event occurs (0, 1, 2, ...) e = Euler's number (approximately 2.718) [Tex]\lambda [/Tex] = average number of times an event occurs in the interval | [Tex]P(X=x) = p^x(1-p)^{(1-x)} [/Tex] where, x = 0 or 1 X = random variable for the outcome (1 = success, 0 = failure) p = probability of success |
Independence | Used for independent events that occur at a constant average rate | Describes a single trial, so independence becomes relevant only when trials are repeated, as in the Binomial Distribution. |
Example | Number of phone calls at a call center in an hour | Success or failure in a product quality test |
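Both probability mass functions are short enough to evaluate by hand or in code. The sketch below writes the Bernoulli PMF in its standard single-trial form, p^x (1 - p)^(1 - x); the rate of 2 calls per hour and the pass probability of 0.3 are made-up values.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with average rate lam."""
    return math.exp(-lam) * lam**x / math.factorial(x)

def bernoulli_pmf(x, p):
    """P(X = x) for a single trial, where x is 1 (success) or 0 (failure)."""
    if x not in (0, 1):
        raise ValueError("x must be 0 or 1")
    return p**x * (1 - p) ** (1 - x)

# Probability of exactly 3 calls in an hour when the average is 2 per hour
p_three_calls = poisson_pmf(3, 2.0)
print(p_three_calls)

# Probability that a product passes (p = 0.3) or fails a quality test
print(bernoulli_pmf(1, 0.3), bernoulli_pmf(0, 0.3))
```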
Top 50 Plus Interview Questions for Statistics with Answers 2023
Statistics is a branch of mathematics that deals with collecting, analyzing, and interpreting data, and it is applied across various industries. If you are looking for career opportunities as a data analyst or data scientist, knowledge of statistics is very important, because most of these interviews include statistical questions.
Hence, this blog post explores some of the most frequently asked interview questions in statistics. By the end of this write-up, you will have covered questions at every level, from beginner to advanced.
Table of Contents
- Short Overview of Statistics
- Statistics Interview Questions for Basic Level
- Statistics Interview Questions for Intermediate Level
- Statistics Interview Questions for Expert Level