What are outliers?

Outliers are data points that differ significantly from the majority of the data in a dataset. For example, if most values lie in the range 1–50 and one or two points lie at 125–150, those points are termed outliers. They are exceptionally high or low compared with the other data points and can distort the overall statistical analysis and interpretation of the data. An outlier should not automatically be treated as noise or error.

Outliers may result from various factors, including errors in data collection, measurement errors, or genuine variations in the data.

Outliers can be identified through statistical techniques or visual methods, such as box plots and scatter plots, or with outlier detection algorithms. Depending on the context and the cause of the outliers, they can be treated in different ways, including removal, transformation, or separate analysis to understand their impact on the dataset and on the analysis being conducted.
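As one illustration of a statistical technique, the sketch below applies the interquartile range (IQR) rule that a box plot uses to draw its whiskers. The sample values, the function name `iqr_outliers`, and the multiplier `k=1.5` are assumptions chosen for this example; they loosely mirror the 1–50 range with points at 125 and 150 described earlier.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values that fall outside the IQR fences (box-plot rule)."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical sample: most values lie between 1 and 50, two lie far above.
data = [3, 8, 12, 15, 19, 22, 27, 31, 36, 40, 44, 49, 125, 150]
outliers = iqr_outliers(data)
print(outliers)  # [125, 150]
```

The multiplier 1.5 is a common convention rather than a fixed rule; a larger value flags fewer points.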

Outliers are of three types, namely:

  1. Global outliers are data points that significantly deviate from the rest of the data in a dataset, irrespective of any specific conditions or contexts.
  2. Multivariate outliers are data points that are outliers when considering multiple attributes or dimensions simultaneously.
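  3. Contextual outliers are data points that are outliers only within a specific context or condition; they may not look unusual when that context is ignored.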
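The difference between a global and a contextual outlier can be sketched with a small, made-up set of temperature readings. Here the season plays the role of the context: a reading of 30 is ordinary across the whole year but unusual for winter. The data, the season labels, and the helper `iqr_outliers` are hypothetical, and the IQR rule is only one of several checks that could be used.

```python
import statistics
from collections import defaultdict

def iqr_outliers(values, k=1.5):
    """Return the values that fall outside the IQR fences."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical (season, temperature in degrees Celsius) readings.
readings = [
    ("winter", 2), ("winter", 4), ("winter", 3), ("winter", 5),
    ("winter", 1), ("winter", 3), ("winter", 4), ("winter", 30),
    ("summer", 28), ("summer", 31), ("summer", 29), ("summer", 33),
    ("summer", 30), ("summer", 32), ("summer", 29), ("summer", 31),
]

# Global check: ignore the season and test every reading together.
global_outliers = iqr_outliers([temp for _, temp in readings])

# Contextual check: test each reading only against its own season.
by_season = defaultdict(list)
for season, temp in readings:
    by_season[season].append(temp)
contextual_outliers = {s: iqr_outliers(ts) for s, ts in by_season.items()}

print(global_outliers)      # []
print(contextual_outliers)  # {'winter': [30], 'summer': []}
```

The winter reading of 30 passes the global check because summer values stretch the overall range, but it is flagged once the readings are grouped by season.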

Let’s dive deep into contextual outliers, also known as conditional outliers.

Contextual Outliers

Understanding contextual outliers is essential across various fields, including statistics, finance, and anomaly detection, as they offer valuable insights into unique events or conditions that impact the data. By identifying and analyzing these outliers, we gain a deeper understanding of the nuances within our datasets, enabling us to make more informed decisions and draw meaningful conclusions within specific contexts.

This article explores the fascinating world of contextual outliers, shedding light on their significance and how they differ from global outliers. We’ll illustrate the concept with real-world examples, demonstrating how contextual outliers emerge when certain conditions or events come into play.
