Quantile Transformation Approaches for Outlier Identification
The quantile transformer transforms features using quantile information. It is applied to each feature independently. The steps are as follows:
- It estimates the cumulative distribution function of a feature.
- Then the values are mapped to the desired output distribution using the associated quantile function.
It applies a non-linear transformation that maps the probability density function of each feature to a uniform or normal distribution. The formula is as follows:
[Tex]G^{-1}(F(X))[/Tex]
where F is the cumulative distribution function of the feature, G is the desired output distribution, and [Tex]G^{-1}[/Tex] is the quantile function of G (the inverse of its cumulative distribution function).
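The two steps and the formula above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not scikit-learn's implementation: the helper names `empirical_cdf` and `quantile_transform` are hypothetical, F is estimated with a simple rank-based step function, and G is taken to be the standard normal distribution from the standard library.

```python
from bisect import bisect_right
from statistics import NormalDist


def empirical_cdf(sample):
    """Return F, a rank-based estimate of the CDF of `sample` (hypothetical helper)."""
    ordered = sorted(sample)
    n = len(ordered)

    def F(x):
        # Number of observations <= x, shifted by 0.5 so the result
        # never reaches exactly 0 or 1 (where G^{-1} would be infinite).
        p = (bisect_right(ordered, x) - 0.5) / n
        return min(max(p, 0.5 / n), 1 - 0.5 / n)

    return F


def quantile_transform(values, sample, inverse_cdf=NormalDist().inv_cdf):
    """Compute G^{-1}(F(x)) for each x; G defaults to the standard normal."""
    F = empirical_cdf(sample)
    return [inverse_cdf(F(v)) for v in values]


# Made-up feature with one extreme value.
data = [1.0, 2.0, 3.0, 4.0, 100.0]
transformed = quantile_transform(data, data)
print(transformed)
```

Only the rank of each value matters here, so the extreme value 100.0 ends up at the same distance from the centre as 1.0 does on the other side.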
The quantile transformer can map the data to either a uniform or a normal (Gaussian) output distribution. Let’s discuss them in detail.
1. Uniform Distribution
A uniform distribution means that every value in the range has the same probability of occurring. Its density is flat across the entire range of values; hence, it is also known as a rectangular distribution.
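With a uniform output distribution, each value of the input feature is replaced by its estimated cumulative distribution function (CDF) value, so the transformed feature is approximately uniformly distributed. Here, the minimum and maximum values of the feature determine the lower and upper bounds of the range (0 and 1 in scikit-learn's implementation).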
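As a minimal sketch of the uniform case, the snippet below uses scikit-learn's `QuantileTransformer` with `output_distribution="uniform"`. The data is made up for illustration (200 exponentially distributed values plus one appended extreme value), and `n_quantiles` is set to the sample size, which is an assumption that suits a small dataset.

```python
import numpy as np
from sklearn.preprocessing import QuantileTransformer

rng = np.random.default_rng(0)

# Made-up skewed feature with one extreme value appended at the end.
X = np.append(rng.exponential(scale=1.0, size=200), 50.0).reshape(-1, 1)

qt = QuantileTransformer(
    n_quantiles=X.shape[0],          # one quantile landmark per sample
    output_distribution="uniform",
    random_state=0,
)
X_uniform = qt.fit_transform(X)

# The smallest input maps to the lower bound and the largest to the upper bound.
print(X_uniform.min(), X_uniform.max())
# The extreme value 50.0 is the maximum, so it lands on the upper bound.
print(X_uniform[-1, 0])
```

After the transformation the extreme value sits at the upper bound, right next to the other large values, which shows how the mapping compresses its distance from the rest of the data.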
2. Normal Distribution (Gaussian)
A normal distribution is a probability distribution that is symmetric about the mean. It has a bell-shaped curve where the data near the mean are more frequent in occurrence than the data far from the mean.
For a normal output distribution, the quantile function is the probit function, which is the inverse of the cumulative distribution function (CDF) of the standard normal distribution.
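The probit function can be tried out with Python's standard library alone. The sketch below assumes `statistics.NormalDist` as the standard normal distribution; the wrapper name `probit` is introduced here only for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)


def probit(p):
    """Inverse CDF of the standard normal distribution, defined for 0 < p < 1."""
    return std_normal.inv_cdf(p)


# Probabilities near 0 or 1 map far into the tails; 0.5 maps to the mean.
for p in (0.025, 0.5, 0.975):
    print(p, round(probit(p), 3))

# Applying the CDF to the probit recovers the original probability.
round_trip = std_normal.cdf(probit(0.8))
print(round_trip)
```

Because the probit stretches probabilities close to 0 and 1 into the tails of the bell curve, values at the extremes of a feature end up far from the mean after a normal-output quantile transformation.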
Quantile Transformer for Outlier Detection
Data transformation applies a mathematical function that rescales the data, which makes it possible to compare different columns, e.g., salary in INR with weight in kilograms. Transforming the data can also help it satisfy certain mathematical assumptions such as normality, homogeneity, linearity, etc. Quantile Transformer is one of the data transformation techniques for standardizing data.
In this article, we will dig deep into the Quantile Transformer, understand its significance for detecting outliers, and implement it.
Table of Contents
- Understanding Quantile Transformer
- Quantile Transformer for Detecting Outliers
- Quantile Transformation Approaches for Outlier Identification
- 1. Uniform Distribution
- 2. Normal Distribution (Gaussian)
- How Quantile Transformer Works for Outlier Detection?
- Utilizing Quantile Transformer for Outlier Detection in Scikit-learn
- Advantages and Disadvantages of Quantile Transformer for Outlier Detection