Standard Scaler

Standard Scaler rescales each feature so that it has zero mean and a standard deviation of one (unit variance). It does this by subtracting the feature's mean from every value and then dividing the result by the feature's standard deviation.

The standard scaling is calculated as: 

z = (x - u) / s

Where,

  • z is the scaled value.
  • x is the value to be scaled.
  • u is the mean of the training samples.
  • s is the standard deviation of the training samples.
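The formula can be checked by hand on a single feature column. The sketch below uses only the Python standard library; the helper name standardize_column and the sample values are illustrative assumptions, and it uses the population standard deviation (dividing by n), which is what StandardScaler uses. It does not handle a column whose values are all equal, where s would be zero.

```python
from statistics import fmean, pstdev


def standardize_column(values):
    """Return z = (x - u) / s for every x in one feature column."""
    u = fmean(values)   # mean of the column
    s = pstdev(values)  # population standard deviation (divides by n)
    return [(x - u) / s for x in values]


# Hypothetical feature column, used only for illustration.
column = [11, 3, 0, 11]
z = standardize_column(column)

print([round(v, 4) for v in z])
print(round(fmean(z), 10), round(pstdev(z), 10))  # mean and spread after scaling
```

After scaling, the column's mean should be approximately 0 and its standard deviation approximately 1, up to floating-point error.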

The sklearn.preprocessing module provides the StandardScaler class, which performs this scaling in just two or three steps.

Syntax: class sklearn.preprocessing.StandardScaler(*, copy=True, with_mean=True, with_std=True)

Parameters:

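  • copy: If True (the default), a copy of the data is scaled and the input is left unchanged. If False, scaling is attempted in place; this is not guaranteed, and a copy may still be returned if the input is not a NumPy array or a SciPy sparse CSR matrix.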
  • with_mean: If True, data is centered before scaling.
  • with_std: If True, data is scaled to unit variance.
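To see what with_mean and with_std control, the following sketch switches each one off in turn. The array X and the variable names are assumptions made for illustration; the printed values are described only approximately because of floating-point rounding.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical data: four samples, two features.
X = np.array([[11.0, 2.0], [3.0, 7.0], [0.0, 10.0], [11.0, 8.0]])

# Centre only: subtract the column means, leave the spread untouched.
centered = StandardScaler(with_std=False).fit_transform(X)

# Scale only: divide by the column standard deviations, do not centre.
scaled_only = StandardScaler(with_mean=False).fit_transform(X)

print(centered.mean(axis=0))    # close to [0, 0]
print(scaled_only.std(axis=0))  # close to [1, 1]
```

With with_std=False the columns keep their original spread, and with with_mean=False they keep a non-zero mean, so only the default combination yields zero mean and unit variance together.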

Approach:

  • Import module
  • Create data
  • Compute required values
  • Print processed data

Example:

# import module
from sklearn.preprocessing import StandardScaler
 
# create data
data = [[11, 2], [3, 7], [0, 10], [11, 8]]
 
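# compute required values
scaler = StandardScaler()
# fit() learns each column's mean and standard deviation and returns the scaler itself
model = scaler.fit(data)
# transform() applies z = (x - u) / s using the learned values
scaled_data = model.transform(data)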
 
# print scaled data
print(scaled_data)


Output:

[[ 0.97596444 -1.61155897]
 [-0.66776515  0.08481889]
 [-1.28416374  1.10264561]
 [ 0.97596444  0.42409446]]
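As a sanity check on this result, the fitted scaler can be inspected through its mean_ and scale_ attributes, and the scaling can be undone with inverse_transform(). The sketch below reuses the same data; the variable names scaled and restored are illustrative.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

data = [[11, 2], [3, 7], [0, 10], [11, 8]]

scaler = StandardScaler().fit(data)
scaled = scaler.transform(data)

print(scaler.mean_)   # per-column mean learned by fit(): [6.25 6.75]
print(scaler.scale_)  # per-column standard deviation learned by fit()

# Each scaled column should have mean close to 0 and standard deviation close to 1.
print(scaled.mean(axis=0), scaled.std(axis=0))

# inverse_transform() maps the scaled values back to the original units.
restored = scaler.inverse_transform(scaled)
print(np.allclose(restored, data))
```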

Data Pre-Processing with Sklearn using Standard and Minmax scaler

Data scaling is a preprocessing step for numerical features. Many machine learning algorithms, such as models trained with gradient descent, the k-nearest neighbours (KNN) algorithm, and linear and logistic regression, tend to produce better results when the features are on comparable scales. Various scalers are defined for this purpose. This article concentrates on the Standard Scaler and the Min-Max Scaler. The task here is to discuss what they mean and how they are implemented using the built-in classes that come with scikit-learn.

Besides the scaler classes themselves, the following methods are used to achieve the functionality:

  • The fit(data) method computes the mean and standard deviation of each feature so that they can be used later for scaling.
  • The transform(data) method performs the scaling using the mean and standard deviation computed by fit().
  • The fit_transform() method does both fit and transform in a single call.
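The three methods can be sketched together as follows. The train and test arrays are hypothetical; the point is that the statistics are learned from the training data with fit() (or fit_transform()) and then reused unchanged when transform() is applied to new samples.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical training data and one unseen sample.
train = np.array([[11.0, 2.0], [3.0, 7.0], [0.0, 10.0], [11.0, 8.0]])
test = np.array([[5.0, 5.0]])

scaler = StandardScaler()

# fit_transform() learns the statistics from train and scales it in one call.
train_scaled = scaler.fit_transform(train)

# transform() reuses the statistics learned from train; it does not refit.
test_scaled = scaler.transform(test)

# Calling fit() and then transform() separately gives the same result.
two_step = StandardScaler().fit(train).transform(train)
print(np.allclose(train_scaled, two_step))
print(test_scaled)
```

Scaling the test sample with the training statistics keeps both sets on the same scale, which is why transform() rather than fit_transform() is applied to new data.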
