How to Load Boston Dataset in Sklearn

On scikit-learn versions earlier than 1.2, you can load the Boston Housing dataset with the load_boston function from sklearn.datasets. Note, however, that load_boston() was deprecated in version 1.0 and removed in version 1.2 because of ethical concerns about the dataset, so importing it on a current release raises an ImportError. The recommended approach is to use an alternative such as the California housing dataset, or to download the data from a trusted source if you still need the Boston dataset specifically for educational purposes.
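If you do need the original data on a recent scikit-learn release, one option is to parse the raw file yourself. The sketch below is an assumption-laden illustration, not an official API: parse_boston_raw is a hypothetical helper name, and the URL in the comment and the skiprows=22 default are taken from the workaround scikit-learn suggested when it deprecated load_boston, so verify both before relying on them. In the raw file each record spans two lines (11 values, then 3), which is why the even and odd rows are stitched together. To stay runnable offline, the example parses a two-record in-memory sample instead of downloading anything.

```python
import io

import numpy as np
import pandas as pd

FEATURE_NAMES = [
    "CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
    "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT",
]


def parse_boston_raw(source, skiprows=22):
    """Parse the two-lines-per-record Boston file into a DataFrame.

    `source` may be a path, URL, or file-like object. Hypothetical helper;
    skiprows=22 assumes the descriptive header of the original file.
    """
    raw = pd.read_csv(source, sep=r"\s+", skiprows=skiprows, header=None)
    values = raw.values
    # Even rows hold the first 11 features; odd rows hold B, LSTAT, MEDV.
    data = np.hstack([values[::2, :], values[1::2, :2]])
    target = values[1::2, 2]
    df = pd.DataFrame(data, columns=FEATURE_NAMES)
    df["MEDV"] = target
    return df


# In-memory sample in the raw layout (first two records of the dataset),
# so no header lines need to be skipped.
sample = io.StringIO(
    " 0.00632  18.00   2.310  0  0.5380  6.5750  65.20  4.0900   1  296.0  15.30\n"
    "  396.90   4.98  24.00\n"
    " 0.02731   0.00   7.070  0  0.4690  6.4210  78.90  4.9671   2  242.0  17.80\n"
    "  396.90   9.14  21.60\n"
)
sample_df = parse_boston_raw(sample, skiprows=0)
print(sample_df)

# For the full file (assumed location, requires network access):
# full_df = parse_boston_raw("http://lib.stat.cmu.edu/datasets/boston")
```

Returning a single DataFrame with the target in a MEDV column is a design choice made here for convenience; split it into features and target before modelling.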

Syntax of Boston Dataset in Sklearn

Syntax: sklearn.datasets.load_boston()

The following code loads the dataset and converts it to a pandas DataFrame.

Python

import pandas as pd
# Note: load_boston exists only in scikit-learn < 1.2
from sklearn.datasets import load_boston

# Load the dataset
boston = load_boston()
df = pd.DataFrame(boston.data, columns=boston.feature_names)

# Display the first five rows of the DataFrame
print(df.head())

Output:

      CRIM    ZN  INDUS  CHAS    NOX     RM   AGE     DIS  RAD    TAX  PTRATIO       B  LSTAT
0  0.00632  18.0   2.31   0.0  0.538  6.575  65.2  4.0900  1.0  296.0     15.3  396.90   4.98
1  0.02731   0.0   7.07   0.0  0.469  6.421  78.9  4.9671  2.0  242.0     17.8  396.90   9.14
2  0.02729   0.0   7.07   0.0  0.469  7.185  61.1  4.9671  2.0  242.0     17.8  392.83   4.03
3  0.03237   0.0   2.18   0.0  0.458  6.998  45.8  6.0622  3.0  222.0     18.7  394.63   2.94
4  0.06905   0.0   2.18   0.0  0.458  7.147  54.2  6.0622  3.0  222.0     18.7  396.90   5.33

Boston Dataset in Sklearn

This article shows how to use the Boston Housing dataset with scikit-learn.

The Boston Housing dataset, one of the most widely recognized datasets in machine learning, was derived from data on the Boston Standard Metropolitan Statistical Area (SMSA) in the 1970s. It contains 506 samples and is commonly used in regression analysis to predict the median value of homes (MEDV) in the Boston area from 13 predictor variables, such as the per-capita crime rate (CRIM) and the average number of rooms per dwelling (RM).
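To make the regression use case concrete, here is a minimal sketch of the usual workflow: split the data, fit a linear model, and score it on held-out samples. It deliberately uses synthetic data rather than the Boston dataset, so it runs on any scikit-learn version. The three predictors, the coefficients in true_coef, and the noise level are all made up for illustration and say nothing about real housing prices.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for a housing table: 200 samples, 3 made-up predictors.
n_samples = 200
X = rng.normal(size=(n_samples, 3))
true_coef = np.array([3.0, -2.0, 0.5])  # illustrative values only
y = X @ true_coef + 22.0 + rng.normal(scale=0.1, size=n_samples)

# Hold out 25% of the samples to evaluate the fitted model.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
r2 = model.score(X_test, y_test)  # coefficient of determination on test data

print("coefficients:", model.coef_)
print("intercept:", model.intercept_)
print("test R^2:", r2)
```

With a real dataset, X would be the feature columns and y the median home value column; the rest of the workflow stays the same.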