Exploratory Data Analysis
Exploratory data analysis (EDA) means examining a dataset closely to discover patterns and spot anomalies. Before drawing any inferences from the data, it is essential to examine every variable.
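One way to examine every variable at a glance is to tabulate each column's type, missing-value count and number of distinct values. The sketch below is only an illustration: the `toy` DataFrame, its values, the column names `LotArea` and `SalePrice`, and the helper `summarize_variables` are assumptions made up for this example, not part of the original dataset code.

```python
import pandas as pd

# Toy stand-in for the housing data (hypothetical values and column names)
toy = pd.DataFrame({
    "LotArea": [8450, 9600, 11250, None],
    "Exterior1st": ["VinylSd", "MetalSd", "VinylSd", "Wd Sdng"],
    "SalePrice": [208500, 181500, 223500, 140000],
})

def summarize_variables(df):
    """Return one row per column: dtype, missing count, distinct non-null values."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isnull().sum(),
        "unique": df.nunique(),
    })

summary = summarize_variables(toy)
print(summary)
```

Running the same helper on the real `dataset` would give a quick checklist of which columns need cleaning before any plotting.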
Let's make a correlation heatmap with the seaborn library to see how the numeric features relate to one another.
Python3
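# Correlation is only defined for numeric columns, so select those first
numerical_dataset = dataset.select_dtypes(include=['number'])
plt.figure(figsize=(12, 6))
sns.heatmap(numerical_dataset.corr(), cmap='BrBG', fmt='.2f', linewidths=2, annot=True)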
Output:
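A heatmap can be hard to read once there are many columns, so it may help to list the correlations with a single column in ranked order. This is a minimal sketch under stated assumptions: the `toy` data is invented, `SalePrice` is assumed to be the target column, and `correlations_with` is a hypothetical helper written for this example.

```python
import pandas as pd

# Invented sample rows; "SalePrice" is assumed to be the target column
toy = pd.DataFrame({
    "LotArea": [8450, 9600, 11250, 9550, 14260],
    "YearBuilt": [2003, 1976, 2001, 1915, 2000],
    "Exterior1st": ["VinylSd", "MetalSd", "VinylSd", "Wd Sdng", "VinylSd"],
    "SalePrice": [208500, 181500, 223500, 140000, 250000],
})

def correlations_with(df, target):
    """Correlation of every numeric column with `target`, strongest first."""
    numeric = df.select_dtypes(include=["number"])
    corr = numeric.corr()[target].drop(target)
    # Order by absolute value so strong negative correlations rank high too
    order = corr.abs().sort_values(ascending=False).index
    return corr.reindex(order)

ranked = correlations_with(toy, "SalePrice")
print(ranked)
```

Sorting by absolute value is a design choice: a feature that moves strongly against the target is as informative as one that moves with it.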
To analyze the categorical features, let's draw a bar plot of the number of unique values in each one.
Python3
unique_values = []
for col in object_cols:
    unique_values.append(dataset[col].unique().size)
plt.figure(figsize=(10, 6))
plt.title('No. Unique values of Categorical Features')
plt.xticks(rotation=90)
sns.barplot(x=object_cols, y=unique_values)
Output:
The plot shows that Exterior1st has around 16 unique categories, while the other features have around 6 each. To find out the actual count of each category, we can plot a separate bar graph for each of the four features.
Python3
plt.figure(figsize=(18, 36))
# suptitle labels the whole figure rather than a single subplot
plt.suptitle('Categorical Features: Distribution')
index = 1
for col in object_cols:
    y = dataset[col].value_counts()
    plt.subplot(11, 4, index)
    plt.xticks(rotation=90)
    sns.barplot(x=list(y.index), y=y)
    index += 1
Output:
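The same per-category counts can also be read as a table, which is handy when bars for rare categories are too short to compare by eye. The sketch below assumes an invented `toy` DataFrame and a hypothetical helper named `category_counts`; only the column name `Exterior1st` comes from the text above.

```python
import pandas as pd

# Invented sample values for one categorical column
toy = pd.DataFrame({
    "Exterior1st": ["VinylSd", "MetalSd", "VinylSd", "Wd Sdng", "VinylSd", "MetalSd"],
})

def category_counts(df, col):
    """Count of each category in `col` and its share of all non-null rows."""
    counts = df[col].value_counts()  # sorted from most to least frequent
    return pd.DataFrame({"count": counts, "share": counts / counts.sum()})

table = category_counts(toy, "Exterior1st")
print(table)
```

Calling `category_counts(dataset, col)` for each `col` in `object_cols` would give the numbers behind each subplot.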
House Price Prediction using Machine Learning in Python
Most of us have, at some point, had to look for a new house to buy. The search usually involves guarding against fraud, negotiating deals, researching local areas and so on.