Visualizing Categorical Data in Python Pandas

Several kinds of plots can be used to visualize categorical data and gain insight into it. Let us start by visualizing the number of people belonging to each blood type, using the seaborn library.

Python3

import seaborn as sns
# Draw one bar per blood type; bar height is the number of records
sns.countplot(x='blood_type',
              data=without_bogus_records)


Output:

Countplot for blood_type category
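
The bar heights in a count plot are simply per-category frequencies, so it can help to check the numbers directly. The following is a minimal sketch, assuming a small made-up DataFrame; the name `sample` and its values are illustrative and are not the article's dataset.

```python
import pandas as pd

# Hypothetical sample data; the article's without_bogus_records is not reproduced here
sample = pd.DataFrame({
    'blood_type': ['A+', 'O-', 'A+', 'B+', 'O-', 'A+']
})

# value_counts() returns the per-category frequencies that a count plot draws as bars
blood_type_counts = sample['blood_type'].value_counts()
print(blood_type_counts)
```

Comparing these counts against the plot is a quick way to confirm that no category is missing or unexpectedly large.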

Furthermore, we can use a boxplot to see how income is distributed across each marital status.

Python3

# One box per marital status, showing the spread of income in that group
sns.boxplot(x='marriage_status',
            y='income',
            data=inconsistent_data)


Output:

Boxplot for marriage_status with income
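
A boxplot summarizes each group by its median and quartiles, and the same summary can be computed numerically. The following is a minimal sketch, assuming a made-up DataFrame: the column names follow the article, but `sample` and its values are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; column names follow the article, values are made up
sample = pd.DataFrame({
    'marriage_status': ['married', 'married', 'married',
                        'unmarried', 'unmarried', 'unmarried'],
    'income': [40000, 50000, 60000, 30000, 35000, 40000],
})

# describe() reports count, mean, std, min, quartiles and max of income per group
income_summary = sample.groupby('marriage_status')['income'].describe()
print(income_summary)
```

The `25%`, `50%` and `75%` columns correspond to the bottom edge, middle line and top edge of each box.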

Handling Categorical Data in Python

Categorical data is data in which each observation falls into one of a predefined set of categories or groups. It can be found everywhere: survey responses such as marital status, profession, and educational qualifications are typical examples. However, categorical data can pose certain problems that must be dealt with before proceeding with any other task. This article looks at some of these problems and at various methods to handle categorical data in a DataFrame.

As mentioned earlier, categorical data can take only a finite set of values. However, due to human error while filling out a survey form, or for some other reason, bogus values may appear in the dataset.
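
One way to spot such bogus values is to compare the values present in a column against the set of valid categories. The following is a minimal sketch under stated assumptions: the DataFrame `survey`, the set `valid_blood_types`, and the entry `'Z+'` are all hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical survey data; 'Z+' stands in for a bogus entry
survey = pd.DataFrame({
    'name': ['Ann', 'Bob', 'Cid', 'Dee'],
    'blood_type': ['A+', 'O-', 'Z+', 'B+'],
})

# Assumed set of valid categories for this column
valid_blood_types = {'A+', 'A-', 'B+', 'B-', 'AB+', 'AB-', 'O+', 'O-'}

# Values present in the data but not among the valid categories
bogus_values = set(survey['blood_type']) - valid_blood_types

# Keep only the rows whose blood_type is a valid category
clean_survey = survey[~survey['blood_type'].isin(bogus_values)]
print(bogus_values)
print(clean_survey)
```

Whether to drop such rows, as done here, or to correct them depends on the dataset and on how many records are affected.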
