Dataframe Slicing and Observation


A. Observation

The head() method returns the first 5 rows by default. The example DataFrame has only 4 rows, so all of them are printed.

Python3

# Print first 5 rows
print(df.head())

Output:

   FRUITS  QUANTITY  PRICE
0   Mango        40     80
1   Apple        20    100
2  Banana        25     50
3  Orange        10     70
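
The examples in this section use a DataFrame named df that is not defined here. A minimal sketch of how it could be built, assuming the values shown in the printed output above (the construction itself is an assumption, not taken from the article):

```python
import pandas as pd

# Assumed reconstruction of the example data from the printed output.
df = pd.DataFrame({
    "FRUITS": ["Mango", "Apple", "Banana", "Orange"],
    "QUANTITY": [40, 20, 25, 10],
    "PRICE": [80, 100, 50, 70],
})

# head(n) returns the first n rows; head() with no argument defaults to n=5.
first_two = df.head(2)
print(first_two)
```

Passing an explicit n is useful when the default of 5 rows is more (or less) than you want to inspect.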

The tail() method returns the last 5 rows by default.

Python3

# Print Last 5 rows
print(df.tail())

Output:

   FRUITS  QUANTITY  PRICE
0   Mango        40     80
1   Apple        20    100
2  Banana        25     50
3  Orange        10     70

The sample(n) method returns n randomly selected rows, so the output can differ from run to run.

Python3

# Randomly select n rows
print(df.sample(3))

Output:

   FRUITS  QUANTITY  PRICE
2  Banana        25     50
0   Mango        40     80
1   Apple        20    100
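
Because sample() is random, it can help to fix the seed when results need to be repeatable. A small sketch, assuming the same example data as above; the seed value 0 is arbitrary:

```python
import pandas as pd

# Assumed example data, matching the printed output in this section.
df = pd.DataFrame({
    "FRUITS": ["Mango", "Apple", "Banana", "Orange"],
    "QUANTITY": [40, 20, 25, 10],
    "PRICE": [80, 100, 50, 70],
})

# The same random_state yields the same rows on every call.
first_draw = df.sample(n=3, random_state=0)
second_draw = df.sample(n=3, random_state=0)
print(first_draw.equals(second_draw))
```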

Python3

# Select the 2 rows with the highest QUANTITY
print(df.nlargest(2, 'QUANTITY'))

Output:

   FRUITS  QUANTITY  PRICE
0   Mango        40     80
2  Banana        25     50

Python3

# Select the 2 rows with the lowest QUANTITY
print(df.nsmallest(2, 'QUANTITY'))

Output:

   FRUITS  QUANTITY  PRICE
3  Orange        10     70
1   Apple        20    100
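
One way to read nlargest() and nsmallest() is as a sort followed by head(). The sketch below, on the assumed example data, compares the two forms; they agree here because QUANTITY has no tied values:

```python
import pandas as pd

# Assumed example data, matching the printed output in this section.
df = pd.DataFrame({
    "FRUITS": ["Mango", "Apple", "Banana", "Orange"],
    "QUANTITY": [40, 20, 25, 10],
    "PRICE": [80, 100, 50, 70],
})

top2 = df.nlargest(2, "QUANTITY")
top2_sorted = df.sort_values("QUANTITY", ascending=False).head(2)

bottom2 = df.nsmallest(2, "QUANTITY")
bottom2_sorted = df.sort_values("QUANTITY").head(2)

print(top2.equals(top2_sorted), bottom2.equals(bottom2_sorted))
```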

Python3

# Select the rows where PRICE > 50
print(df[df.PRICE > 50])

Output:

   FRUITS  QUANTITY  PRICE
0   Mango        40     80
1   Apple        20    100
3  Orange        10     70
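
Boolean masks like the one above can be combined. A short sketch on the assumed example data: use & for "and", | for "or", and wrap each comparison in parentheses, because these operators bind more tightly than the comparisons.

```python
import pandas as pd

# Assumed example data, matching the printed output in this section.
df = pd.DataFrame({
    "FRUITS": ["Mango", "Apple", "Banana", "Orange"],
    "QUANTITY": [40, 20, 25, 10],
    "PRICE": [80, 100, 50, 70],
})

# Rows where both conditions hold.
pricey_and_scarce = df[(df["PRICE"] > 50) & (df["QUANTITY"] < 30)]

# Rows where at least one condition holds.
cheap_or_plentiful = df[(df["PRICE"] < 60) | (df["QUANTITY"] >= 40)]

print(pricey_and_scarce)
print(cheap_or_plentiful)
```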

B. Select Column data

Python3

# Select the FRUITS name
print(df['FRUITS'])

Output:

0     Mango
1     Apple
2    Banana
3    Orange
Name: FRUITS, dtype: object

Python3

# Select the FRUITS name and
# their corresponding PRICE
print(df[['FRUITS', 'PRICE']])

Output:

   FRUITS  PRICE
0   Mango     80
1   Apple    100
2  Banana     50
3  Orange     70

Python3

# Select the columns whose names match 
# the regular expression
print(df.filter(regex='F|Q'))

Output:

   FRUITS  QUANTITY
0   Mango        40
1   Apple        20
2  Banana        25
3  Orange        10
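
The regex in filter() is matched anywhere in the column name, so 'F|Q' selects any name containing F or Q. A small sketch on the assumed example data showing an anchored pattern and the like= parameter, which does plain substring matching:

```python
import pandas as pd

# Assumed example data, matching the printed output in this section.
df = pd.DataFrame({
    "FRUITS": ["Mango", "Apple", "Banana", "Orange"],
    "QUANTITY": [40, 20, 25, 10],
    "PRICE": [80, 100, 50, 70],
})

# Column names that START with F or Q.
starts_with_f_or_q = df.filter(regex="^[FQ]")

# Column names that contain the substring "RI" (only PRICE here).
contains_ri = df.filter(like="RI")

print(list(starts_with_f_or_q.columns))
print(list(contains_ri.columns))
```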

C. Subsets of rows or columns

Python3

# Select all the columns from FRUITS through PRICE (both ends included)
print(df.loc[:, 'FRUITS':'PRICE'])

Output:

   FRUITS  QUANTITY  PRICE
0   Mango        40     80
1   Apple        20    100
2  Banana        25     50
3  Orange        10     70

Python3

# Select FRUITS and PRICE for the rows where PRICE < 70
print(df.loc[df['PRICE'] < 70,
             ['FRUITS', 'PRICE']])

Output:

   FRUITS  PRICE
2  Banana     50

Python3

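# Select the rows at positions 2 to 4 (the end position 5 is excluded);
# the DataFrame has only 4 rows, so rows 2 and 3 are returned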
print(df.iloc[2:5])

Output:

   FRUITS  QUANTITY  PRICE
2  Banana        25     50
3  Orange        10     70

Python3

# Select the columns at the 0th and 2nd positions
print(df.iloc[:, [0, 2]])

Output:

   FRUITS  PRICE
0   Mango     80
1   Apple    100
2  Banana     50
3  Orange     70
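
The main difference between loc and iloc slicing is easy to miss: loc works on labels and includes the end label, while iloc works on positions and excludes the end position. A sketch on the assumed example data:

```python
import pandas as pd

# Assumed example data, matching the printed output in this section.
df = pd.DataFrame({
    "FRUITS": ["Mango", "Apple", "Banana", "Orange"],
    "QUANTITY": [40, 20, 25, 10],
    "PRICE": [80, 100, 50, 70],
})

by_label = df.loc[1:2]       # labels 1 and 2: the end label is included
by_position = df.iloc[1:2]   # position 1 only: the end position is excluded
past_end = df.iloc[2:5]      # a slice past the last row is truncated, not an error

print(len(by_label), len(by_position), len(past_end))
```

With the default integer index the labels and positions coincide, which is why the two accessors are easy to confuse.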

For more details, refer to the article Indexing and Selecting data.

Dataframe

   FRUITS  QUANTITY  PRICE
0   Mango        40     80
1   Apple        20    100
2  Banana        25     50
3  Orange        10     70

Python3

# Select the single PRICE value in the row labeled 1 (the 2nd row)
print(df.at[1, 'PRICE'])

Output:

100

Python3

# Select a single value by position: row 1, column 2 (PRICE)
print(df.iat[1, 2])

Output:

100
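
at and iat are the scalar counterparts of loc and iloc, and they can also assign a single value. A sketch on the assumed example data; the new price 55 is an arbitrary illustrative value:

```python
import pandas as pd

# Assumed example data, matching the table above.
df = pd.DataFrame({
    "FRUITS": ["Mango", "Apple", "Banana", "Orange"],
    "QUANTITY": [40, 20, 25, 10],
    "PRICE": [80, 100, 50, 70],
})

price_by_label = df.at[1, "PRICE"]   # row label 1, column label PRICE
price_by_position = df.iat[1, 2]     # row position 1, column position 2

# Assign through at on a copy so the original df is left unchanged.
updated = df.copy()
updated.at[2, "PRICE"] = 55

print(price_by_label, price_by_position, updated.at[2, "PRICE"])
```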

Filter

Filter by column name

Python3

print(df.filter(items=['FRUITS', 'PRICE']))

Output:

   FRUITS  PRICE
0   Mango     80
1   Apple    100
2  Banana     50
3  Orange     70

Filter by row index

Python3

# Filter by row index
print(df.filter(items=[3], axis=0))

Output:

   FRUITS  QUANTITY  PRICE
3  Orange        10     70

Where

Python3

# Keep the values where the condition is True and replace the rest with NaN
print(df['PRICE'].where(df['PRICE'] > 50))

Output:

0     80.0
1    100.0
2      NaN
3     70.0
Name: PRICE, dtype: float64
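
where() keeps the original shape and replaces non-matching values, by default with NaN. A sketch on the assumed example data showing the other= argument and mask(), which applies the opposite rule; the replacement value 0 is arbitrary:

```python
import pandas as pd

# Assumed example data, matching the printed output in this section.
df = pd.DataFrame({
    "FRUITS": ["Mango", "Apple", "Banana", "Orange"],
    "QUANTITY": [40, 20, 25, 10],
    "PRICE": [80, 100, 50, 70],
})

price = df["PRICE"]

# Replace values that fail the condition with 0 instead of NaN.
kept = price.where(price > 50, other=0)

# mask() is the inverse: values that MEET the condition become NaN.
hidden = price.mask(price > 50)

print(kept.tolist())
print(hidden.isna().tolist())
```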

Query

The query() method filters rows using a condition written as a string and returns the matching rows as a new DataFrame.

Python3

# QUERY
print(df.query('PRICE>70'))

Output:

  FRUITS  QUANTITY  PRICE
0  Mango        40     80
1  Apple        20    100

Python3

# Price >50 & QUANTITY <30
print(df.query('PRICE>50 and QUANTITY<30'))

Output:

   FRUITS  QUANTITY  PRICE
1   Apple        20    100
3  Orange        10     70

Python3

# FRUITS names that start with 'M'
# (engine='python' ensures string methods are evaluated)
print(df.query("FRUITS.str.startswith('M')", engine='python'))

Output:

  FRUITS  QUANTITY  PRICE
0  Mango        40     80
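
Query strings can also refer to Python variables by prefixing them with @, which avoids building the string by hand. A sketch on the assumed example data; min_price and max_quantity are hypothetical names introduced for illustration:

```python
import pandas as pd

# Assumed example data, matching the printed output in this section.
df = pd.DataFrame({
    "FRUITS": ["Mango", "Apple", "Banana", "Orange"],
    "QUANTITY": [40, 20, 25, 10],
    "PRICE": [80, 100, 50, 70],
})

# Hypothetical thresholds, referenced inside the query string with @.
min_price = 70
max_quantity = 30

result = df.query("PRICE >= @min_price and QUANTITY < @max_quantity")
print(result)
```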

Pandas Cheat Sheet for Data Science in Python

Pandas is a powerful and versatile library that allows you to work with data in Python. It offers a range of features and functions that make data analysis fast, easy, and efficient. Whether you are a data scientist, analyst, or engineer, Pandas can help you handle large datasets, perform complex operations, and visualize your results.

This Pandas Cheat Sheet is designed to help you master the basics of Pandas and boost your data skills. It covers the most common and useful commands and methods that you need to know when working with data in Python. You will learn how to create, manipulate, and explore data frames, how to apply various functions and calculations, how to deal with missing values and duplicates, how to merge and reshape data, and much more.

If you are new to Data Science using Python and Pandas, or if you want to refresh your memory, this cheat sheet is a handy reference that you can use anytime. It will save you time and effort by providing you with clear and concise examples of how to use Pandas effectively.
