Create DataFrame with DateTime Index
To create a DataFrame with a DateTime index, we first create a range of timestamps with pd.date_range and then pass it to the pandas.DataFrame constructor.
Python3
import numpy as np
import pandas as pd

# One timestamp per minute from 1 Jan 2019 to 8 Jan 2019
range_date = pd.date_range(start='1/1/2019', end='1/08/2019', freq='min')

# Store the timestamps in a column named 'date'
df = pd.DataFrame(range_date, columns=['date'])

# Add one random integer in [0, 100) for each timestamp
df['data'] = np.random.randint(0, 100, size=len(range_date))

print(df.head(10))
                 date  data
0 2019-01-01 00:00:00    49
1 2019-01-01 00:01:00    58
2 2019-01-01 00:02:00    48
3 2019-01-01 00:03:00    96
4 2019-01-01 00:04:00    42
5 2019-01-01 00:05:00     8
6 2019-01-01 00:06:00    20
7 2019-01-01 00:07:00    96
8 2019-01-01 00:08:00    48
9 2019-01-01 00:09:00    78
Explanation:
We first create a range of timestamps and convert it into a DataFrame with a single 'date' column. We then add a 'data' column of random integers generated with np.random.randint, one value per timestamp, and print the first 10 rows to check the result. Because the values are random, the numbers will differ on every run.
To do time series manipulation, the DataFrame needs a DateTime index, that is, it must be indexed on the timestamp. Here, the timestamps are held in the 'date' column, which is a regular column of the Pandas DataFrame and not yet its index.
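One possible way to get a DateTime index is to promote the timestamp column with set_index. The sketch below is a minimal illustration, assuming a small frame shaped like the one above; the variable name indexed and the five-row sample are chosen here only for the example.

```python
import numpy as np
import pandas as pd

# Small frame shaped like the one above: one row per minute (sample data).
range_date = pd.date_range(start="2019-01-01", periods=5, freq="min")
df = pd.DataFrame({"date": range_date, "data": np.arange(5)})

# Promote the 'date' column to the index; set_index returns a new DataFrame
# and leaves df unchanged.
indexed = df.set_index("date")

print(type(indexed.index).__name__)  # DatetimeIndex
print(indexed.head())
```

After this step the timestamps are no longer a column, so indexed has only the 'data' column, and time-based selection works directly on the index.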
Basics of Time Series Manipulation Using Pandas
Although Scikit-learn also offers some support for time series, data science professionals commonly use the Pandas library because it provides more features for working with DateTime data. We can attach a date and time to every record and retrieve records from the DataFrame by timestamp.
We can select the data within a certain range of dates and times by using the DateTime functionality of the Pandas library, such as a DataFrame indexed on timestamps.
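As a rough sketch of selecting by date and time, the example below builds a minute-level frame with a DateTime index and selects rows with .loc. The frame ts and its sequential 'data' values are made up for the illustration.

```python
import numpy as np
import pandas as pd

# Minute-level frame indexed on timestamps (sequential sample values).
idx = pd.date_range(start="2019-01-01", end="2019-01-08", freq="min")
ts = pd.DataFrame({"data": np.arange(len(idx))}, index=idx)

# Partial-string selection: every row that falls on 2 January 2019.
one_day = ts.loc["2019-01-02"]

# Label-based slice between two timestamps; both endpoints are included.
window = ts.loc["2019-01-03 09:00":"2019-01-03 09:05"]

print(len(one_day))  # 1440, one row per minute of the day
print(window)
```

Unlike positional slicing, a label slice on a DateTime index includes the end timestamp, so window holds six rows here.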
Let’s discuss some major objectives of time series analysis using Pandas library.
Objectives of Time Series Analysis
- Create a series of dates
- Work with timestamp data
- Convert string data to timestamps
- Slice data using timestamps
- Resample a time series to different time periods for aggregates/summary statistics
- Work with missing data
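Several of these objectives can be sketched together in a few lines. The example below is only an illustration under assumed inputs: the column names when and value and the four sample readings are hypothetical, and forward filling is just one of several ways to handle missing values.

```python
import pandas as pd

# Hypothetical readings whose timestamps arrive as strings.
raw = pd.DataFrame({
    "when": ["2019-01-01 00:00", "2019-01-01 00:20",
             "2019-01-01 01:10", "2019-01-01 03:30"],
    "value": [10.0, 20.0, 30.0, 40.0],
})

# Convert string data to timestamps and index the frame on them.
raw["when"] = pd.to_datetime(raw["when"])
ts = raw.set_index("when")

# Resample to hourly means; the 02:00 hour has no readings, so it is NaN.
hourly = ts.resample("h").mean()

# Handle the missing hour by carrying the last known value forward.
filled = hourly.ffill()

print(hourly)
print(filled)
```

Resampling to a coarser period always needs an aggregation such as mean, and periods without data show up as NaN, which is why the missing-data step follows it here.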
Now, let’s do some practical analysis of some data to demonstrate the use of Pandas’ time series.