Slicing and indexing time series data
This example imports a CSV file and converts a column of date strings to datetime values with the pd.to_datetime() method. That column is then set as the index, which lets us select rows by date. data.loc['2020-01-22'][:10] selects the rows for the day '2020-01-22', and the result is then sliced to return the first 10 observations for that day.
Python3
# importing pandas
import pandas as pd

# reading csv file
data = pd.read_csv('covid_data.csv')

# converting string data to datetime
data['ObservationDate'] = pd.to_datetime(data['ObservationDate'])

# setting index
data = data.set_index('ObservationDate')
print(data.head())

# indexing and slicing through the dataframe
print(data.loc['2020-01-22'][:10])
Output:
                 Unnamed: 0  Province/State  ...  Deaths  Recovered
ObservationDate                              ...
2020-01-22                0           Anhui  ...     0.0        0.0
2020-01-22                1         Beijing  ...     0.0        0.0
2020-01-22                2       Chongqing  ...     0.0        0.0
2020-01-22                3          Fujian  ...     0.0        0.0
2020-01-22                4           Gansu  ...     0.0        0.0

[5 rows x 7 columns]
                 Unnamed: 0  Province/State  ...  Deaths  Recovered
ObservationDate                              ...
2020-01-22                0           Anhui  ...     0.0        0.0
2020-01-22                1         Beijing  ...     0.0        0.0
2020-01-22                2       Chongqing  ...     0.0        0.0
2020-01-22                3          Fujian  ...     0.0        0.0
2020-01-22                4           Gansu  ...     0.0        0.0
2020-01-22                5       Guangdong  ...     0.0        0.0
2020-01-22                6         Guangxi  ...     0.0        0.0
2020-01-22                7         Guizhou  ...     0.0        0.0
2020-01-22                8          Hainan  ...     0.0        0.0
2020-01-22                9           Hebei  ...     0.0        0.0

[10 rows x 7 columns]
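The same steps can be tried without the CSV file. The following is a minimal sketch that assumes a small in-memory table standing in for covid_data.csv; the rows, the Confirmed values, and the month/day/year date strings are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical stand-in for covid_data.csv: two days, three regions each.
raw = pd.DataFrame({
    "ObservationDate": ["01/22/2020", "01/22/2020", "01/22/2020",
                        "01/23/2020", "01/23/2020", "01/23/2020"],
    "Province/State": ["Anhui", "Beijing", "Chongqing"] * 2,
    "Confirmed": [1.0, 14.0, 6.0, 9.0, 22.0, 9.0],
})

# Convert the date strings to datetime values (the format is an assumption
# about how this made-up data is written).
raw["ObservationDate"] = pd.to_datetime(raw["ObservationDate"], format="%m/%d/%Y")

# Use the datetime column as the index so rows can be selected by date.
data = raw.set_index("ObservationDate")

# All rows recorded on 2020-01-22.
one_day = data.loc["2020-01-22"]

# The first two of those rows, mirroring the [:10] slice in the example above.
first_two = data.loc["2020-01-22"][:2]

print(one_day)
print(first_two)
```

Because several rows share the same date, data.loc['2020-01-22'] returns a DataFrame rather than a single row, which is why it can be sliced again with [:2].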
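In this example, we slice the data from '2020-01-22' to '2020-02-22'. Label-based slicing on a datetime index includes both endpoints, so the rows for '2020-02-22' are part of the result.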
Python3
# importing pandas
import pandas as pd

# reading csv file
data = pd.read_csv('covid_data.csv')

# converting string data to datetime
data['ObservationDate'] = pd.to_datetime(data['ObservationDate'])

# setting index
data = data.set_index('ObservationDate')

# indexing and slicing through the dataframe
print(data.loc['2020-01-22':'2020-02-22'])
Output:
                 Unnamed: 0   Province/State  ...  Deaths  Recovered
ObservationDate                               ...
2020-01-22                0            Anhui  ...     0.0        0.0
2020-01-22                1          Beijing  ...     0.0        0.0
2020-01-22                2        Chongqing  ...     0.0        0.0
2020-01-22                3           Fujian  ...     0.0        0.0
2020-01-22                4            Gansu  ...     0.0        0.0
...                     ...              ...  ...     ...        ...
2020-02-22             2169  San Antonio, TX  ...     0.0        0.0
2020-02-22             2170      Seattle, WA  ...     0.0        1.0
2020-02-22             2171        Tempe, AZ  ...     0.0        0.0
2020-02-22             2172          Unknown  ...     0.0        0.0
2020-02-22             2173              NaN  ...     0.0        0.0

[2174 rows x 7 columns]
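To see how the date-range slice treats its endpoints, here is a short sketch on a hypothetical daily table (one made-up row per day, not the COVID data). It also shows that a partial date string such as '2020-01' can be passed to .loc to select a whole month.

```python
import pandas as pd

# Hypothetical data: one row per day from 2020-01-20 to 2020-02-25.
idx = pd.date_range("2020-01-20", "2020-02-25", freq="D")
data = pd.DataFrame({"Confirmed": range(len(idx))}, index=idx)
data.index.name = "ObservationDate"

# Slice between two dates; both the start and the end date are included.
window = data.loc["2020-01-22":"2020-02-22"]
print(window.index[0], window.index[-1], len(window))

# A partial string selects every row in that month.
january = data.loc["2020-01"]
print(len(january))
```

Unlike positional slices such as [:10], which exclude the stop position, this label-based slice keeps the row for the end date.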
Manipulating Time Series Data in Python
Time-series data is a collection of observations (activity) for a single subject (entity) recorded at successive points in time. Metrics are typically recorded at equally spaced intervals, while events occur at unequally spaced intervals. With pandas, we can attach a date and time to each record, fetch DataFrame records by timestamp, and select the data that falls within a specific date and time range.
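The difference between equally and unequally spaced series can be sketched as follows. The readings, the event names, and the helper is_evenly_spaced are all hypothetical and introduced only for this illustration; the helper is one possible check, not a pandas built-in.

```python
import pandas as pd

# Metrics: hypothetical readings taken every 15 minutes (equally spaced).
metrics = pd.Series(
    [20.1, 20.4, 20.3, 20.8],
    index=pd.date_range("2020-01-22 00:00", periods=4, freq="15min"),
)

# Events: hypothetical records that arrive whenever something happens.
events = pd.DataFrame({
    "event": ["login", "upload", "logout"],
    "when": ["2020-01-22 09:03:10", "2020-01-22 09:47:55", "2020-01-22 13:20:01"],
})
# Attach a proper date and time to each record and index by it.
events["when"] = pd.to_datetime(events["when"])
events = events.set_index("when")


def is_evenly_spaced(index):
    """Return True if all gaps between consecutive timestamps are equal."""
    gaps = index.to_series().diff().dropna()
    return gaps.nunique() <= 1


print(is_evenly_spaced(metrics.index))
print(is_evenly_spaced(events.index))

# Fetch the records that fall inside a specific date and time range.
morning = events.loc["2020-01-22 09:00":"2020-01-22 12:00"]
print(morning)
```

Here the range selection keeps the two events recorded between 09:00 and 12:00 and leaves out the one at 13:20:01.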