How to use the aggregate() method in R Language
Base R contains a large number of methods to perform operations on the dataframe. The seq() method in R is used to generate regular sequences beginning from a pre-defined value.
Syntax: seq(from, to, by, length.out)
Arguments :
- from – The value at which the sequence begins. Here, as.Date() supplies the start date, and successive dates are generated until the requested length is reached.
- to – The value at which the sequence ends.
- by – The increment of the sequence. “day” is used here so that the sequence advances one calendar day at a time.
- length.out – The total length of the sequence.
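As a minimal sketch of the arguments above, the snippet below builds a short date sequence in two equivalent ways: once by fixing the length and once by fixing the end date. The variable names (start_date, date_seq, date_seq_to) and the five-day window are illustrative choices, not part of the original example.

```r
# Illustrative start date (an assumption made for this sketch)
start_date <- as.Date("2021-05-01")

# Five consecutive days, defined by the length of the sequence
date_seq <- seq(from = start_date, by = "day", length.out = 5)
print(date_seq)

# The same five days, defined by the end date instead of the length
date_seq_to <- seq(from = start_date, to = as.Date("2021-05-05"), by = "day")
print(all(date_seq == date_seq_to))
```

Supplying length.out is convenient when the number of rows is known in advance, which is the situation in the example further down.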
The dataframe is then built with a random sample drawn from this date sequence as column 1 (col1). Column 2 (col2) holds random floating-point values generated with the rnorm() method and rounded to two decimal places.
The strftime() method is then used to convert the date object to a character string. The format argument determines which components of the date are extracted.
Syntax: strftime(x, format)
Arguments :
- x – The date object to be converted
- format – The output format string. Here, %m extracts the month as a two-digit number and %Y extracts the four-digit year.
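The following short sketch shows what these two format codes return for a few hand-picked dates. The vector d and the names year_part and month_part are hypothetical and chosen only for illustration.

```r
# Three illustrative dates (not taken from the article's data)
d <- as.Date(c("2021-05-01", "2021-06-15", "2022-01-03"))

# Extract the four-digit year and the two-digit month as character strings
year_part <- strftime(d, "%Y")
month_part <- strftime(d, "%m")

print(year_part)   # "2021" "2021" "2022"
print(month_part)  # "05" "06" "01"
```

Note that both results are character vectors, not numbers, so they act as grouping labels rather than numeric values.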
To aggregate the data, the aggregate() method is used, which computes a summary statistic for each group.
Syntax: aggregate(formula, data, FUN)
Arguments :
- formula – A formula, such as y ~ x, where y is the column to summarize and x is the grouping column. Several grouping columns can be combined with +.
- data – The dataframe over which to apply the function
- FUN – The function applied to each group. Here, sum is used to add up the values belonging to the same group.
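To make the formula interface concrete, here is a small sketch on a hand-made dataframe. The dataframe sales and its columns group and value are hypothetical and exist only for this illustration.

```r
# A tiny illustrative dataframe with one grouping column and one value column
sales <- data.frame(
  group = c("a", "b", "a", "b", "a"),
  value = c(1, 2, 3, 4, 5)
)

# Sum the values within each group: value is summarized, group defines the groups
totals <- aggregate(value ~ group, data = sales, FUN = sum)
print(totals)
```

Group "a" contains 1, 3 and 5, so its total is 9; group "b" contains 2 and 4, so its total is 6. Swapping FUN for mean or max would summarize the same groups differently.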
Code:
R
# setting the seed so the random values are reproducible
set.seed(99923)

# specifying number of rows
len <- 100

# creating sequence of dates
var_seq <- seq(as.Date("2021/05/01"), by = "day", length.out = len)

# creating columns for dataframe (one random value per row)
data_frame <- data.frame(col1 = sample(var_seq, len, replace = TRUE),
                         col2 = round(rnorm(len, 5, 2), 2))
print("Original dataframe")
head(data_frame)

# creating new year column for dataframe
data_frame$year_col <- strftime(data_frame$col1, "%Y")

# creating new month column for dataframe
data_frame$month_col <- strftime(data_frame$col1, "%m")

# aggregating the daily data by year and month
data_frame_mod <- aggregate(col2 ~ year_col + month_col, data_frame, FUN = sum)
print("Modified dataframe")
head(data_frame_mod)
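As a variation on the code above, and not something the original example does, the year and month can be combined into a single "%Y-%m" key so that the result has one grouping column that sorts chronologically. This is only a sketch: the dataframe daily, its columns, the seed, and the 90-day window are all assumptions made for illustration.

```r
# Arbitrary seed and a hypothetical 90-day series (1 May to 29 July 2021)
set.seed(42)
daily <- data.frame(
  day = seq(as.Date("2021-05-01"), by = "day", length.out = 90),
  value = round(rnorm(90, mean = 5, sd = 2), 2)
)

# One combined key such as "2021-05" instead of separate year and month columns
daily$year_month <- strftime(daily$day, "%Y-%m")

# Sum the daily values within each year-month
monthly <- aggregate(value ~ year_month, data = daily, FUN = sum)
print(monthly)
```

Because the 90 days span May, June and July 2021, the result has three rows, and the monthly totals add up to the total of the daily values.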