How to create, index, and modify a Data Frame in R?
In this article, we will discuss how to create a data frame, index it, and modify it in the R programming language.
Creating a Data Frame:
A Data Frame is a two-dimensional labeled data structure whose fields/columns may have different types. It looks like a table in SQL or an Excel worksheet. In R, use the data.frame() function to create a Data Frame. The syntax is:
data <- data.frame(columnName1 = c(data1, data2, ...), ..., columnNameN = c(data1, data2, ...))
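As a concrete instance of this syntax, the sketch below builds a small data frame whose columns have different types and then inspects it. The data frame people and its columns name, age, and member are hypothetical illustration data, not part of the article's examples; str(), dim(), and sapply() are base R functions.

```r
# hypothetical example data: one column per type
people <- data.frame(name = c("Ann", "Bob", "Cy"),
                     age = c(31, 45, 28),
                     member = c(TRUE, FALSE, TRUE))

# str() prints a compact summary: dimensions, column names, and types
str(people)

# dimensions as c(rows, columns)
dim(people)

# class of each column
col_types <- sapply(people, class)
print(col_types)
```

Note that from R 4.0.0 onward character columns such as name stay character by default; earlier versions converted them to factors unless stringsAsFactors = FALSE was passed.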
Example:
In this example, let's look at how to create a Data Frame in R using the data.frame() function.
R
# create a data frame
stats <- data.frame(player = c('A', 'B', 'C', 'D'),
                    runs = c(100, 200, 408, NA),
                    wickets = c(17, 20, NA, 5))

print("stats Dataframe")
stats
Output
[1] "stats Dataframe"
  player runs wickets
1      A  100      17
2      B  200      20
3      C  408      NA
4      D   NA       5
Indexing the Data Frame:
To access particular data in a Data Frame, use square brackets and specify either the column name or the row numbers and column numbers to fetch. Let's look at the syntax for the different ways of indexing a data frame.
# fetch the data in a particular column
data["columnName"]

# fetch the data in the specified rows and column
data[fromRow:toRow, columnNumber]

# Eg: fetch the first row to the third row of the second column
data[1:3, 2]
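The forms above differ in what they return, which matters when the result is passed on to other functions. The sketch below reuses the stats data frame from this article's examples and compares single brackets, double brackets, the $ operator, and the drop argument; the variable names (player_df, runs_vec, and so on) are illustrative only.

```r
# recreate the stats data frame used throughout the article
stats <- data.frame(player = c("A", "B", "C", "D"),
                    runs = c(100, 200, 408, NA),
                    wickets = c(17, 20, NA, 5))

# single brackets with a column name return a one-column data frame
player_df <- stats["player"]

# double brackets and $ return the column itself as a vector
runs_vec <- stats[["runs"]]
wickets_vec <- stats$wickets

# [rows, column] with a single column drops to a plain vector by default
first_three <- stats[1:3, 2]

# drop = FALSE keeps the result as a data frame
first_three_df <- stats[1:3, 2, drop = FALSE]

# rows can also be selected by a condition; which() skips the NA comparison
high_scores <- stats[which(stats$runs > 150), ]
print(high_scores)
```

This is why stats["player"] prints with a column header and row numbers, while stats[1:3, 2] prints as a bare vector of values.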
Example:
In the code below, we create a data frame and index it by fetching a particular column and then the data in specified rows of a column.
R
# create a data frame
stats <- data.frame(player = c('A', 'B', 'C', 'D'),
                    runs = c(100, 200, 408, NA),
                    wickets = c(17, 20, NA, 5))

print("stats Dataframe")
stats

print("----------")
# fetch data in a certain column
stats["player"]

print("----------")
# fetch certain rows and columns
stats[1:3, 2]
Output
[1] "stats Dataframe"
  player runs wickets
1      A  100      17
2      B  200      20
3      C  408      NA
4      D   NA       5
[1] "----------"
  player
1      A
2      B
3      C
4      D
[1] "----------"
[1] 100 200 408
Modify the Data Frame:
Data Modification in a Data Frame
To modify the data in a data frame, we index the cell to change and assign a new value to it. The syntax is:
data[rowNumber, columnName] <- "newValue"
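The same index-and-assign pattern also works with a logical index, which allows several cells to be changed at once. The sketch below reuses the stats data frame from this article; replacing missing wickets with 0 is an assumption chosen purely for illustration, not a recommendation for handling missing data.

```r
# recreate the stats data frame used throughout the article
stats <- data.frame(player = c("A", "B", "C", "D"),
                    runs = c(100, 200, 408, NA),
                    wickets = c(17, 20, NA, 5))

# overwrite a single cell, addressed by row number and column name
stats[4, "runs"] <- 274

# overwrite every missing value in one column using a logical index
stats$wickets[is.na(stats$wickets)] <- 0

print(stats)
```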
Adding a row to a Data Frame
To add a row to a data frame, use the rbind() function. Here it is called with two arguments: the data frame and the row to insert, given as a list of elements in column order. rbind() returns a new data frame, so assign the result back to keep the change. The syntax of rbind() is given below:
rbind(dataframeName, list(data1, data2, ...))
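An unnamed list relies on the values being in the same order as the columns. As an alternative sketch, the new row can be built as a one-row data frame with matching column names, so each value is tied to its column explicitly. The variable new_row is an illustrative name; stats is the data frame from this article's examples.

```r
# recreate the stats data frame used throughout the article
stats <- data.frame(player = c("A", "B", "C", "D"),
                    runs = c(100, 200, 408, NA),
                    wickets = c(17, 20, NA, 5))

# build the new row as a one-row data frame with matching column names
new_row <- data.frame(player = "E", runs = 500, wickets = 1)

# rbind() returns a new data frame; assign it back to keep the change
stats <- rbind(stats, new_row)

print(stats)
```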
Adding a column to a Data Frame
To add a column to a data frame, use the cbind() function. Here it is called with two arguments: the data frame to extend and the new column's data together with its column name. Like rbind(), cbind() returns a new data frame. Below is the syntax of the cbind() function:
cbind(dataframeName, columnName = c(data1, data2, ...))
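A column can also be added by assigning to a new column name with the $ operator, which is convenient when the new column is computed from existing ones. The sketch below shows both approaches on the stats data frame from this article; the average column is a hypothetical addition for illustration. In both cases the new column should supply one value per row.

```r
# recreate the stats data frame used throughout the article
stats <- data.frame(player = c("A", "B", "C", "D"),
                    runs = c(100, 200, 408, NA),
                    wickets = c(17, 20, NA, 5))

# cbind() returns a new data frame with the extra column appended
stats <- cbind(stats, matches = c(2, 3, 10, 2))

# assigning to a new name with $ adds a column to the existing data frame
# (average is a hypothetical column: runs per match)
stats$average <- stats$runs / stats$matches

print(stats)
```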
Removing a row and column from a Data Frame
To remove a row or a column from a data frame, use the syntax below:
# remove a row from a data frame
# deletes the row with the specified row number
dataframeName <- dataframeName[-rowNumber, ]

# remove a column from a data frame
dataframeName$columnName <- NULL
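Negative indexing extends to several rows at once, and a column can also be dropped by filtering on its name. The sketch below applies both to the stats data frame from this article; without_rows and without_col are illustrative variable names.

```r
# recreate the stats data frame used throughout the article
stats <- data.frame(player = c("A", "B", "C", "D"),
                    runs = c(100, 200, 408, NA),
                    wickets = c(17, 20, NA, 5))

# drop several rows at once with a vector of negative row numbers
without_rows <- stats[-c(1, 3), ]

# drop a column by name: keep every column not called "wickets"
without_col <- stats[, names(stats) != "wickets"]

# the remaining rows keep their original row labels
print(rownames(without_rows))
```

Because the original row labels are kept, a data frame printed after a deletion can show gaps in its row numbers. Assigning rownames(without_rows) <- NULL resets them to a plain sequence if that is preferred.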
Example:
In this example, we create a data frame and then modify a value, insert a row and a column, and delete a row and a column.
R
# create a data frame
stats <- data.frame(player = c('A', 'B', 'C', 'D'),
                    runs = c(100, 200, 408, NA),
                    wickets = c(17, 20, NA, 5))
cat("stats Dataframe\n")
stats

# modify the data
stats[4, "runs"] <- 274
cat("\nModified dataframe\n")
stats

# add a new row
cat("\nDataFrame after a row insertion\n")
stats <- rbind(stats, list('E', 500, 1))
print(stats)

# add a new column
cat("\nDataFrame after a new column insertion\n")
stats <- cbind(stats, matches = c(2, 3, 10, 2, 12))
print(stats)

# delete the second row
stats <- stats[-2, ]

# delete the wickets column
stats$wickets <- NULL
cat("\nDataframe after deletion of a row & column\n")
stats
Output
stats Dataframe
  player runs wickets
1      A  100      17
2      B  200      20
3      C  408      NA
4      D   NA       5

Modified dataframe
  player runs wickets
1      A  100      17
2      B  200      20
3      C  408      NA
4      D  274       5

DataFrame after a row insertion
  player runs wickets
1      A  100      17
2      B  200      20
3      C  408      NA
4      D  274       5
5      E  500       1

DataFrame after a new column insertion
  player runs wickets matches
1      A  100      17       2
2      B  200      20       3
3      C  408      NA      10
4      D  274       5       2
5      E  500       1      12

Dataframe after deletion of a row & column
  player runs matches
1      A  100       2
3      C  408      10
4      D  274       2
5      E  500      12