Data Structures

A data structure is a particular way of organizing data in a computer so that it can be used effectively. 

Vectors:

Vectors in R are similar to arrays in C: they hold multiple data values of the same type. One key difference is that vector indexing in R starts at ‘1’, not ‘0’.

 

 

Example:

R




# R program to illustrate Vector
 
# Numeric Vector
N = c(1, 3, 5, 7, 8)
 
# Character vector
C = c('Geeks', 'For', 'Geeks')
 
# Logical Vector
L = c(TRUE, FALSE, FALSE, TRUE)
 
# Printing vectors
print(N)
print(C)
print(L)


 Output:

[1] 1 3 5 7 8
[1] "Geeks" "For"   "Geeks"
[1]  TRUE FALSE FALSE  TRUE
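
The two points above (a single element type, and indexing from 1) can be sketched as follows. This is a minimal illustration; the variable names `mixed`, `nums`, and `v` are made up for the example.

```r
# Mixing types inside c() coerces every element to one common type.
mixed <- c(1, "a", TRUE)
print(class(mixed))   # "character"
print(mixed)          # "1" "a" "TRUE"

# Logical values become numbers when combined with numbers.
nums <- c(1, TRUE, FALSE)
print(class(nums))    # "numeric"

# Indexing starts at 1; index 0 selects nothing.
v <- c(10, 20, 30)
print(v[1])           # 10
print(length(v[0]))   # 0
```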

Accessing Vector Elements: 

There are several ways to access the elements of a vector. The most common is the subscript operator ‘[ ]’.

Example:

R




# Accessing elements using
# the position number.
X <- c(2, 9, 8, 0, 5)
print('using Subscript operator')
print(X[2])
 
# Accessing specific values by passing
# a vector inside another vector.
Y <- c(6, 2, 7, 4, 0)
print('using c function')
print(Y[c(4, 1)])
 
# Logical indexing
Z <- c(1, 6, 9, 4, 6)
print('Logical indexing')
print(Z[Z>3])


 Output:

[1] "using Subscript operator"
[1] 9
[1] "using c function"
[1] 4 6
[1] "Logical indexing"
[1] 6 9 4 6
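
Beyond the three approaches shown, elements can also be selected by name, by a range of positions, or excluded with a negative index. The sketch below assumes a made-up named vector `scores`.

```r
# A named numeric vector (names are illustrative).
scores <- c(math = 90, physics = 75, chemistry = 82)

# Access by name
print(scores["physics"])

# Access a range of positions
print(scores[2:3])

# A negative index drops that position and keeps the rest
print(scores[-1])
```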


Lists:

A list is a generic object consisting of an ordered collection of objects. Lists are heterogeneous data structures.

Example: 

R




# R program to create a List
 
# The first component is a numeric vector
# containing the employee IDs
empId = c(1, 2, 3, 4)
 
# The second attribute is the employee name
# which is created using this line of code here
# which is the character vector
empName = c("Nisha", "Nikhil", "Akshu", "Sambha")
 
# The third attribute is the number of employees
# which is a single numeric variable.
numberOfEmp = 4
 
# The fourth attribute is the name of organization
# which is a single character variable.
Organization = "GFG"
 
# We can combine these four objects of
# different types into a single list
# containing the details of employees
# using the list() function
empList = list(empId, empName, numberOfEmp, Organization)
 
print(empList)


 Output: 

[[1]]
[1] 1 2 3 4

[[2]]
[1] "Nisha"  "Nikhil" "Akshu"  "Sambha"

[[3]]
[1] 4

[[4]]
[1] "GFG"

Accessing List Elements:

  • Access components by names: All the components of a list can be named, and those names can be used to access the components with the dollar operator “$”.
  • Access components by indices: Components can also be accessed by position. To access a top-level component, use the double square bracket operator “[[ ]]”. To access an element inside that component, follow the double brackets with a single square bracket “[ ]”.

Example: 

R




# R program to access
# components of a list
 
# Creating a list by naming all its components
empId = c(1, 2, 3, 4)
empName = c("Nisha", "Nikhil", "Akshu", "Sambha")
numberOfEmp = 4
empList = list(
"ID" = empId,
"Names" = empName,
"Total Staff" = numberOfEmp
)
print("Initial List")
print(empList)
 
# Accessing components by names
cat("\nAccessing name components using $ command\n")
print(empList$Names)
 
# Accessing top-level and inner components by indices
cat("\nAccessing name components using indices\n")
print(empList[[2]])
print(empList[[1]][2])
print(empList[[2]][4])


 Output:

[1] "Initial List"
$ID
[1] 1 2 3 4

$Names
[1] "Nisha"  "Nikhil" "Akshu"  "Sambha"

$`Total Staff`
[1] 4


Accessing name components using $ command
[1] "Nisha"  "Nikhil" "Akshu"  "Sambha"

Accessing name components using indices
[1] "Nisha"  "Nikhil" "Akshu"  "Sambha"
[1] 2
[1] "Sambha"
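
The difference between single and double brackets on a list is easy to miss, so here is a short sketch. It rebuilds a smaller version of `empList`; the names `sub` and `vec` are illustrative.

```r
empList <- list(
  ID = c(1, 2, 3, 4),
  Names = c("Nisha", "Nikhil", "Akshu", "Sambha")
)

# Single brackets return a sub-list (still a list)
sub <- empList["Names"]
print(class(sub))    # "list"

# Double brackets return the component itself
vec <- empList[["Names"]]
print(class(vec))    # "character"

# Only the extracted component can be indexed element-wise
print(vec[4])        # "Sambha"
```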

Adding and Modifying list elements:

  • A list can be modified by accessing a component and assigning a new value to it.
  • New elements can be added by assigning a value to a new name.

Example:

R




# R program to add and modify
# components of a list
 
# Creating a list by naming all its components
empId = c(1, 2, 3, 4)
empName = c("Nisha", "Nikhil", "Akshu", "Sambha")
numberOfEmp = 4
empList = list(
"ID" = empId,
"Names" = empName,
"Total Staff" = numberOfEmp
)
print("Initial List")
print(empList)
 
# Adding new element
empList[["organization"]] <- "GFG"
cat("\nAfter adding new element\n")
print(empList)
 
# Modifying the top-level component
empList$"Total Staff" = 5
   
# Modifying inner level component
empList[[1]][5] = 7
 
cat("\nAfter modification\n")
print(empList)


 Output: 

[1] "Initial List"
$ID
[1] 1 2 3 4

$Names
[1] "Nisha"  "Nikhil" "Akshu"  "Sambha"

$`Total Staff`
[1] 4


After adding new element
$ID
[1] 1 2 3 4

$Names
[1] "Nisha"  "Nikhil" "Akshu"  "Sambha"

$`Total Staff`
[1] 4

$organization
[1] "GFG"


After modification
$ID
[1] 1 2 3 4 7

$Names
[1] "Nisha"  "Nikhil" "Akshu"  "Sambha"

$`Total Staff`
[1] 5

$organization
[1] "GFG"


Matrices:

A matrix is a rectangular arrangement of numbers in rows and columns. Matrices are two-dimensional, homogeneous data structures.

Example:

R




# R program to illustrate a matrix
 
A = matrix(
    # Taking sequence of elements
    c(1, 4, 5, 6, 3, 8),
 
    # No of rows and columns
    nrow = 2, ncol = 3,
 
    # By default matrices are
    # in column-wise order
    # So this parameter decides
    # how to arrange the matrix
    byrow = TRUE
)
 
print(A)


 Output:

     [,1] [,2] [,3]
[1,]    1    4    5
[2,]    6    3    8
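
To see what the byrow parameter changes, the same values can be arranged both ways. This is a small sketch; `vals`, `by_col`, and `by_row` are illustrative names.

```r
vals <- c(1, 4, 5, 6, 3, 8)

# Default: values fill the matrix column by column
by_col <- matrix(vals, nrow = 2, ncol = 3)
print(by_col)

# byrow = TRUE: values fill the matrix row by row
by_row <- matrix(vals, nrow = 2, ncol = 3, byrow = TRUE)
print(by_row)

# dim() reports the number of rows and columns
print(dim(by_row))
```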

Accessing Matrix Elements:

Matrix elements can be accessed using the matrix name followed by square brackets containing a row index and a column index separated by a comma. The value before the comma selects rows and the value after the comma selects columns; leaving either side empty selects all rows or all columns.

Example:

R




# R program to illustrate
# accessing elements of a matrix
 
# Create a 2x3 matrix
A = matrix(
c(1, 4, 5, 6, 3, 8),
nrow = 2, ncol = 3,
byrow = TRUE       
)
cat("The 2x3 matrix:\n")
print(A)
 
print(A[1, 1]) 
print(A[2, 2])
 
# Accessing first and second row
cat("Accessing first and second row\n")
print(A[1:2, ])
 
# Accessing first and second column
cat("\nAccessing first and second column\n")
print(A[, 1:2])


 Output:

The 2x3 matrix:
     [,1] [,2] [,3]
[1,]    1    4    5
[2,]    6    3    8
[1] 1
[1] 3
Accessing first and second row
     [,1] [,2] [,3]
[1,]    1    4    5
[2,]    6    3    8

Accessing first and second column
     [,1] [,2]
[1,]    1    4
[2,]    6    3
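
One detail worth knowing when selecting a single row or column: by default R drops the result to a plain vector. The sketch below shows the drop argument of the subscript operator, which keeps the matrix shape; `row_vec` and `row_mat` are illustrative names.

```r
A <- matrix(c(1, 4, 5, 6, 3, 8), nrow = 2, ncol = 3, byrow = TRUE)

# Selecting one row returns a plain vector
row_vec <- A[1, ]
print(is.matrix(row_vec))   # FALSE

# drop = FALSE keeps the result as a 1x3 matrix
row_mat <- A[1, , drop = FALSE]
print(is.matrix(row_mat))   # TRUE
print(dim(row_mat))         # 1 3
```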

 Modifying Matrix Elements:

You can modify the elements of the matrices by a direct assignment.

Example:

R




# R program to illustrate
# editing elements in a matrix
 
# Create a 2x3 matrix
A = matrix(
    c(1, 4, 5, 6, 3, 8),
    nrow = 2,
    ncol = 3,
    byrow = TRUE
)
cat("The 2x3 matrix:\n")
print(A)
 
# Editing the element in the 2nd row
# and 1st column from 6 to 30
# by direct assignment
A[2, 1] = 30
 
cat("After editing the matrix\n")
print(A)


 Output:

The 2x3 matrix:
     [,1] [,2] [,3]
[1,]    1    4    5
[2,]    6    3    8
After editing the matrix
     [,1] [,2] [,3]
[1,]    1    4    5
[2,]   30    3    8
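
Direct assignment is not limited to a single cell. As a sketch, a whole row can be replaced at once, and a logical condition can select the cells to overwrite. The replacement values here are arbitrary.

```r
A <- matrix(c(1, 4, 5, 6, 3, 8), nrow = 2, ncol = 3, byrow = TRUE)

# Replace the entire first row
A[1, ] <- c(0, 0, 0)

# Replace every element greater than 5
A[A > 5] <- 99

print(A)
```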


DataFrame:

Data frames are generic data objects in R used to store tabular data. They are two-dimensional, heterogeneous data structures: a data frame is a list of vectors of equal length, one vector per column.

Example:

R




# R program to illustrate dataframe
 
# A vector which is a character vector
Name = c("Nisha", "Nikhil", "Raju")
 
# A vector which is a character vector
Language = c("R", "Python", "C")
 
# A vector which is a numeric vector
Age = c(40, 25, 10)
 
# To create dataframe use data.frame command
# and then pass each of the vectors
# we have created as arguments
# to the function data.frame()
df = data.frame(Name, Language, Age)
 
print(df)


 Output:

    Name Language Age
1  Nisha        R  40
2 Nikhil   Python  25
3   Raju        C  10
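
The statement that a data frame is a list of equal-length vectors can be checked directly. This sketch rebuilds the same data frame; stringsAsFactors = FALSE is passed explicitly so the character columns stay character on older R versions as well.

```r
df <- data.frame(
  Name = c("Nisha", "Nikhil", "Raju"),
  Language = c("R", "Python", "C"),
  Age = c(40, 25, 10),
  stringsAsFactors = FALSE
)

# A data frame is a list whose elements are the columns
print(is.list(df))        # TRUE
print(length(df))         # 3 (number of columns)

# Each column can have its own type
print(sapply(df, class))

# Rows and columns
print(dim(df))            # 3 3
```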

Getting the structure and data from DataFrame:

  • One can get the structure of the data frame using str() function.
  • One can extract a specific column from a data frame using its column name.

Example:

R




# R program to get the
# structure of the data frame
 
# creating a data frame
friend.data <- data.frame(
    friend_id = c(1:5),
    friend_name = c("Aman", "Nisha",
                    "Nikhil", "Raju",
                    "Raj"),
    stringsAsFactors = FALSE
)
# using str(); it prints the structure itself
# and returns NULL, so no print() is needed
str(friend.data)
 
# Extracting friend_name column
result <- data.frame(friend.data$friend_name)
print(result)


 
 Output:

'data.frame':    5 obs. of  2 variables:
 $ friend_id  : int  1 2 3 4 5
 $ friend_name: chr  "Aman" "Nisha" "Nikhil" "Raju" ...
  friend.data.friend_name
1                    Aman
2                   Nisha
3                  Nikhil
4                    Raju
5                     Raj
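
A column can be extracted in a few equivalent ways, and rows can be filtered with a logical condition. The following is a sketch on the same `friend.data` frame; `v1`, `v2`, `d1`, and `rows` are illustrative names.

```r
friend.data <- data.frame(
  friend_id = c(1:5),
  friend_name = c("Aman", "Nisha", "Nikhil", "Raju", "Raj"),
  stringsAsFactors = FALSE
)

# $ and [[ ]] both return the column as a vector
v1 <- friend.data$friend_name
v2 <- friend.data[["friend_name"]]
print(identical(v1, v2))   # TRUE

# Single brackets return a one-column data frame
d1 <- friend.data["friend_name"]
print(class(d1))           # "data.frame"

# Rows can be selected with a condition on a column
rows <- friend.data[friend.data$friend_id > 3, ]
print(rows)
```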

Summary of dataframe:

A statistical summary of the data, along with the nature of each column, can be obtained by applying the summary() function.

Example:

R




# R program to get the
# summary of the data frame
 
# creating a data frame
friend.data <- data.frame(
    friend_id = c(1:5),
    friend_name = c("Aman", "Nisha",
                    "Nikhil", "Raju",
                    "Raj"),
    stringsAsFactors = FALSE
)
# using summary()
print(summary(friend.data))


 
 Output:

   friend_id friend_name       
 Min.   :1   Length:5          
 1st Qu.:2   Class :character  
 Median :3   Mode  :character  
 Mean   :3                     
 3rd Qu.:4                     
 Max.   :5                     


Arrays:

Arrays are R data objects that can store data in more than two dimensions. They are n-dimensional data structures, and a matrix is the two-dimensional special case.

Example:

R




# R program to illustrate an array
 
A = array(
    # Taking sequence of elements
    c(2, 4, 5, 7, 1, 8, 9, 2),
 
    # Creating two rectangular matrices
    # each with two rows and two columns
    dim = c(2, 2, 2)
)
 
print(A)


 Output:

, , 1

     [,1] [,2]
[1,]    2    5
[2,]    4    7

, , 2

     [,1] [,2]
[1,]    1    9
[2,]    8    2
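
As a quick check of how the dimensions work, the sketch below inspects the same array with dim() and reads a single element with one index per dimension. The matrix `M` is a made-up example showing that a matrix is itself a two-dimensional array.

```r
A <- array(c(2, 4, 5, 7, 1, 8, 9, 2), dim = c(2, 2, 2))

# Rows, columns, and number of matrices
print(dim(A))          # 2 2 2

# Row 1, column 2 of the second matrix
print(A[1, 2, 2])      # 9

# A matrix is an array with exactly two dimensions
M <- matrix(1:6, nrow = 2)
print(is.array(M))     # TRUE
print(length(dim(M)))  # 2
```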

Accessing arrays:

The arrays can be accessed by using indices for different dimensions separated by commas. Different components can be specified by any combination of elements’ names or positions.

Example:

R




vec1 <- c(2, 4, 5, 7, 1, 8, 9, 2)
vec2 <- c(12, 21, 34)
 
row_names <- c("row1", "row2")
col_names <- c("col1", "col2", "col3")
mat_names <- c("Mat1", "Mat2")
 
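# c(vec1, vec2) has 11 values for 12 cells, so R
# recycles the first value (2) to fill the last cell
arr = array(c(vec1, vec2), dim = c(2, 3, 2),
            dimnames = list(row_names,
                            col_names, mat_names))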
 
# accessing matrix 1 by index value
print("Matrix 1")
print(arr[, , 1])
 
# accessing matrix 2 by its name
print("Matrix 2")
print(arr[, , "Mat2"])
 
# accessing the 1st column of matrix 1 by index values
print("1st column of matrix 1")
print(arr[, 1, 1])
   
# accessing the 2nd row of matrix 2 by names
print("2nd row of matrix 2")
print(arr["row2", , "Mat2"])
 
# accessing a single element by mixing indices and names
print("2nd row 3rd column matrix 1 element")
print(arr[2, "col3", 1])
   
# accessing a single element by names only
print("2nd row 1st column element of matrix 2")
print(arr["row2", "col1", "Mat2"])
 
# print all rows of columns
# 2 and 3 of matrix 1
print(arr[, c(2, 3), 1])


 Output:

[1] "Matrix 1"
     col1 col2 col3
row1    2    5    1
row2    4    7    8
[1] "Matrix 2"
     col1 col2 col3
row1    9   12   34
row2    2   21    2
[1] "1st column of matrix 1"
row1 row2 
   2    4 
[1] "2nd row of matrix 2"
col1 col2 col3 
   2   21    2 
[1] "2nd row 3rd column matrix 1 element"
[1] 8
[1] "2nd row 1st column element of matrix 2"
[1] 2
     col2 col3
row1    5    1
row2    7    8

Adding elements to array:

Elements can be appended at different positions in a one-dimensional array (a vector). The existing elements keep their order. R offers several ways to add new values:

  • c(vector, values)
  • append(vector, values, after = position)
  • Assigning to an index beyond the current length, computed with the length() function

Example:

R




# creating a uni-dimensional array
x <- c(1, 2, 3, 4, 5)
 
# addition of element using c() function
x <- c(x, 6)
print ("Array after 1st modification ")
print (x)
 
# addition of element using append function
x <- append(x, 7)
print ("Array after 2nd modification ")
print (x)
 
# adding elements after computing the length
len <- length(x)
x[len + 1] <- 8
print ("Array after 3rd modification ")
print (x)
 
# adding on length + 3 index
x[len + 3]<-9
print ("Array after 4th modification ")
print (x)
 
# append a vector of values; 'after' points past
# the end of the array, so they are added at the end
print ("Array after 5th modification")
x <- append(x, c(10, 11, 12), after = length(x)+3)
print (x)
 
# adds new elements after 3rd index
print ("Array after 6th modification")
x <- append(x, c(-1, -1), after = 3)
print (x)


 Output:

[1] "Array after 1st modification "
[1] 1 2 3 4 5 6
[1] "Array after 2nd modification "
[1] 1 2 3 4 5 6 7
[1] "Array after 3rd modification "
[1] 1 2 3 4 5 6 7 8
[1] "Array after 4th modification "
 [1]  1  2  3  4  5  6  7  8 NA  9
[1] "Array after 5th modification"
 [1]  1  2  3  4  5  6  7  8 NA  9 10 11 12
[1] "Array after 6th modification"
 [1]  1  2  3 -1 -1  4  5  6  7  8 NA  9 10 11 12
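
The example above works on a one-dimensional vector. For a multi-dimensional array, note that c() discards the dimensions and returns a flat vector. One possible approach, sketched here with made-up values, is to rebuild the array with a new dim.

```r
A <- array(1:8, dim = c(2, 2, 2))

# c() flattens the array: the result has no dimensions
flat <- c(A, 9L)
print(is.null(dim(flat)))   # TRUE
print(length(flat))         # 9

# To add a third 2x2 matrix, rebuild the array with a larger dim
B <- array(c(A, 9:12), dim = c(2, 2, 3))
print(dim(B))               # 2 2 3
print(B[, , 3])
```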

Removing Elements from Array:

  • Elements can be removed from an array in R one at a time or several at once. A logical condition is used as the index: the values that satisfy the condition are retained and the rest are removed.
  • Another way is the %in% operator, which returns TRUE for each element found in a given set of values. Negating the result with ! keeps only the elements that are not in that set.

Example:

R




# creating an array of length 9
m <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)
print ("Original Array")
print (m)
 
# remove a single value element:3
# from array
m <- m[m != 3]
print ("After 1st modification")
print (m)
 
# removing elements based on a condition:
# keep only the elements that are
# greater than 2 and less than or equal
# to 8
m <- m[m>2 & m<= 8]
print ("After 2nd modification")
print (m)
 
# remove sequence of elements using
# another array
remove <- c(4, 6, 8)
 
# check which elements match the
# values to remove
print(m %in% remove)
print("After 3rd modification")
print(m[!m %in% remove])


 Output:

[1] "Original Array"
[1] 1 2 3 4 5 6 7 8 9
[1] "After 1st modification"
[1] 1 2 4 5 6 7 8 9
[1] "After 2nd modification"
[1] 4 5 6 7 8
[1]  TRUE FALSE  TRUE FALSE  TRUE
[1] "After 3rd modification"
[1] 5 7
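
Elements can also be removed by position rather than by value, using negative indices. A minimal sketch; `m1`, `m2`, and `m3` are illustrative names.

```r
m <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)

# Drop the first element
m1 <- m[-1]
print(m1)

# Drop the elements at positions 2 and 4
m2 <- m[-c(2, 4)]
print(m2)

# Drop the last element
m3 <- m[-length(m)]
print(m3)
```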


Factors:

Factors are the data objects which are used to categorize the data and store it as levels. They are useful for storing categorical data.

Example:

R




# Creating a vector
x<-c("female", "male", "other", "female", "other")
 
# Converting the vector x into
# a factor named gender
gender<-factor(x)
print(gender)


 Output: 

[1] female male   other  female other 
Levels: female male other
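
To see how a factor stores its categories, the levels and the underlying integer codes can be inspected. This sketch uses the same `gender` factor.

```r
gender <- factor(c("female", "male", "other", "female", "other"))

# The distinct categories, sorted alphabetically by default
print(levels(gender))
print(nlevels(gender))      # 3

# Each value is stored as an integer code pointing at a level
print(as.integer(gender))   # 1 2 3 1 3

# Count of observations per level
print(table(gender))
```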

Accessing elements of a Factor:

Elements of a factor are accessed the same way as elements of a vector.
 

Example:

R




x<-c("female", "male", "other", "female", "other")
gender<-factor(x)
print(gender[3])


 Output:

[1] other
Levels: female male other

Modifying of a Factor:

After a factor is created, its elements can be modified, but each new value must be one of the factor's predefined levels.

Example:

R




x<-c("female", "male", "other", "female", "other")
gender<-factor(x)
gender[1]<-"male"
print(gender)


Output:

[1] male   male   other  female other 
Levels: female male other
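
The restriction to predefined levels can be seen in a short sketch. Assigning a value that is not a level produces NA with a warning; extending the levels first makes the assignment valid. The level name "unknown" and the variable `bad` are made up for this illustration.

```r
gender <- factor(c("female", "male", "other", "female", "other"))

# Assigning a value outside the levels yields NA (with a warning)
bad <- gender
suppressWarnings(bad[1] <- "unknown")
print(is.na(bad[1]))   # TRUE

# Add the new level first, then the assignment works
levels(gender) <- c(levels(gender), "unknown")
gender[1] <- "unknown"
print(gender)
```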
