Types of Slicing Methods

The dplyr package offers several slicing functions for different needs: selecting rows of a dataframe by index, taking the first or last rows, extracting the rows with the minimum or maximum values of a column, or randomly sampling rows from a dataset.

Now, let us see each type in detail with an example. Each slicing method is demonstrated on the ‘mtcars’ dataset, which ships with R itself (in the built-in datasets package), so it is available without any extra download.

1. slice(): Slices the dataframe by row index

This function slices the dataframe by row index. We can slice a single row, a range of rows, or several non-contiguous rows. The syntax is shown below.

one_row <- slice(df, n) # Slice nth row

rows_in_range <- slice(df, n1:n2) #Slice rows in range n1 to n2

multiple_rows <- slice(df, c(n1,n3,n6)) #Slice non-continuous rows using vector

R

# Install (only needed once) and load the package
install.packages('dplyr')
library(dplyr)
# Load the dataset into the df variable
df <- mtcars
# Slice nth row
one_row <- slice(df, 2)  
cat("The 2nd row is:\n")
print(one_row)
# Slice rows in range n1 to n2 
rows_in_range <- slice(df, 2:6) 
cat("The rows in the range of 2 and 6 are:\n")
print(rows_in_range)
 
# Slice non-contiguous rows using a vector
multiple_rows <- slice(df, c(4,6,8))
cat("The 4th, 6th, 8th rows are:\n")
print(multiple_rows)

Output:

The 2nd row is:
              mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4 Wag  21   6  160 110  3.9 2.875 17.02  0  1    4    4
The rows in the range of 2 and 6 are:
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
The 4th, 6th, 8th rows are:
                mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Hornet 4 Drive 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
Valiant        18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
Merc 240D      24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
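Beyond positive indexes, slice() also accepts negative indexes and the n() helper. The following is a small sketch on the same mtcars dataframe; the variable names are only illustrative.

```r
library(dplyr)

df <- mtcars

# Negative indexes drop rows instead of keeping them
without_first <- slice(df, -1)       # every row except the 1st
without_range <- slice(df, -(1:5))   # every row except rows 1 to 5

# n() is the number of rows, so this keeps only the last row
last_row <- slice(df, n())

print(nrow(without_first))
print(nrow(without_range))
print(last_row)
```

Since mtcars has 32 rows, the first two results have 31 and 27 rows.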

2. slice_head(): Select the top rows

This function returns the top part of a dataframe. The ‘n’ argument specifies how many rows to take from the top.

head_df <- slice_head(df, n = number) # Select the first n rows

print(head_df)

R

head_df <- slice_head(df, n = 4)
cat("The first 4 rows are:\n")
print(head_df)

Output:

The first 4 rows are: 
                mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4      21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag  21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710     22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
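Besides ‘n’, slice_head() also takes a ‘prop’ argument (a fraction of the rows), and it works per group on a grouped dataframe. A minimal sketch, with illustrative variable names:

```r
library(dplyr)

df <- mtcars

# Take the first 25% of the rows (8 of the 32 rows in mtcars)
first_quarter <- slice_head(df, prop = 0.25)
print(nrow(first_quarter))

# On a grouped dataframe, the first n rows of every group are returned.
# Note: group_by() returns a tibble, so the car names (row names) are dropped.
first_two_per_cyl <- df %>%
  group_by(cyl) %>%
  slice_head(n = 2)
print(first_two_per_cyl)
```

With three distinct values of cyl, the grouped result has six rows.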

3. slice_tail(): Select the bottom rows

This function works like slice_head(), but returns rows from the bottom of the dataframe.

tail_df <- slice_tail(df, n = number) # Select the last n rows

print(tail_df)

R

tail_df <- slice_tail(df, n = 4)
cat("The last 4 rows are:\n")
print(tail_df)

Output:

The last 4 rows are: 
                mpg cyl disp  hp drat   wt qsec vs am gear carb
Ford Pantera L 15.8   8  351 264 4.22 3.17 14.5  0  1    5    4
Ferrari Dino   19.7   6  145 175 3.62 2.77 15.5  0  1    5    6
Maserati Bora  15.0   8  301 335 3.54 3.57 14.6  0  1    5    8
Volvo 142E     21.4   4  121 109 4.11 2.78 18.6  1  1    4    2
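To see how slice_tail() relates to plain slice(), the sketch below selects the same last four rows in both ways, using n() for the row count. The variable names are only illustrative.

```r
library(dplyr)

df <- mtcars

# Last 4 rows with the dedicated helper
tail_by_helper <- slice_tail(df, n = 4)

# The same rows by position: n() is the total number of rows
tail_by_index <- slice(df, (n() - 3):n())

# Both approaches pick the same rows
print(identical(tail_by_helper$mpg, tail_by_index$mpg))
```

slice_tail() is shorter and states the intent directly, so it is usually the more readable choice.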

4. slice_min(): Select the minimum of a column

As the name suggests, this function returns the rows with the smallest values of a column, which is given through the order_by argument. By default it returns one row (n = 1) but keeps ties, so more than one row can come back when several rows share the minimum value, as in the output below.

min_df <- slice_min(df, order_by = B) # Select the row(s) with the minimum value in column ‘B’

print(min_df)

R

# Select the row with the minimum value in column 'mpg'
min_df <- slice_min(df, order_by = mpg) 
cat("The row with the least mpg:\n")
print(min_df)

Output:

The row with the least mpg:
                     mpg cyl disp  hp drat    wt  qsec vs am gear carb
Cadillac Fleetwood  10.4   8  472 205 2.93 5.250 17.98  0  0    3    4
Lincoln Continental 10.4   8  460 215 3.00 5.424 17.82  0  0    3    4
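Two rows are returned above because both cars share the lowest mpg value. The sketch below shows how the ‘n’ and ‘with_ties’ arguments of slice_min() control this; the variable names are only illustrative.

```r
library(dplyr)

df <- mtcars

# Default behaviour: with_ties = TRUE keeps every row tied for the minimum
min_with_ties <- slice_min(df, order_by = mpg, n = 1)
print(nrow(min_with_ties))

# with_ties = FALSE returns exactly n rows
min_one_row <- slice_min(df, order_by = mpg, n = 1, with_ties = FALSE)
print(nrow(min_one_row))

# The three lowest mpg values, in ascending order
lowest_three <- slice_min(df, order_by = mpg, n = 3, with_ties = FALSE)
print(lowest_three)
```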

5. slice_max(): Select the maximum of a column

This function is the opposite of slice_min(). It selects the rows with the largest values of the column given in order_by.

max_df <- slice_max(df, order_by = B) # Select the row(s) with the maximum value in column ‘B’

print(max_df)

R

# Select the row with the maximum value in column 'disp'
max_df <- slice_max(df, order_by = disp)
cat("The row with the maximum disp:\n")
print(max_df)

Output:

The row with the maximum disp:
                    mpg cyl disp  hp drat   wt  qsec vs am gear carb
Cadillac Fleetwood 10.4   8  472 205 2.93 5.25 17.98  0  0    3    4
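slice_max() is often used to get a "top n" list, either for the whole dataframe or within groups. A minimal sketch on mtcars, with illustrative variable names:

```r
library(dplyr)

df <- mtcars

# The three cars with the largest displacement, largest first
top_three_disp <- slice_max(df, order_by = disp, n = 3)
print(top_three_disp)

# The most powerful car (highest hp) within each cylinder group.
# Note: group_by() returns a tibble, so the car names (row names) are dropped.
strongest_per_cyl <- df %>%
  group_by(cyl) %>%
  slice_max(order_by = hp, n = 1)
print(strongest_per_cyl)
```

As with slice_min(), ties are kept by default, so a "top n" result can contain more than n rows unless with_ties = FALSE is set.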

6. slice_sample(): Select random rows

This function selects rows at random from the dataframe. Here too, the ‘n’ argument specifies how many rows to select. Because the selection is random, the rows you get will usually differ from the ones shown in the output below.

random_df <- slice_sample(df, n = number) # select n random rows

print(random_df)

R

random_df <- slice_sample(df, n = 3)  
cat("3 random rows are:\n")
print(random_df)

Output:

3 random rows are: 
               mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Mazda RX4     21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
Honda Civic   30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
Maserati Bora 15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
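Since random sampling gives different rows on each run, it can help to fix the seed when a result must be reproducible. The sketch below also shows the ‘prop’ and ‘replace’ arguments of slice_sample(); the seed value 42 and the variable names are arbitrary choices for illustration.

```r
library(dplyr)

df <- mtcars

# Fixing the seed makes the random selection repeatable
set.seed(42)
sample_a <- slice_sample(df, n = 3)
set.seed(42)
sample_b <- slice_sample(df, n = 3)
print(identical(sample_a, sample_b))

# Sample a fraction of the rows instead of a fixed count (25% of 32 = 8 rows)
sample_prop <- slice_sample(df, prop = 0.25)
print(nrow(sample_prop))

# Sampling with replacement allows more rows than the dataframe has
sample_boot <- slice_sample(df, n = 40, replace = TRUE)
print(nrow(sample_boot))
```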

More Examples on slice()

Now, let us look at some examples of slicing as it is used in data analysis.

Suppose we want to find the rows that satisfy a particular condition. Let’s say we have a dataframe of student names and their scores, and we need the names and scores of all students who scored above 85. First we create the dataframe, and then we use the slice function to slice it based on the condition Score > 85.

R

# Create a dataframe
class_score <- data.frame(
  ID = 1:5,
  Name = c("Krishna", "Sony", "Priya", "Rahul", "Rama"),
  Score = c(85, 92, 78, 88, 95)
)
 
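# Print the full dataframe
print(class_score)

# Slice the dataframe based on the condition Score > 85
top_scorers <- class_score %>% slice(which(Score > 85))
 
# Print the top scorers
cat("dataframe based on condition that score > 85\n")
print(top_scorers)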

Output:

  ID    Name Score
1  1 Krishna    85
2  2    Sony    92
3  3   Priya    78
4  4   Rahul    88
5  5    Rama    95
dataframe based on condition that score > 85
  ID  Name Score
1  2  Sony    92
2  4 Rahul    88
3  5  Rama    95

We created a dataframe called class_score with the columns ID, Name and Score. Then we used the pipe operator, as discussed before, to pass class_score to slice() and stored the result in a new variable. The argument to slice() is which(Score > 85), which returns the positions of the rows where the condition Score > 85 is TRUE. slice() then keeps exactly those rows, which is how a condition can be turned into a positional slice.

Example 2: Let us now take a dataset of cricket teams and their scores. Our task is to find the top three teams by score.
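For comparison, dplyr's filter() takes the condition directly, without going through row positions. The sketch below rebuilds the same class_score dataframe and checks that both approaches select the same students; the names by_slice and by_filter are only illustrative.

```r
library(dplyr)

class_score <- data.frame(
  ID = 1:5,
  Name = c("Krishna", "Sony", "Priya", "Rahul", "Rama"),
  Score = c(85, 92, 78, 88, 95)
)

# Positional approach: which() turns the condition into row indexes
by_slice <- class_score %>% slice(which(Score > 85))

# Condition-based approach: filter() evaluates the condition itself
by_filter <- class_score %>% filter(Score > 85)

print(by_filter)
print(identical(by_slice$Name, by_filter$Name))
```

filter() is generally the more direct choice for conditions, while slice() is the natural fit when the row positions themselves matter.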

R

# cricket teams dataframe
cricket_data <- data.frame(
  Team = c("India", "Australia", "England", "Pakistan", "South Africa"),
  Score = c(320, 289, 275, 241, 305)
)
 
# Display the full dataset
cricket_data
# Arrange data in descending order of scores and then select the top 3 rows
top_scores <- cricket_data %>% arrange(desc(Score)) %>% slice(1:3)  
 
 
# Display the top_scores dataset
print("Top 3 Scores:")
print(top_scores)

Output:

          Team Score
1        India   320
2    Australia   289
3      England   275
4     Pakistan   241
5 South Africa   305
[1] "Top 3 Scores:"
          Team Score
1        India   320
2 South Africa   305
3    Australia   289

Here, we took a dataframe of cricket teams and their scores and looked for the top scorers. Again we used the pipe operator from dplyr. First we arranged the dataframe in descending order of the ‘Score’ column, and then we used the slice() function to take the first three rows, which are the top three scores.
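The same top three teams can also be obtained in one step with slice_max(), which was covered earlier. A minimal sketch on the same data, where top_scores_direct is just an illustrative name:

```r
library(dplyr)

cricket_data <- data.frame(
  Team = c("India", "Australia", "England", "Pakistan", "South Africa"),
  Score = c(320, 289, 275, 241, 305)
)

# slice_max() orders by Score and keeps the 3 largest values in one call
top_scores_direct <- cricket_data %>% slice_max(order_by = Score, n = 3)
print(top_scores_direct)
```

One difference to keep in mind: slice_max() keeps ties by default, so with tied scores it may return more than three rows, whereas arrange() followed by slice(1:3) always returns exactly three.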

Conclusion

The slice() function in the dplyr package is a powerful tool for extracting specific rows from a dataframe based on their positions. It is simple to use and can be mastered with a little practice. With it, data analysts and scientists can speed up their data wrangling tasks and unlock deeper insights from their datasets.

Slice() From Dplyr In R

With so much data around us in today’s world, dealing with it can be tough. Here the dplyr package in R stands out as a powerful and versatile tool for data manipulation. The package has many functions, and among them slice() is particularly useful for extracting specific rows from a data frame based on their indexes (positions).

In this article, we will look at the details of the slice() function and explore how it can help in the data manipulation process.
