Filtering rows that contain a certain string using dplyr

Asked by LaunaKirchner in Data Science on Jul 15, 2021

I need to filter a data frame based on whether a row contains the string RTB. I'm using dplyr.

d.del <- df %>%
  group_by(TrackingPixel) %>%
  summarise(MonthDelivery = as.integer(sum(Revenue))) %>%
  arrange(desc(MonthDelivery))

I know I can use the function filter in dplyr but I don't exactly know how to tell it to check for the content of a string.

In particular, I want to check the content in the column TrackingPixel. If the string contains the label RTB I want to remove the row from the result.

Answered by James WILLIAMS

To filter rows in dplyr by whether a column contains a specific label, call grepl(), which tests each element of a character vector against a pattern, inside filter().

The basic syntax is given below:

grepl(pattern, x, ignore.case = FALSE, perl = FALSE,
      fixed = FALSE, useBytes = FALSE)
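As a quick illustration of what grepl() returns, here is a minimal sketch on a hypothetical character vector (the values are invented for illustration and are not from the question). It returns one TRUE/FALSE per element, which is the form filter() expects.

```r
# Hypothetical labels, invented for illustration
x <- c("RTB_campaign", "Display", "rtb_test", "Video RTB")

# Default: case-sensitive regular-expression match
case_sensitive <- grepl("RTB", x)
# TRUE FALSE FALSE TRUE

# ignore.case = TRUE also matches the lowercase "rtb"
case_insensitive <- grepl("RTB", x, ignore.case = TRUE)
# TRUE FALSE TRUE TRUE

# fixed = TRUE treats the pattern as a literal string, not a regex
literal <- grepl("RTB", x, fixed = TRUE)
# TRUE FALSE FALSE TRUE
```

For a plain label such as RTB, the regex and literal forms give the same result. fixed = TRUE matters mostly when the search string contains characters like "." or "(" that have a special meaning in regular expressions.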

In your case, use the following:

library(dplyr)

filter(df, !grepl("RTB", TrackingPixel))

For example:

To filter rows containing “Mazda” in a string in mtcars data set:

data(mtcars)
mtcars$names <- rownames(mtcars) 

filter(mtcars, grepl('Mazda', names))

Output:

  mpg cyl disp  hp drat    wt  qsec vs am gear carb         names
1  21   6  160 110  3.9 2.620 16.46  0  1    4    4     Mazda RX4
2  21   6  160 110  3.9 2.875 17.02  0  1    4    4 Mazda RX4 Wag

Similarly, to not include rows containing “Mazda”, do the following:

filter(mtcars, !grepl('Mazda', names))
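
Applied to the pipeline in the question, the exclusion could be placed before group_by(), so that rows containing RTB never reach the summary. The following is only a sketch: the sample data frame is hypothetical (only the column names TrackingPixel and Revenue come from the question), and it uses %>% because the older %.% operator is no longer available in current dplyr releases.

```r
library(dplyr)

# Hypothetical sample data; only the column names come from the question
df <- data.frame(
  TrackingPixel = c("RTB_A", "Direct_B", "Direct_B", "Partner_RTB", "Direct_C"),
  Revenue       = c(100.5, 200.2, 50.3, 75.0, 10.9),
  stringsAsFactors = FALSE
)

d.del <- df %>%
  # Drop every row whose TrackingPixel contains the literal string "RTB"
  filter(!grepl("RTB", TrackingPixel, fixed = TRUE)) %>%
  group_by(TrackingPixel) %>%
  summarise(MonthDelivery = as.integer(sum(Revenue))) %>%
  arrange(desc(MonthDelivery))

print(d.del)
```

Filtering first keeps the grouping step smaller. Filtering after summarise() would give the same rows here, because the condition depends only on the grouping column.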


Answer (1)

In R, particularly when using the dplyr package, you can filter rows based on whether a certain string is present in a column. Here’s how you can achieve this:

Example Data

Let's assume you have a data frame called df with columns ID and Text. Here’s a sample data frame:

library(dplyr)

# Sample data
df <- data.frame(
  ID = c(1, 2, 3, 4, 5),
  Text = c("apple", "banana", "orange", "grape", "kiwi")
)
print(df)

Output:

  ID   Text
1  1  apple
2  2 banana
3  3 orange
4  4  grape
5  5   kiwi

Filtering Rows Containing a Certain String

Now, let's filter rows where the Text column contains the string "apple".

# Filter rows containing "apple" in the Text column
filtered_df <- df %>%
  filter(grepl("apple", Text, ignore.case = TRUE))
print(filtered_df)

Output:

  ID  Text
1  1 apple

Explanation

filter(grepl("apple", Text, ignore.case = TRUE)):

grepl("apple", Text, ignore.case = TRUE) checks whether the string "apple" is present in each element of the Text column (ignore.case = TRUE makes the search case-insensitive).

The filter() function from dplyr keeps only the rows for which the condition inside it is TRUE.

Notes:

Case Sensitivity: The grepl() function used inside filter() checks for a match without considering case due to ignore.case = TRUE. If you need case-sensitive matching, omit ignore.case = TRUE.

Multiple Matches: If you want to filter based on multiple strings or more complex conditions, you can modify the grepl() call accordingly within the filter() function.
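
As a rough sketch of the "Multiple Matches" note, one option is to join the search terms with the regex alternation operator "|", and another is to pass several conditions to filter(), which combines them with AND. The terms vector and the second condition pair are hypothetical examples chosen for this sample data.

```r
library(dplyr)

# Same sample data as in the example above
df <- data.frame(
  ID   = c(1, 2, 3, 4, 5),
  Text = c("apple", "banana", "orange", "grape", "kiwi")
)

# Hypothetical list of search terms, joined into the regex "apple|kiwi"
terms   <- c("apple", "kiwi")
pattern <- paste(terms, collapse = "|")

# Keep rows matching ANY of the terms (IDs 1 and 5)
multi_df <- df %>%
  filter(grepl(pattern, Text, ignore.case = TRUE))

# Several conditions in filter() are combined with AND:
# contains "a" but does not contain "an" (IDs 1 and 4)
combo_df <- df %>%
  filter(grepl("a", Text), !grepl("an", Text))
```

The alternation approach assumes the terms contain no regex metacharacters. If they might, they would need escaping first, or each term could be tested separately with fixed = TRUE.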

This approach allows you to filter rows in your data frame based on whether a certain string (or pattern) exists within a specific column using dplyr in R. Adjust the string and column names as per your actual data structure and filtering needs.
