Filtering rows that contain a certain string using dplyr
I have to filter a data frame, using as the criterion whether a row contains the string RTB. I'm using dplyr.
d.del <- df %>%
  group_by(TrackingPixel) %>%
  summarise(MonthDelivery = as.integer(sum(Revenue))) %>%
  arrange(desc(MonthDelivery))
I know I can use the function filter in dplyr but I don't exactly know how to tell it to check for the content of a string.
In particular, I want to check the content in the column TrackingPixel. If the string contains the label RTB I want to remove the row from the result.
To filter rows by whether a column contains a given string, you can use grepl() inside filter(). grepl() matches a pattern against each element of a character vector and returns TRUE or FALSE for each one.
The basic syntax is given below:
grepl(pattern, x, ignore.case = FALSE, perl = FALSE,
fixed = FALSE, useBytes = FALSE)
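As a quick illustration of what grepl() returns, here is a small sketch on a made-up character vector (the vector x and its values are hypothetical, not taken from the question). It yields one logical value per element, which is why it can serve as a condition in filter():

```r
# Hypothetical tracking-pixel labels, invented for illustration
x <- c("RTB_banner", "direct_sale", "video_rtb", NA)

# One TRUE/FALSE per element; matching is case-sensitive by default
res_default <- grepl("RTB", x)
print(res_default)
# [1]  TRUE FALSE FALSE FALSE

# ignore.case = TRUE also matches the lowercase "rtb"
res_nocase <- grepl("RTB", x, ignore.case = TRUE)
print(res_nocase)
# [1]  TRUE FALSE  TRUE FALSE

# fixed = TRUE treats the pattern as a literal string, not a regular expression
res_fixed <- grepl("RTB", x, fixed = TRUE)
print(res_fixed)
# [1]  TRUE FALSE FALSE FALSE
```

Note that grepl() returns FALSE for NA, so a negated condition such as !grepl("RTB", x) keeps elements with missing values.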
In your case, use the following:
library(dplyr)
filter(df, !grepl("RTB", TrackingPixel))
For example, to keep only the rows of the mtcars data set whose name contains “Mazda”:
data(mtcars)
mtcars$names <- rownames(mtcars)
filter(mtcars, grepl('Mazda', names))
Output:
  mpg cyl disp  hp drat    wt  qsec vs am gear carb         names
1  21   6  160 110  3.9 2.620 16.46  0  1    4    4     Mazda RX4
2  21   6  160 110  3.9 2.875 17.02  0  1    4    4 Mazda RX4 Wag
Similarly, to exclude rows containing “Mazda”, negate the condition with !:
filter(mtcars, !grepl('Mazda',names))
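Putting this together with the pipeline from the question, a minimal sketch could look like the following. The data frame is invented for illustration (only the column names TrackingPixel and Revenue come from the question), and the sketch uses the %>% pipe, which replaced the older %.% operator in dplyr.

```r
library(dplyr)

# Hypothetical data; only the column names come from the question
df <- data.frame(
  TrackingPixel = c("RTB_a", "direct_b", "direct_b", "RTB_c", "video_d"),
  Revenue       = c(10.5, 20.2, 5.9, 7.0, 3.4),
  stringsAsFactors = FALSE
)

d.del <- df %>%
  filter(!grepl("RTB", TrackingPixel)) %>%  # drop rows whose label contains "RTB"
  group_by(TrackingPixel) %>%
  summarise(MonthDelivery = as.integer(sum(Revenue))) %>%
  arrange(desc(MonthDelivery))

print(d.del)
```

With this made-up data, the two RTB rows are dropped and the result has two groups: direct_b with MonthDelivery 26 and video_d with MonthDelivery 3. Filtering before group_by() means the unwanted rows never enter the summary. Because each group here is a single TrackingPixel value, filtering after summarise() would give the same result.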