Omit rows that have NA in a specific column
I want to know how to omit rows with NA values in a data frame, but only for the columns I am interested in.
For example,
DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA), z=c(NA, 33, 22))
I only want to omit the rows where y is NA, so the result should be
  x  y  z
1 1  0 NA
2 2 10 33
na.omit seems to delete every row that contains any NA.
Can somebody help me with this simple question?
Now suppose I change the question to:
DF <- data.frame(x = c(1, 2, 3, NA), y = c(1, 0, 10, NA), z = c(43, NA, 33, NA))
If I want to omit only the rows where x is NA or z is NA, where do I put the | in the function?
To remove rows with NA in one particular column in R, you can use any of the following methods:
Using the is.na() function:
> DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA), z = c(NA, 33, 22))
> DF
  x  y  z
1 1  0 NA
2 2 10 33
3 3 NA 22
> DF[!is.na(DF$y), ]
  x  y  z
1 1  0 NA
2 2 10 33
Using the drop_na() function from the tidyr package:
library(tidyr)
DF %>% drop_na(y)
  x  y  z
1 1  0 NA
2 2 10 33
Using complete.cases():
DF[complete.cases(DF[, "y"]), ]
  x  y  z
1 1  0 NA
2 2 10 33
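For the follow-up case, where a row should be dropped if x is NA or z is NA, the same two base R approaches should extend naturally. The following is only a minimal sketch: the names DF2, kept_isna and kept_cc are mine, not from the question, and it assumes that NAs in y should be ignored.

```r
# Data from the second part of the question (DF2 is a hypothetical name)
DF2 <- data.frame(x = c(1, 2, 3, NA), y = c(1, 0, 10, NA), z = c(43, NA, 33, NA))

# Option 1: combine the is.na() tests with |, then negate the whole condition,
# so a row is kept only when neither x nor z is NA
kept_isna <- DF2[!(is.na(DF2$x) | is.na(DF2$z)), ]

# Option 2: run complete.cases() on just the columns of interest
kept_cc <- DF2[complete.cases(DF2[, c("x", "z")]), ]

# Both options keep rows 1 and 3 of DF2
print(kept_isna)
print(kept_cc)
```

So the | goes inside the parentheses, between the two is.na() calls, and the ! is applied to the combined condition. With tidyr, listing both columns, as in drop_na(x, z), should give the same rows.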