A user is trying to run a parallel kNN program in R but gets this error: Error in { : task 1 failed - "could not find function "knn""
library(class)
library(doSNOW)
library(foreach)
train <- read.csv('train.csv')
test <- read.csv('test.csv')
trainY <- read.csv('trainY.csv')
cl <- as.vector(as.matrix(trainY))
system.time(summary(knn(train, test, cl, k=25, prob = TRUE)))
clus <- makeCluster(4)
registerDoSNOW(clus)
countrows <- nrow(test)
system.time(foreach(icount(countrows)) %dopar% {
  summary(knn(train, test, cl, k = 25, prob = TRUE))
})
stopCluster(clus)
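In this case, library(class) needs to be called on each of the worker nodes. The workers are separate R processes, so a package loaded in the master session is not available to them. foreach makes this easy via the .packages argument: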
system.time(foreach(icount(countrows), .packages = "class") %dopar% {
  summary(knn(train, test, cl, k = 25, prob = TRUE))
})
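If the same cluster is reused for several loops, another option is to load the package once on every worker with clusterEvalQ from snow (attached by doSNOW). The following is only a minimal sketch: the two-worker cluster and the exists("knn") check are illustrative assumptions, not part of the original program.

```r
library(class)
library(doSNOW)  # also attaches foreach, iterators and snow

# Assumption for illustration: a small two-worker cluster
clus <- makeCluster(2)
registerDoSNOW(clus)

# Evaluate library(class) once on every worker process
clusterEvalQ(clus, library(class))

# Check from inside a parallel loop that knn is now visible on the workers
found <- foreach(i = 1:2, .combine = c) %dopar% exists("knn")
print(found)

stopCluster(clus)
```

With .packages the loading is tied to the individual foreach call, whereas clusterEvalQ changes the state of the workers until the cluster is stopped.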
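We might also need to export train, test and cl to the workers. foreach normally exports variables that the loop body references when they are defined in the environment that calls foreach, so the .export argument is only needed when that automatic export misses them: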
system.time(
  foreach(icount(countrows), .packages = "class",
          .export = c("train", "test", "cl")) %dopar% {
    summary(knn(train, test, cl, k = 25, prob = TRUE))
  }
)
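Note that the loop above classifies the whole test set in every one of its countrows iterations. Whether that repetition is intended (for example as a benchmark) is not clear from the post. If the goal is to spread a single classification across the workers, one possible sketch is to split the test rows into chunks and let each task handle one chunk. The iris data, the worker count and the names idx, nworkers, chunks and pred below are assumptions made for illustration, standing in for the CSV files of the original program.

```r
library(class)
library(doSNOW)  # also attaches foreach, iterators and snow

# Stand-in data (assumption): iris replaces train.csv, test.csv and trainY.csv
set.seed(1)
idx   <- sample(nrow(iris), 100)
train <- iris[idx, 1:4]
test  <- iris[-idx, 1:4]
cl    <- iris$Species[idx]

nworkers <- 2
clus <- makeCluster(nworkers)
registerDoSNOW(clus)

# Assign the test row indices to nworkers contiguous chunks
n      <- nrow(test)
chunks <- split(seq_len(n), cut(seq_len(n), nworkers, labels = FALSE))

# Each task classifies only its own chunk; results are concatenated in chunk order
pred <- foreach(rows = chunks, .packages = "class", .combine = c) %dopar% {
  as.character(knn(train, test[rows, , drop = FALSE], cl, k = 25))
}

stopCluster(clus)

# One predicted label per test row
print(table(pred))
```

Because knn breaks ties at random, the parallel predictions may differ slightly from a sequential run on the same data.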