A user wants to fit a KNN regression model to the faithful data set in R. The code is given below:
smp_size <- floor(0.5 * nrow(faithful))
set.seed(123)
train_ind <- sample(seq_len(nrow(faithful)), size = smp_size)
train_data = faithful[train_ind, ]
test_data = faithful[-train_ind, ]
pred = FNN::knn.reg(train = train_data[,1],
                    test = test_data[,1],
                    y = train_data[,2], k = 5)$pred
The faithful data set has only 2 columns. The call fails with "Error in get.knnx(train, test, k, algorithm) : Number of columns must be same!." How can this be fixed?
The documentation for knn.reg says that train and test must each be a data frame or a matrix. Here there is only one independent variable, so train_data[,1] and test_data[,1] are simplified to plain numeric vectors, as str(train_data[,1]) shows. They are no longer data frames, so knn.reg does not see a matching single column in train and test.
In that case, wrap the train and test arguments of knn.reg in as.data.frame.
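To make the point about dropped dimensions concrete, here is a small base R sketch (the variable names as_vector, as_df and wrapped are only illustrative). It also shows drop = FALSE, a standard argument of data frame indexing that is not mentioned above; it should be an alternative way to keep the one-column shape, e.g. train_data[, 1, drop = FALSE] in the original call.

```r
# Base R only: three ways of selecting the single predictor column.
data(faithful)

as_vector <- faithful[, 1]                 # simplified to a plain numeric vector
as_df     <- faithful[, 1, drop = FALSE]   # stays a one-column data frame
wrapped   <- as.data.frame(faithful[, 1])  # vector wrapped back into a data frame

dim(as_vector)   # NULL: no row/column structure left
dim(as_df)       # 272 rows, 1 column
dim(wrapped)     # 272 rows, 1 column
```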
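We can also normalize the data before running the KNN model. (With a single predictor, scaling does not change which neighbours are nearest; it matters once there are several predictors on different scales.) Below is the implementation.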
library(FNN)
set.seed(123)
# normalize the predictor (every column except the last)
X <- scale(faithful[, -ncol(faithful)])
y <- faithful[, ncol(faithful)]
# split data into train & test (70/30)
train_ind <- sample(seq_len(nrow(faithful)), floor(0.7 * nrow(faithful)))
test_ind <- setdiff(seq_len(nrow(faithful)), train_ind)
# run KNN model; as.data.frame keeps the single predictor as a one-column data frame
knn_model <- knn.reg(train = as.data.frame(X[train_ind,]),
                     test = as.data.frame(X[test_ind,]),
                     y = y[train_ind],
                     k = 5)
# predictions for the test rows
pred <- knn_model$pred
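The code above scales all rows before splitting, so the test rows contribute to the centring and scaling values. A possible variant, offered only as a sketch, computes those values from the training rows and reuses them for the test rows. It assumes the FNN package is installed; the names X_train_s, X_test_s and rmse are illustrative, and RMSE is just one reasonable way to check the predictions.

```r
library(FNN)
set.seed(123)

n <- nrow(faithful)
train_ind <- sample(seq_len(n), floor(0.7 * n))
test_ind  <- setdiff(seq_len(n), train_ind)

# Keep the predictor as a one-column data frame (drop = FALSE).
X_train <- faithful[train_ind, 1, drop = FALSE]
X_test  <- faithful[test_ind, 1, drop = FALSE]
y_train <- faithful[train_ind, 2]
y_test  <- faithful[test_ind, 2]

# Centre and scale with the training rows only,
# then apply the same centre and scale to the test rows.
X_train_s <- scale(X_train)
X_test_s  <- scale(X_test,
                   center = attr(X_train_s, "scaled:center"),
                   scale  = attr(X_train_s, "scaled:scale"))

# scale() returns one-column matrices, which knn.reg accepts.
knn_model <- knn.reg(train = X_train_s, test = X_test_s, y = y_train, k = 5)
pred <- knn_model$pred

# Root mean squared error on the held-out rows (illustrative metric).
rmse <- sqrt(mean((y_test - pred)^2))
```

Here pred holds one prediction per test row, and rmse summarises how far those predictions are from the observed values in the second column.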