A user was trying to fit a k-nearest-neighbors (KNN) regression model to the faithful dataset in R. The code is given below:

Asked by SanjanaShah in Data Science on Dec 20, 2019
Answered by Nitin Solanki

smp_size <- floor(0.5 * nrow(faithful))
set.seed(123)
train_ind <- sample(seq_len(nrow(faithful)), size = smp_size)
train_data = faithful[train_ind, ]
test_data = faithful[-train_ind, ]
pred = FNN::knn.reg(train = train_data[,1],
                  test = test_data[,1],
                  y = train_data[,2], k = 5)$pred

The faithful data has only 2 columns. The call fails with "Error in get.knnx(train, test, k, algorithm) : Number of columns must be same!." How can this be fixed?

The documentation for knn.reg says that the train and test arguments must each be a matrix or a data frame. Here there is only one independent variable, and selecting a single column with train_data[,1] drops the result to a plain numeric vector, as str(train_data[,1]) shows. It is no longer a data frame.
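The dropping behavior can be checked directly in base R. The following is a minimal sketch; the names x_vec and x_df are illustrative and do not appear in the original code.

```r
# Selecting one column with [, 1] drops the data frame to a plain vector.
x_vec <- faithful[, 1]
is.data.frame(x_vec)   # FALSE
is.null(dim(x_vec))    # TRUE: a bare vector has no dimensions

# drop = FALSE keeps the one-column data frame structure.
x_df <- faithful[, 1, drop = FALSE]
is.data.frame(x_df)    # TRUE
dim(x_df)              # 272 1
```

Either drop = FALSE or an explicit as.data.frame() call gives knn.reg an object with a well-defined column count.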

In that case, we can wrap the train and test arguments of knn.reg in as.data.frame.
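Applied to the code from the question, that change might look like the sketch below. It assumes the FNN package is installed and keeps the original 50/50 split and variable names; only the train and test arguments are altered.

```r
library(FNN)

set.seed(123)
smp_size <- floor(0.5 * nrow(faithful))
train_ind <- sample(seq_len(nrow(faithful)), size = smp_size)

train_data <- faithful[train_ind, ]
test_data <- faithful[-train_ind, ]

# Wrap the single predictor column so train and test each have one column.
pred <- knn.reg(train = as.data.frame(train_data[, 1]),
                test = as.data.frame(test_data[, 1]),
                y = train_data[, 2], k = 5)$pred

# One prediction per test row.
length(pred) == nrow(test_data)
```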

We can also normalize the data before running the KNN model. The full implementation is below.

library(FNN)

set.seed(123)

# normalize the predictor (every column except the last)
X <- scale(faithful[, -ncol(faithful)])
# response: the last column
y <- faithful[, ncol(faithful)]

# split data into train & test (70/30)
train_ind <- sample(seq_len(nrow(faithful)), floor(0.7 * nrow(faithful)))
test_ind <- setdiff(seq_len(nrow(faithful)), train_ind)

# run KNN model; as.data.frame keeps the single predictor as one column
knn_model <- knn.reg(train = as.data.frame(X[train_ind, ]),
                     test = as.data.frame(X[test_ind, ]),
                     y = y[train_ind],
                     k = 5)

# predictions for the test rows
pred <- knn_model$pred
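As a quick sanity check, the predictions can be compared with the held-out response values. The sketch below repeats the setup so that it runs on its own, uses drop = FALSE as an alternative to as.data.frame, and computes a root mean squared error. The rmse metric is an illustrative addition and not part of the original answer.

```r
library(FNN)

set.seed(123)

# Same setup as above: scaled predictor, last column as the response.
X <- scale(faithful[, -ncol(faithful), drop = FALSE])
y <- faithful[, ncol(faithful)]

train_ind <- sample(seq_len(nrow(faithful)), floor(0.7 * nrow(faithful)))
test_ind <- setdiff(seq_len(nrow(faithful)), train_ind)

# drop = FALSE keeps X as a one-column matrix when subsetting rows.
knn_model <- knn.reg(train = X[train_ind, , drop = FALSE],
                     test = X[test_ind, , drop = FALSE],
                     y = y[train_ind],
                     k = 5)

# Root mean squared error on the held-out rows.
rmse <- sqrt(mean((y[test_ind] - knn_model$pred)^2))
rmse
```

The value is in the same units as the response, so for faithful it reads as minutes of waiting time.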


