Explain with a case study how to implement an unsupervised learning algorithm using R.
For this unsupervised learning case study, we will cluster a crime dataset with the K-Means algorithm.
First, let us read the data.
crime <- read.csv(file.choose())
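Now we will normalize the data using the scale() function, which centers each column to mean 0 and rescales it to standard deviation 1.
# Normalize the continuous columns so they are all on the same scale
normalized_data <- scale(crime[, 2:5]) # exclude column 1 (the X name column) before normalizing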
View(normalized_data)
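To make the normalization step concrete, here is a minimal sketch of what scale() computes. The small matrix m is made up for illustration and is not part of the crime data; the manual version subtracts each column mean and divides by each column standard deviation.

```r
# Toy matrix standing in for numeric columns (values are made up for illustration)
m <- matrix(c(1, 2, 3, 4, 10, 20, 30, 40), ncol = 2)

# Manual z-score: subtract the column mean, then divide by the column sd
centered <- sweep(m, 2, colMeans(m), "-")
z_manual <- sweep(centered, 2, apply(m, 2, sd), "/")

# Same computation with scale()
z_scale <- scale(m)

# The two results agree once the extra attributes added by scale() are ignored
print(all.equal(z_manual, z_scale, check.attributes = FALSE))

# After scaling, every column has mean 0 and standard deviation 1
print(round(colMeans(z_scale), 10))
print(apply(z_scale, 2, sd))
```

Without this step, a column measured on a large scale would dominate the Euclidean distances that K-Means relies on.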
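Now we will run K-Means for k = 2 to 15 and record the total within-cluster sum of squares for each k
set.seed(123) # K-Means starts from random centers; fixing the seed makes the run reproducible
twss <- NULL
for (i in 2:15) {
  # tot.withinss is the total within-cluster sum of squares for i clusters
  twss <- c(twss, kmeans(normalized_data, i)$tot.withinss)
}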
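The quantity collected in the loop, tot.withinss, is the sum of squared distances from every row to the center of its assigned cluster. The sketch below checks this by hand. It uses R's built-in USArrests data only as a stand-in, since the crime file itself is not available here; the seed and k = 3 are arbitrary choices for illustration.

```r
# Stand-in data: the built-in USArrests dataset (an assumption, not the crime file)
x <- scale(USArrests)

set.seed(42) # arbitrary seed so the example is reproducible
fit <- kmeans(x, centers = 3)

# Look up the center of each row's assigned cluster, then sum the squared differences
assigned_centers <- fit$centers[fit$cluster, ]
sq_dist <- rowSums((x - assigned_centers)^2)
manual_twss <- sum(sq_dist)

# The manual total matches the value reported by kmeans()
print(all.equal(manual_twss, fit$tot.withinss))
```

A smaller value means tighter clusters, which is why the loop tracks how this number falls as k grows.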
Now we will plot the total within-cluster sum of squares against the number of clusters. The value of k is chosen at the "elbow" of the plot, the point after which adding more clusters reduces the sum of squares only slightly.
plot(2:15, twss, type = "b", xlab = "Number of Clusters", ylab = "Within groups sum of squares") # Look for an "elbow" in the scree plot
title(sub = "K-Means Clustering Scree-Plot")
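Now we will fit the final model with the k value chosen from the scree plot; here k = 3.
k_3 <- kmeans(normalized_data, 3)
str(k_3) # inspect the fitted object (cluster labels, centers, sums of squares, sizes)
clust <- k_3$cluster # cluster label assigned to each row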
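As an optional refinement, kmeans() accepts an nstart argument that tries several random starts and keeps the best solution, which makes the final clustering less sensitive to initialization. The sketch below also shows the main components of the fitted object. USArrests is again only a stand-in for the crime data, and nstart = 25 and the seed are illustrative choices.

```r
# Stand-in data: the built-in USArrests dataset (an assumption, not the crime file)
x <- scale(USArrests)

set.seed(42) # arbitrary seed so the example is reproducible
# nstart = 25 runs 25 random initializations and keeps the lowest tot.withinss
fit <- kmeans(x, centers = 3, nstart = 25)

print(fit$size)    # number of rows in each cluster
print(fit$centers) # cluster centers, in scaled units

# Share of the total variation explained by the clustering
print(fit$betweenss / fit$totss)
```

The total sum of squares splits into a between-cluster part and a within-cluster part, so the ratio above lies between 0 and 1.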
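Now we will attach the cluster labels to the original data and compute the mean of each variable per cluster.
final <- data.frame(crime, clust)
# Drop the name column, then average the original (unscaled) values within each cluster
aggregate(crime[, -1], by = list(cluster = final$clust), FUN = mean)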
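The per-cluster means are easier to interpret alongside the cluster sizes. The following sketch builds such a profile table. It assumes a layout like the one used above (a name column followed by four numeric columns) and uses USArrests as a stand-in; the names dat, profile, and n are hypothetical.

```r
# Stand-in data shaped like the case study: a name column plus four numeric columns
dat <- data.frame(State = rownames(USArrests), USArrests, row.names = NULL)

set.seed(42) # arbitrary seed so the example is reproducible
fit <- kmeans(scale(dat[, 2:5]), centers = 3, nstart = 25)

# Attach the labels and average the original (unscaled) columns per cluster
final <- data.frame(dat, clust = fit$cluster)
profile <- aggregate(final[, 2:5], by = list(cluster = final$clust), FUN = mean)

# Add the number of rows in each cluster (table() is ordered by cluster label)
profile$n <- as.vector(table(final$clust))
print(profile)
```

Because the means are taken on the original columns rather than the scaled ones, each row of the table reads in the data's own units.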