A user built a classification tree in R with the tree package; calling summary() on the fitted tree prints the output below.

Asked by varshaChauhan in Data Science on Nov 5, 2019
Answered by varsha Chauhan

Classification tree:

tree(formula = High temperature ~ ., data = summer.train)

Variables actually used in tree construction:

[1] "Humidity" "Cloudy" "Airy" "Dry"


Number of terminal nodes: 12

Residual mean deviance: 0.3874 = 377.7 / 975

Misclassification error rate: 0.08909 = 89 / 999

How can the variables actually used in tree construction ("Humidity", "Cloudy", "Airy", "Dry") be extracted programmatically from the summary output above?

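Let us use the well-known spam dataset to work out the solution. The column names in the output below match the spam data shipped with the kernlab package, so that is the copy assumed here.

library(tree)
library(kernlab)  # assumption: spam is the data frame shipped with kernlab
data(spam)
spam_tree_def <- tree(type ~ ., data = spam)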


Calling summary(spam_tree_def) gives the following:

Classification tree:

tree(formula = type ~ ., data = spam)

Variables actually used in tree construction:

 [1] "charDollar" "remove" "charExclamation" "hp" "capitalLong" "our"

 [7] "capitalAve" "free" "george" "edu"

Number of terminal nodes: 13

Residual mean deviance: 0.4879 = 2238 / 4588

Misclassification error rate: 0.08259 = 380 / 4601

The correct way to extract what we want is
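A minimal sketch of one way to do this, assuming the tree is fit as above and that `spam` comes from the kernlab package. The object returned by `summary()` on a tree fit keeps the split variables in its `used` component as a factor, so `as.character()` turns it into a plain character vector. As far as I can tell, `used` is only filled in when the tree splits on a subset of the predictors, so the sketch also reads the same names from the tree's `frame`, where leaf rows are labelled "<leaf>".

```r
library(tree)     # tree() and its summary method
library(kernlab)  # assumption: source of the spam data frame
data(spam)

spam_tree_def <- tree(type ~ ., data = spam)

# summary() returns a list; `used` holds the split variables as a factor
used_vars <- as.character(summary(spam_tree_def)$used)
print(used_vars)

# Cross-check: non-leaf rows of the fitted tree's frame name the split variables
split_rows <- spam_tree_def$frame$var != "<leaf>"
frame_vars <- unique(as.character(spam_tree_def$frame$var[split_rows]))
print(frame_vars)
```

For the spam tree, `used_vars` prints: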


[1] "charDollar" "remove" "charExclamation" "hp" "capitalLong" "our"

 [7] "capitalAve" "free" "george" "edu"
