SVM is not only applicable to classification but also to regression. The steps are mostly the same as in the classification case; we just need a very slight modification to the syntax.
##### Load Goodies #####
library(e1071)
library(tidyverse)
Again, we will use the diamonds dataset, as we did in the classification example.
##### Load Data #####
data <- diamonds

##### Glimpse #####
glimpse(data)

##### Subset #####
data <- data[1:500, ]
If we’d like to use the default settings, we can use svm().
##### Fit #####
svm_1 <- svm(price ~ .,
             data = data
             #kernel = 'radial',
             #cost = 1,
             #gamma = 1,
             #epsilon = 1
             )

summary(svm_1)
> summary(svm_1)

Call:
svm(formula = price ~ ., data = data)

Parameters:
   SVM-Type:  eps-regression
 SVM-Kernel:  radial
       cost:  1
      gamma:  0.04166667
    epsilon:  0.1

Number of Support Vectors:  103
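As a quick sanity check on this default fit, we could also look at its in-sample predictions. The snippet below is just a sketch (the object name pred_1 and the RMSE check are not part of the original walkthrough); it assumes we simply want a rough error on the same 500 rows we trained on.

##### Quick Check (sketch) #####
pred_1 <- predict(svm_1, data)          # in-sample predictions from the default fit
sqrt(mean((data$price - pred_1)^2))     # rough in-sample RMSE, for orientation only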
But if we would like to try combinations of the hyperparameters, we can use tune() as we did in classification. The only small difference is that we also need to supply a range for epsilon, since it is used in the regression calculation.
##### Tune #####
set.seed(1)
svm_tune <- tune(svm, price ~ ., data = data,
                 kernel = "radial",
                 ranges = list(cost = seq(0.1, 1, 0.1),
                               gamma = seq(0.1, 1, 0.1),
                               epsilon = seq(0.1, 1, 0.1)))

summary(svm_tune)
> summary(svm_tune)

Parameter tuning of ‘svm’:

- sampling method: 10-fold cross validation

- best parameters:
 cost gamma epsilon
    1   0.1     0.1

- best performance: 13669.51

- Detailed performance results:
   cost gamma epsilon    error dispersion
1   0.1   0.1     0.1 32058.04  14952.548
2   0.2   0.1     0.1 21044.66   9624.365
3   0.3   0.1     0.1 17604.88   8021.423
4   0.4   0.1     0.1 16243.15   7311.850
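If we prefer to pull the winning settings out programmatically instead of reading them off the summary, the tune object stores them directly; a minimal sketch:

##### Best Parameters (sketch) #####
svm_tune$best.parameters    # data frame with the chosen cost, gamma and epsilon
svm_tune$best.performance   # cross-validated error of the best combination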
svm_tune$best.model
> svm_tune$best.model

Call:
best.tune(method = svm, train.x = price ~ ., data = data,
    ranges = list(cost = seq(0.1, 1, 0.1), gamma = seq(0.1, 1, 0.1),
    epsilon = seq(0.1, 1, 0.1)), kernel = "radial")

Parameters:
   SVM-Type:  eps-regression
 SVM-Kernel:  radial
       cost:  1
      gamma:  0.1
    epsilon:  0.1

Number of Support Vectors:  97
Next, we fit a separate model with svm() using the best parameters, since predict() cannot be applied to the tune object itself.
##### Refit #####
svm_2 <- svm(price ~ ., data = data, cost = 1, gamma = 0.1, epsilon = 0.1, kernel = 'radial')

summary(predict(svm_2, data))
> summary(predict(svm_2, data))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    277    2661    2749    2225    2799    2963
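As a side note, the tuned object already carries the refitted best model in svm_tune$best.model, which is an ordinary svm object, so an alternative to refitting by hand is to predict from it directly. This is only a sketch of that alternative; it should produce the same kind of summary as above.

##### Alternative (sketch) #####
summary(predict(svm_tune$best.model, data))   # predict straight from the stored best model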