Chapter 9 - Support Vector Machines
Lab Solution
1 Problem 8
(a). Create a training set containing a random sample of 800 observations, and a test set containing
the remaining observations.
library(ISLR)
set.seed(1)
train = sample(dim(OJ)[1], 800)
OJ.train = OJ[train, ]
OJ.test = OJ[-train, ]
(b). Fit a support vector classifier to the training data using cost=0.01, with Purchase as the response and the other variables as predictors. Use the summary() function to produce summary statistics, and describe the results obtained.
library(e1071, quietly = TRUE)
svm.linear = svm(Purchase ~ ., kernel = "linear", data = OJ.train, cost = 0.01)
summary(svm.linear)
##
## Call:
## svm(formula = Purchase ~ ., data = OJ.train, kernel = "linear", cost = 0.01)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: linear
## cost: 0.01
## gamma: 0.05555556
##
## Number of Support Vectors: 432
##
## ( 215 217 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
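The small cost produces a wide margin: 432 of the 800 training observations end up as support vectors, split almost evenly between the two classes (215 for CH, 217 for MM).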
(c). What are the training and test error rates?

train.pred = predict(svm.linear, OJ.train)
table(OJ.train$Purchase, train.pred)

##    train.pred
##      CH  MM
##   CH 439  55
##   MM  78 228

test.pred = predict(svm.linear, OJ.test)
table(OJ.test$Purchase, test.pred)

##    test.pred
##      CH  MM
##   CH 141  18
##   MM  31  80
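A short way to turn these tables into the requested error rates (a sketch reusing the predictions computed above):

# proportion of misclassified observations in each set
mean(train.pred != OJ.train$Purchase)  # (55 + 78) / 800 = 0.16625
mean(test.pred != OJ.test$Purchase)    # (18 + 31) / 270 ~ 0.1815

So the training error rate is about 16.6% and the test error rate about 18.1%.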

(d). Use the tune() function to select an optimal cost. Consider values in the range 0.01 to 10.

tune.out = tune(svm, Purchase ~ ., data = OJ.train, kernel = "linear",
                ranges = list(cost = 10^seq(-2, 1, by = 0.25)))
summary(tune.out)

## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
##        cost
##  0.05623413
##
## - best performance: 0.16
##
## - Detailed performance results:
##           cost   error
## 1   0.01000000 0.16625
## 2   0.01778279 0.16625
## 3   0.03162278 0.16375
## 4   0.05623413 0.16000
## 5   0.10000000 0.16250
## 6   0.17782794 0.16250
## 7   0.31622777 0.16125
## 8   0.56234133 0.16625
## 9   1.00000000 0.16875
## 10  1.77827941 0.16750
## 11  3.16227766 0.16375
## 12  5.62341325 0.16750
## 13 10.00000000 0.16500
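Cross-validation therefore selects cost ≈ 0.056, with an estimated error rate of 16%, a small improvement over cost = 0.01.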

(e). Compute the training and test error rates using this new value for cost.
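The refit at the tuned cost is omitted in this copy; a minimal sketch, mirroring parts (b) and (c) with the cross-validated cost from part (d) (output not reproduced here):

svm.linear = svm(Purchase ~ ., kernel = "linear", data = OJ.train,
                 cost = tune.out$best.parameters$cost)
train.pred = predict(svm.linear, OJ.train)  # training confusion matrix
table(OJ.train$Purchase, train.pred)
test.pred = predict(svm.linear, OJ.test)    # test confusion matrix
table(OJ.test$Purchase, test.pred)

(f). Repeat parts (b) through (e) using a support vector machine with a radial kernel. Use the default value for gamma.

The fit that defines svm.radial below is likewise omitted here; a minimal sketch, again mirroring part (b), with the default gamma:

svm.radial = svm(Purchase ~ ., data = OJ.train, kernel = "radial")
summary(svm.radial)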

test.pred = predict(svm.radial, OJ.test)
table(OJ.test$Purchase, test.pred)

##    test.pred
##      CH  MM
##   CH 141  18
##   MM  28  83
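With the default parameters, the radial kernel's test error rate is (18 + 28)/270 ≈ 17.0%.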

tune.out = tune(svm, Purchase ~ ., data = OJ.train, kernel = "radial",
                ranges = list(cost = 10^seq(-2, 1, by = 0.25)))
summary(tune.out)

## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
##       cost
##  0.3162278
##
## - best performance: 0.16875
##
## - Detailed performance results:
##           cost   error
## 1   0.01000000 0.38250
## 2   0.01778279 0.38250
## 3   0.03162278 0.38000
## 4   0.05623413 0.20750
## 5   0.10000000 0.18250
## 6   0.17782794 0.17875
## 7   0.31622777 0.16875
## 8   0.56234133 0.17000
## 9   1.00000000 0.17625
## 10  1.77827941 0.17250
## 11  3.16227766 0.17500
## 12  5.62341325 0.17250
## 13 10.00000000 0.18250

svm.radial = svm(Purchase ~ ., data = OJ.train, kernel = "radial",
                 cost = tune.out$best.parameters$cost)
train.pred = predict(svm.radial, OJ.train)
table(OJ.train$Purchase, train.pred)

##    train.pred
##      CH  MM
##   CH 448  46
##   MM  78 228

test.pred = predict(svm.radial, OJ.test)
table(OJ.test$Purchase, test.pred)

##    test.pred
##      CH  MM
##   CH 144  15
##   MM  29  82
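At the tuned cost, the radial kernel's training error is (46 + 78)/800 = 15.5% and its test error is (15 + 29)/270 ≈ 16.3%, a modest improvement over the linear classifier from part (c).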

(g). Repeat parts (b) through (e) using a support vector machine with a polynomial kernel. Set degree=2.

svm.poly = svm(Purchase ~ ., data = OJ.train, kernel = "poly", degree = 2)
summary(svm.poly)

##
## Call:
## svm(formula = Purchase ~ ., data = OJ.train, kernel = "poly", degree = 2)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: polynomial
## cost: 1
## degree: 2
## gamma: 0.05555556
## coef.0: 0
##
## Number of Support Vectors: 454
##
## ( 224 230 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
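The quadratic kernel uses 454 support vectors (224 for CH, 230 for MM), slightly more than the 432 required by the linear classifier.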

train.pred = predict(svm.poly, OJ.train)
table(OJ.train$Purchase, train.pred)

##    train.pred
##      CH  MM
##   CH 461  33
##   MM 105 201

test.pred = predict(svm.poly, OJ.test)
table(OJ.test$Purchase, test.pred)

##    test.pred
##      CH  MM
##   CH 149  10
##   MM  41  70
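That is a training error of (33 + 105)/800 = 17.25% and a test error of (10 + 41)/270 ≈ 18.9%; at the default cost, the quadratic kernel misclassifies noticeably more MM purchases than the linear fit.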

tune.out = tune(svm, Purchase ~ ., data = OJ.train, kernel = "poly", degree = 2,
                ranges = list(cost = 10^seq(-2, 1, by = 0.25)))
summary(tune.out)

## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation