[R-sig-ME] Help understanding an error Line Search Fails
Bill Poling
2018-12-05 17:45:10 UTC
Good afternoon. I hope I have provided enough information to get my question answered.
I am running Windows 10, R 3.5.1, and RStudio 1.1.456.
Using the caret package, I have been comparing models on my data, a training subset with N = 17357.

I have run PLS, RDA, GLM, and Boosted Logit models, based on a couple of tutorials:

http://dataaspirant.com/2017/01/19/support-vector-machine-classifier-implementation-r-caret-package/

https://cran.r-project.org/web/packages/caret/vignettes/caret.html

https://topepo.github.io/caret/model-training-and-tuning.html

However, when I try svmLinear or svmRadial, both produce this error: line search fails -1.614732 -0.257144 0.00001920624 0.00001369617 -0.00000001857456 -0.00000001542947 -0.000000000000568072

I have done some searching but cannot find a definitive answer as to why these two models fail on my data when the other models run. The closest matches I found are:

https://stackoverflow.com/questions/43267209/line-search-fails-when-training-a-model-using-caret

https://stackoverflow.com/questions/15895897/line-search-fails-in-training-ksvm-prob-model



Any advice would be appreciated.

Thank you

WHP

str(training)

# 'data.frame': 17357 obs. of 7 variables:
# $ SavingsReversed: num 0 0 0 0 0 ...
# $ productID : num 3 3 3 3 3 1 3 3 3 1 ...
# $ ProviderID : num 113676 114278 114278 114278 114278 ...
# $ ModCnt : num 0 1 1 1 1 1 1 0 0 1 ...
# $ B2 : num -1 -1 -1 -1 -1 -1 7 9 9 -1 ...
# $ B1a : num 1 1 1 1 1 1 26 26 26 3 ...
# $ EditnumberI : Factor w/ 2 levels "Bad","Good": 1 2 2 2 2 2 1 1 2 2 ...


head(training, n=25)

# SavingsReversed productID ProviderID ModCnt B2 B1a EditnumberI
# 1 0.00 3 113676 0 -1 1 Bad
# 5 0.00 3 114278 1 -1 1 Good
# 6 0.00 3 114278 1 -1 1 Good
# 7 0.00 3 114278 1 -1 1 Good
# 8 0.00 3 114278 1 -1 1 Good
# 10 0.00 1 114278 1 -1 1 Good
# 12 128.25 3 116641 1 7 26 Bad
# 13 159.60 3 116641 0 9 26 Bad
# 14 0.00 3 116641 0 9 26 Good
# 15 0.00 1 117280 1 -1 3 Good
# 16 1622.55 3 117439 1 9 26 Good
# 17 60.07 3 117439 1 9 26 Good
# 18 0.00 3 117439 0 -1 3 Good
# 19 190.00 3 117962 0 9 26 Good
# 20 372.66 3 119316 0 1 26 Bad
# 22 0.00 3 120431 1 -1 1 Good
# 25 0.00 3 121319 1 7 26 Bad
# 26 18.79 3 121319 1 7 26 Bad
# 27 23.00 3 121319 1 7 26 Bad
# 28 18.79 3 121319 1 7 26 Bad
# 29 0.00 3 121319 1 7 26 Bad
# 30 25.86 3 121319 2 7 26 Bad
# 31 14.00 3 121319 1 7 26 Bad
# 36 113.00 3 121545 1 1 26 Bad
# 37 197.20 3 121545 1 9 26 Bad


anyNA(training)
#[1] FALSE

My scripts

library(caret)

# Repeated cross-validation, scored by ROC via class probabilities
ctrl <- trainControl(
  method = "repeatedcv",
  repeats = 3,
  classProbs = TRUE,
  summaryFunction = twoClassSummary
)

set.seed(123)
svm_Linear <- train(EditnumberI ~ ., data = training,
                    method = "svmLinear",
                    trControl = ctrl,
                    preProcess = c("center", "scale"),
                    tuneLength = 10,
                    metric = "ROC")
#warnings()
svm_Linear



set.seed(123)
svm_Radial <- train(EditnumberI ~ ., data = training,
                    method = "svmRadial",
                    trControl = ctrl,
                    preProcess = c("center", "scale"),
                    tuneLength = 10,
                    metric = "ROC")
#warnings()
svm_Radial



line search fails -1.614732 -0.257144 0.00001920624 0.00001369617 -0.00000001857456 -0.00000001542947 -0.000000000000568072



WHP



Ben Bolker
2018-12-05 17:48:04 UTC
As far as I can tell none of the model types you're using fall under
the category of "mixed models" (linear/generalized linear models with
data identified in known groups that are to be estimated by some form
of shrinkage estimator/"random effect"). (Please feel free to correct me!)

By the way, I don't think it makes any sense to use "ProviderID" as a
*numeric* predictor variable ... that (and productID) are places where
you *might* actually want to use a mixed model.
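
A minimal sketch of what treating the ID columns as categorical could look like, in base R. The toy data frame below is made up to mirror the column names in the str() output; it is not the poster's data, and whether this recoding changes the SVM behaviour is untested here.

```r
# Made-up stand-in for the poster's `training` data frame; only the
# column names come from the post, the values are illustrative.
training <- data.frame(
  productID  = c(3, 3, 1, 3),
  ProviderID = c(113676, 114278, 114278, 116641),
  ModCnt     = c(0, 1, 1, 1)
)

# Recode the ID columns as factors so they act as labels, not magnitudes.
id_cols <- c("productID", "ProviderID")
training[id_cols] <- lapply(training[id_cols], factor)

# ProviderID and productID are now factors; ModCnt stays numeric.
str(training)
nlevels(training$ProviderID)
```

With ProviderID stored as a factor, a mixed model could treat it as a grouping variable, for example through a random-intercept term such as (1 | ProviderID) in lme4::glmer(). That is a direction to explore, not something fitted in this thread.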

This looks like more of a CrossValidated question - note that you'll
have to provide a *reproducible* example in order to get help ...
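
One common way to make an example reproducible is to post the output of dput() on a small slice of the data, so that others can paste it straight into R. A rough sketch, using a made-up stand-in for `training` (real IDs would need to be anonymized before posting):

```r
# Made-up stand-in for the poster's data; only the column names are real.
training <- data.frame(
  SavingsReversed = c(0, 128.25, 159.6),
  ProviderID      = c(113676, 116641, 116641),
  EditnumberI     = factor(c("Bad", "Good", "Bad"), levels = c("Bad", "Good"))
)

# dput() writes R code that rebuilds the object; paste that text into the post.
snippet <- capture.output(dput(head(training, 3)))
cat(snippet, sep = "\n")

# A reader recreates the data by evaluating the pasted text.
restored <- eval(parse(text = snippet))
str(restored)
```

A slice of a few dozen rows that still triggers the error, together with the exact train() call and sessionInfo(), is usually enough for someone else to rerun the problem.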

cheers
Ben Bolker