[R-sig-ME] Help understanding an error Line Search Fails
Bill Poling
2018-12-05 17:45:10 UTC
Good afternoon. I hope I have provided enough info to get my question answered.
I am running Windows 10, R 3.5.1, and RStudio 1.1.456.
Using the caret package, I have been comparing models on a training subset of my data (N = 17357).

I have run PLS, RDA, GLM, and Boosted Logit models, following a couple of tutorials.

However, when I try svmLinear or svmRadial, both produce the error: line search fails -1.614732 -0.257144 0.00001920624 0.00001369617 -0.00000001857456 -0.00000001542947 -0.000000000000568072

I have done some searching but cannot find a definitive answer as to why these two models fail on my data when the other models work.



Any advice would be appreciated.

Thank you



str(training)

# 'data.frame': 17357 obs. of 7 variables:
# $ SavingsReversed: num 0 0 0 0 0 ...
# $ productID : num 3 3 3 3 3 1 3 3 3 1 ...
# $ ProviderID : num 113676 114278 114278 114278 114278 ...
# $ ModCnt : num 0 1 1 1 1 1 1 0 0 1 ...
# $ B2 : num -1 -1 -1 -1 -1 -1 7 9 9 -1 ...
# $ B1a : num 1 1 1 1 1 1 26 26 26 3 ...
# $ EditnumberI : Factor w/ 2 levels "Bad","Good": 1 2 2 2 2 2 1 1 2 2 ...

head(training, n=25)

# SavingsReversed productID ProviderID ModCnt B2 B1a EditnumberI
# 1 0.00 3 113676 0 -1 1 Bad
# 5 0.00 3 114278 1 -1 1 Good
# 6 0.00 3 114278 1 -1 1 Good
# 7 0.00 3 114278 1 -1 1 Good
# 8 0.00 3 114278 1 -1 1 Good
# 10 0.00 1 114278 1 -1 1 Good
# 12 128.25 3 116641 1 7 26 Bad
# 13 159.60 3 116641 0 9 26 Bad
# 14 0.00 3 116641 0 9 26 Good
# 15 0.00 1 117280 1 -1 3 Good
# 16 1622.55 3 117439 1 9 26 Good
# 17 60.07 3 117439 1 9 26 Good
# 18 0.00 3 117439 0 -1 3 Good
# 19 190.00 3 117962 0 9 26 Good
# 20 372.66 3 119316 0 1 26 Bad
# 22 0.00 3 120431 1 -1 1 Good
# 25 0.00 3 121319 1 7 26 Bad
# 26 18.79 3 121319 1 7 26 Bad
# 27 23.00 3 121319 1 7 26 Bad
# 28 18.79 3 121319 1 7 26 Bad
# 29 0.00 3 121319 1 7 26 Bad
# 30 25.86 3 121319 2 7 26 Bad
# 31 14.00 3 121319 1 7 26 Bad
# 36 113.00 3 121545 1 1 26 Bad
# 37 197.20 3 121545 1 9 26 Bad

#[1] FALSE

My scripts

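library(caret)

# Resampling setup: 10-fold CV (caret's default) repeated 3 times, with class
# probabilities so twoClassSummary can report ROC, sensitivity, specificity
ctrl <- trainControl(
  method = "repeatedcv",
  repeats = 3,
  classProbs = TRUE,
  summaryFunction = twoClassSummary
)

# Linear-kernel SVM
svm_Linear <- train(EditnumberI ~ ., data = training,
                    method = "svmLinear",
                    trControl = ctrl,
                    preProcess = c("center", "scale"),
                    tuneLength = 10)

# Radial-kernel SVM
svm_Radial <- train(EditnumberI ~ ., data = training,
                    method = "svmRadial",
                    trControl = ctrl,
                    preProcess = c("center", "scale"),
                    tuneLength = 10)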

line search fails -1.614732 -0.257144 0.00001920624 0.00001369617 -0.00000001857456 -0.00000001542947 -0.000000000000568072


Ben Bolker
2018-12-05 17:48:04 UTC
As far as I can tell none of the model types you're using fall under
the category of "mixed models" (linear/generalized linear models with
data identified in known groups that are to be estimated by some form
of shrinkage estimator/"random effect"). (Please feel free to correct me!)

By the way, I don't think it makes any sense to use "ProviderID" as a
*numeric* predictor variable ... that (and productID) are places where
you *might* actually want to use a mixed model.
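
To make the point about identifier columns concrete, here is a minimal sketch in R. It uses a small simulated data frame (`toy`, not the poster's data) that borrows some column names from the post, recodes the ID columns as factors, and, only if the lme4 package happens to be installed, fits a logistic model with a random intercept per provider. The formula and the random-intercept structure are assumptions for illustration, not a recommended final model.

```r
# Toy data borrowing column names from the post; all values are simulated
set.seed(1)
n <- 200
toy <- data.frame(
  SavingsReversed = round(rexp(n, rate = 1 / 50), 2),
  productID       = sample(c(1, 3), n, replace = TRUE),
  ProviderID      = sample(113676:113695, n, replace = TRUE),
  ModCnt          = sample(0:2, n, replace = TRUE),
  EditnumberI     = factor(sample(c("Bad", "Good"), n, replace = TRUE),
                           levels = c("Bad", "Good"))
)

# Identifier codes are labels, not quantities: store them as factors
toy$productID  <- factor(toy$productID)
toy$ProviderID <- factor(toy$ProviderID)

# How many groups are there, and how many rows fall in each group?
n_providers <- nlevels(toy$ProviderID)
rows_per_provider <- table(toy$ProviderID)

# Optional: random intercept per provider (needs lme4; an assumed model,
# not the poster's). Purely random toy data may give a singular-fit message.
if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::glmer(EditnumberI ~ ModCnt + productID + (1 | ProviderID),
                     data = toy, family = binomial)
  print(lme4::VarCorr(fit))
}
```

Treated as a number, ProviderID implies that provider 114278 is "larger" than provider 113676, which has no meaning; as a factor or grouping variable it only says which rows belong together.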

This looks like more of a CrossValidated question - note that you'll
have to provide a *reproducible* example in order to get help ...
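
One way to build such an example, sketched here with simulated data so that nothing confidential has to be shared: create a small data frame with the same column names, fix the random seed, and paste the `dput()` output into the post. The name `repro` and all values below are made up, and the column types only roughly match the original.

```r
set.seed(42)  # fixed seed so others get the same simulated rows
n <- 50
repro <- data.frame(
  SavingsReversed = round(rexp(n, rate = 1 / 50), 2),
  productID       = sample(c(1, 3), n, replace = TRUE),
  ProviderID      = sample(113676:113685, n, replace = TRUE),
  ModCnt          = sample(0:2, n, replace = TRUE),
  B2              = sample(c(-1, 1, 7, 9), n, replace = TRUE),
  B1a             = sample(c(1, 3, 26), n, replace = TRUE),
  EditnumberI     = factor(sample(c("Bad", "Good"), n, replace = TRUE),
                           levels = c("Bad", "Good"))
)

# Text representation that can be pasted into a post
repro_text <- paste(capture.output(dput(repro)), collapse = "\n")

# Round trip: evaluating the pasted text rebuilds the data frame
rebuilt <- eval(parse(text = repro_text))
all.equal(repro, rebuilt)

# Also worth including: R version and attached package versions
sessionInfo()
```

If the error only appears with the real data, a subset that still triggers it, passed through `dput()` in the same way, serves the same purpose.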

Ben Bolker
