[R-sig-ME] Help with split data routine and subsequent predict function --caret and klaR pkgs

Bill Poling

2018-11-27 16:57:14 UTC

R=3.5.1
Windows=10
RStudio Version = 1.1.456

Hello, I am following the data-split routine described at:
https://machinelearningmastery.com/how-to-estimate-model-accuracy-in-r-using-the-caret-package/

When I reach the line "predictions <- predict(model, x_test)" below, I get the following error:
#Error in `[[<-.data.frame`(`*tmp*`, i, value = integer(0)) : replacement has 0 rows, data has 4628

I checked again for the usual culprit, NAs, but there are none.

I thought "tryCatch()" might help, but it isn't working for me either:
#Error: unexpected ')' in "tryCatch({predictions <- predict(model, x_test)}, error = function(e)print(e), warning = function(w))"

Thank you for any insight and direction.

WHP

str(r1a1)
# Classes 'data.table' and 'data.frame': 23141 obs. of 8 variables:
#  $ SavingsReversed: num 0 0 0 0 0 0 0 0 0 0 ...
#  $ productID      : num 3 3 3 3 3 3 3 3 1 1 ...
#  $ ProviderID     : num 113676 113676 113964 113964 114278 ...
#  $ ModCnt         : num 0 0 0 0 1 1 1 1 1 1 ...
#  $ Editnumber2    : num 0 0 1 1 1 1 1 1 1 1 ...
#  $ B2             : num -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
#  $ B1a            : num 1 1 3 3 1 1 1 1 1 1 ...
#  $ PatientGender2 : num 0 0 1 1 1 1 0 0 0 0 ...
#  - attr(*, ".internal.selfref")=<externalptr>

tail(r1a1)
   SavingsReversed productID ProviderID ModCnt Editnumber2 B2 B1a PatientGender2
1:            0.00         3    6266065      0           0  9  26              1
2:           32.61         3    6266065      0           0  9  26              0
3:            0.00         1    6266651      0           1  9  26              1
4:            0.00         3    6270643      2           1  7  26              0
5:            0.00         3    6270643      0           1 -1   3              0
6:            0.00         3    6273280      0           0  9  26              0

# reorder r1a1 so that Editnumber2 (column 5) becomes the first column
r1a2 <- r1a1[,c(5,1,2,3,4,6,7,8)]
str(r1a2)
#Data Split
# define an 80%/20% train/test split of the dataset
split=0.80
trainIndex <- createDataPartition(r1a1$Editnumber2, p=split, list=FALSE)
str(trainIndex) # abbreviated here
#int [1:18513, 1] 2 3 5 7 8 9 10 11 12 14 ...
# - attr(*, "dimnames")=List of 2
# ..$ : NULL
# ..$ : chr "Resample1"
data_train <- r1a1[ trainIndex,]
str(data_train) #abbreviated here
# Classes 'data.table' and 'data.frame': 18513 obs. of 8 variables:
data_test <- r1a1[-trainIndex,]
str(data_test)# abbreviated here
# Classes 'data.table' and 'data.frame': 4628 obs. of 8 variables:

# train a naive bayes model
# install.packages("klaR")
# library(klaR)
model <- naiveBayes(Editnumber2~., data=data_train)
# make predictions
x_test <- data_test[,2:8]
y_test <- data_test[,1]
predictions <- predict(model, x_test)
#Error in `[[<-.data.frame`(`*tmp*`, i, value = integer(0)) : replacement has 0 rows, data has 4628
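
One thing that may be worth checking here, whether or not it is related to the error: the split above is taken from r1a1, where str() shows SavingsReversed as column 1 and Editnumber2 as column 5, while the reordered copy r1a2 is not used afterwards. The sketch below uses a hypothetical toy data frame (only the column names and order are taken from the str() output) to show which columns the positional indices 1 and 2:8 would pick, and how selecting by name avoids depending on column order.

```r
# Hypothetical toy data with the same column names and order as r1a1
toy <- data.frame(
  SavingsReversed = c(0, 32.61, 0),
  productID       = c(3, 3, 1),
  ProviderID      = c(6266065, 6266065, 6266651),
  ModCnt          = c(0, 0, 0),
  Editnumber2     = c(0, 0, 1),
  B2              = c(9, 9, 9),
  B1a             = c(26, 26, 26),
  PatientGender2  = c(1, 0, 1)
)

# Columns picked by position in the original (unreordered) layout
first_col  <- names(toy)[1]     # what [, 1] selects
other_cols <- names(toy)[2:8]   # what [, 2:8] selects
print(first_col)
print(other_cols)

# Selecting by name does not depend on column order
response   <- "Editnumber2"
predictors <- setdiff(names(toy), response)
x_toy <- toy[, predictors]      # for a data.table: toy[, predictors, with = FALSE]
y_toy <- toy[[response]]
```

With this layout, position 1 is SavingsReversed and positions 2:8 still contain Editnumber2, so the response would end up among the predictors.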
row.has.na <- apply(r1a1, 1, function(x){any(is.na(x))})
sum(row.has.na) #48
View(row.has.na)
#[1] 0
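
As a cross-check on the row-wise apply() above, missing values can also be counted with base R helpers that work on the data frame directly. This is a minimal sketch on a hypothetical toy data frame, not on r1a1 itself.

```r
# Hypothetical toy data with two incomplete rows
toy_na <- data.frame(
  a = c(1, NA, 3),
  b = c("x", "y", NA),
  stringsAsFactors = FALSE
)

# Number of rows containing at least one NA
n_incomplete <- sum(!complete.cases(toy_na))

# Number of NAs in each column
na_per_column <- colSums(is.na(toy_na))

print(n_incomplete)
print(na_per_column)
```

Running the same two calls on the training and test subsets separately would show whether either of them contains missing values.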

??tryCatch
tryCatch({predictions <- predict(model, x_test)}, error = function(e)print(e), warning = function(w))
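
For comparison, here is a sketch of a syntactically complete tryCatch() call. Each handler needs a function body; a bare `function(w)` followed by the closing parenthesis is what the parser reports as an unexpected ')'. The function `risky()` is a hypothetical stand-in for the predict() call, so the example runs without any model.

```r
# Hypothetical stand-in for the failing predict() call
risky <- function() stop("replacement has 0 rows, data has 4628")

result <- tryCatch(
  {
    risky()
  },
  warning = function(w) {
    message("warning caught: ", conditionMessage(w))
    NULL   # value returned when a warning is caught
  },
  error = function(e) {
    message("error caught: ", conditionMessage(e))
    NULL   # value returned when an error is caught
  }
)

print(is.null(result))
```

Note that tryCatch() only intercepts the condition; it does not repair whatever caused predict() to fail.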

#NOT RUN
# summarize results
confusionMatrix(Editnumber2, y_test)
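
For the summary step, caret's confusionMatrix() expects the predicted classes and the reference classes as factors with the same levels. The base R sketch below shows the same idea with table(), using hypothetical predicted and actual vectors in place of real model output.

```r
# Hypothetical predicted and observed classes, coded as factors with shared levels
predicted <- factor(c(0, 1, 1, 0, 1), levels = c(0, 1))
actual    <- factor(c(0, 1, 0, 0, 1), levels = c(0, 1))

# Cross-tabulate predictions against the observed classes
cm <- table(Predicted = predicted, Actual = actual)
print(cm)

# Overall accuracy: share of cases on the diagonal
accuracy <- sum(diag(cm)) / sum(cm)
print(accuracy)
```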
