Hello everybody,
I was wondering if it is possible to use the partition component in combination with a custom R component.
The goal is to analyse the results with the model statistics component afterwards.
Unfortunately, when I use the Partition component, every custom R component throws the error
"The output variable is missing", although the same component works fine without the partition.
I am aware of the workaround: train on a sample, save the model, and then apply it to a test set.
Still, I think partition support could be interesting for other R users too, and it is already implemented in the SAP PA R algorithms.
I would be grateful for any advice.
Here is the R script, which tries to use C5.0 with the training, test and validation sets created by the partition:
c50Function <- function(mydata, IndependentColumns, DependentColumns) {
  library(C50)
  if (is.null(mydata$PartitionValues)) {
    # No partition column: train and score on the full data set
    C50model <- C5.0(x = mydata[, IndependentColumns], y = as.factor(mydata[, DependentColumns]))
    predicted <- predict(C50model, mydata[, IndependentColumns], type = "prob")
    predictedClass <- predict(C50model, mydata[, IndependentColumns], type = "class")
    # Append the class probabilities and the predicted class to the input data
    out <- cbind(mydata, predicted)
    out <- cbind(out, predictedClass)
    return(list(result = out, modelname = C50model))
  } else {
    # Partition column present: 1 = training, 2 = test, 3 = validation
    trainingSet <- mydata[mydata$PartitionValues == 1, ]
    testSet <- mydata[mydata$PartitionValues == 2, ]
    validSet <- mydata[mydata$PartitionValues == 3, ]
    # Train on the training partition only
    C50model <- C5.0(x = trainingSet[, IndependentColumns], y = as.factor(trainingSet[, DependentColumns]))
    # Score each partition separately
    predictedClass <- predict(C50model, trainingSet[, IndependentColumns], type = "class")
    predictedTest <- predict(C50model, testSet[, IndependentColumns], type = "class")
    predictedValid <- predict(C50model, validSet[, IndependentColumns], type = "class")
    # Concatenate the predictions in partition order (training, test, validation)
    predictedAll <- c(predictedClass, predictedTest)
    predictedAll <- c(predictedAll, predictedValid)
    out <- cbind(mydata, predictedAll)
    return(list(result = out, modelname = C50model))
  }
}
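A possible restructuring of the function above, offered as a sketch rather than a confirmed fix for the "output variable is missing" error. It addresses three things that can be checked in plain R. First, the partition branch concatenates predictions in partition order, which only matches the rows of mydata if the data happens to be sorted by PartitionValues. Second, c() applied to factors returns integer codes instead of class labels in R versions before 4.1.0. Third, the two branches return differently named output columns. The sketch trains on partition 1 and scores all rows in a single call, so both branches return the same column. The function name c50Partitioned and the column name PredictedClass are hypothetical, and the partition coding 1/2/3 is taken from the script above; the output column name would have to match whatever output variable is configured in the custom R component.

```r
library(C50)

# Sketch only: "c50Partitioned" and "PredictedClass" are illustrative names.
# Assumes PartitionValues (if present) has no missing values and is not
# listed in IndependentColumns.
c50Partitioned <- function(mydata, IndependentColumns, DependentColumns) {
  # Train on partition 1 when a partition column exists, otherwise on all rows.
  if (is.null(mydata$PartitionValues)) {
    trainRows <- rep(TRUE, nrow(mydata))
  } else {
    trainRows <- mydata$PartitionValues == 1
  }

  # drop = FALSE keeps a data frame even when there is only one predictor.
  C50model <- C5.0(
    x = mydata[trainRows, IndependentColumns, drop = FALSE],
    y = as.factor(mydata[trainRows, DependentColumns])
  )

  # Score every row in one call so predictions stay aligned with mydata's
  # row order and remain a factor with the model's class levels.
  out <- mydata
  out$PredictedClass <- predict(
    C50model,
    mydata[, IndependentColumns, drop = FALSE],
    type = "class"
  )

  list(result = out, modelname = C50model)
}
```

With this shape, the test and validation rows can still be separated afterwards by filtering the result on PartitionValues, since that column is passed through unchanged.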
Kind regards,
Heiko