Channel: SCN : All Content - SAP BusinessObjects Predictive Analytics

Custom R Component - Logistic Regression

This component adds a logistic regression to SAP Predictive Analysis.

 

A confusion matrix is shown automatically when training or testing the model. When the model is applied to data for which the actual classification is not known, a frequency plot of the predicted classification is displayed instead.

 

logistic01.JPG

Disclaimer

Please note that this component is provided as-is without any guarantee or support.

 

Prerequisites

The classifier column must contain only the values 0 and 1.

The column names must not include a minus sign.

R libraries lattice, ggplot2 and caret have to be installed.
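
The prerequisites above can be checked in R before the data reaches the component. The following is a minimal sketch, not part of the component itself: the helper name `check_prerequisites` and its arguments are hypothetical, and it only assumes that the data is available as a data frame.

```r
# Hypothetical helper: checks a data frame against the prerequisites listed above.
# Returns a character vector of problems; an empty vector means all checks passed.
check_prerequisites <- function(data, classifier_col,
                                required_pkgs = c("lattice", "ggplot2", "caret")) {
  problems <- character(0)

  # The classifier column must exist and contain only the values 0 and 1.
  if (!classifier_col %in% names(data)) {
    problems <- c(problems, paste0("column '", classifier_col, "' not found"))
  } else if (!all(data[[classifier_col]] %in% c(0, 1))) {
    problems <- c(problems, "classifier column contains values other than 0 and 1")
  }

  # Column names must not include a minus sign.
  bad_names <- grep("-", names(data), fixed = TRUE, value = TRUE)
  if (length(bad_names) > 0) {
    problems <- c(problems, paste("minus sign in column name:", bad_names))
  }

  # The required R libraries must be installed.
  for (pkg in required_pkgs) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
      problems <- c(problems, paste("package not installed:", pkg))
    }
  }

  problems
}
```

Calling `check_prerequisites(my_data, "income")` on your own data frame returns one message per violated prerequisite, so missing values in the classifier column are reported as well.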

 

Usage

These parameters can be set by the user.

Parameter | Description
Predictor Columns | Names of the predictor columns.
Classifier Column | Name of the target column. Must contain only the values 0 and 1.
Configuration | Optional R configuration for the logistic regression.
Classification Threshold | Threshold for classifying either 0 or 1. Default is 0.5.

 

Output Columns added by this Component

Column | Description
PredictedProbability | Predicted probability. Value between 0 and 1.
PredictedValue | Predicted value based on the classification threshold and the predicted probability.
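
To illustrate how the parameters and the two output columns relate, here is a minimal sketch in base R. It is not the component's actual source: the toy data, the variable names and the use of `glm` are assumptions, and whether the component treats a probability exactly equal to the threshold as 0 or 1 is not stated in this article (the sketch uses `>=`).

```r
# Toy data, invented for illustration only.
set.seed(1)
train <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
train$target <- rbinom(200, 1, plogis(1.5 * train$x1 - train$x2))

# Stand-ins for the component parameters.
predictor_cols <- c("x1", "x2")
classifier_col <- "target"
threshold <- 0.5  # default named in the parameter table

# Build the formula target ~ x1 + x2 and fit a logistic regression.
model_formula <- reformulate(predictor_cols, response = classifier_col)
model <- glm(model_formula, data = train, family = binomial())

# PredictedProbability: fitted probability of class 1, between 0 and 1.
train$PredictedProbability <- predict(model, newdata = train, type = "response")

# PredictedValue: 1 if the probability reaches the threshold, else 0
# (the >= comparison is an assumption, see the note above).
train$PredictedValue <- as.integer(train$PredictedProbability >= threshold)

head(train)
```

Lowering the threshold classifies more rows as 1, which typically trades fewer missed positives for more false positives.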

 

 

How to Implement

The component is attached to this article. Download and unzip the file. You will see a text file. Rename the file's .txt extension to .zip and unpack the new file as well. The content of the .zip file is the Custom R Component. These steps are needed because SCN does not allow the component's original file type to be attached.

 

Then deploy the component as described here. You just need to copy the unpacked content into the folder described in that article and restart SAP Predictive Analysis.

 

Example

If you want to try this logistic regression on some sample data, you can use the Adult dataset as used in the article on the Naive Bayes Algorithm. Just remember that the column names must not include a minus sign, and make sure to transform the target column into binary 0 and 1 coding.
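
The two preparation steps can be sketched in R as follows. The rows are invented, and the column names and the labels "<=50K" and ">50K" follow the UCI version of the Adult dataset; treat them as assumptions about the file you actually use.

```r
# Small stand-in for the Adult data (invented rows, UCI-style column names).
adult <- data.frame(
  age = c(39, 50, 38, 52),
  `education-num` = c(13, 13, 9, 14),
  `hours-per-week` = c(40, 13, 40, 45),
  income = c("<=50K", "<=50K", "<=50K", ">50K"),
  check.names = FALSE,       # keep the minus signs so the fix below is visible
  stringsAsFactors = FALSE
)

# Replace minus signs in the column names with underscores.
names(adult) <- gsub("-", "_", names(adult), fixed = TRUE)

# Recode the target column as binary 0/1 (1 = income above 50K).
adult$income <- as.integer(trimws(adult$income) == ">50K")

str(adult)
```

The `trimws` call guards against leading spaces in the labels, which can appear when the raw file is read with a plain comma separator.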

 

logistic02.JPG

 

Configure the component appropriately. To get started you only have to set the predictor and classifier columns. For the remaining settings you can keep the default values.

logistic03.JPG

 

Run the model and you can see the predicted values either as raw data or in the embedded confusion matrix.

logistic04.JPG

logistic05.JPG

 

Now save the trained model. Then add it as an additional component to the testing branch of the analytical flow.

 

logistic06.JPG

 

Execute the component, then open the "Custom Chart" in the "Results" panel and you will see that another confusion matrix has been created. The component automatically detected that the real classification is already known: if the classifier column (as specified when training the model) exists in the dataset, the component assumes that it is being tested on already classified data. It therefore displays the confusion matrix to help evaluate the model's performance.

 

logistic08.JPG

 

When the model is applied to new data for which the real classification is not known, the component displays a frequency plot of the predictions instead.

logistic09.JPG
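
The switch between the two outputs described above can be sketched in base R. This is an illustration of the idea, not the component's actual code: the helper name `evaluate_predictions` is hypothetical, and it returns plain tables rather than charts.

```r
# Hypothetical helper: chooses the evaluation output depending on whether the
# classifier column is present in the scored data.
evaluate_predictions <- function(scored, classifier_col) {
  classes <- c(0, 1)
  predicted <- factor(scored$PredictedValue, levels = classes)

  if (classifier_col %in% names(scored)) {
    # Real classification is known: cross-tabulate predicted against actual.
    actual <- factor(scored[[classifier_col]], levels = classes)
    list(type = "confusion_matrix",
         result = table(Predicted = predicted, Actual = actual))
  } else {
    # Real classification is unknown: count the predictions per class.
    list(type = "frequency",
         result = table(Predicted = predicted))
  }
}

# Test data that still contains the classifier column.
tested <- data.frame(target = c(0, 1, 1, 0), PredictedValue = c(0, 1, 0, 0))
evaluate_predictions(tested, "target")

# New data without the classifier column.
unseen <- data.frame(PredictedValue = c(1, 1, 0))
evaluate_predictions(unseen, "target")
```

With the caret library installed, `caret::confusionMatrix` could be applied to the same two factors to obtain accuracy and related statistics, and `barplot` on the frequency table gives a simple frequency plot.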

