Binning Component

I noticed that SAP PA has no standard component for binning, even though binning is needed in many kinds of analysis.

Many statistical analyses call for categorical rather than continuous variables. In credit scoring models especially, continuous variables are often transformed into categorical ones for better analysis. With big data, analysis can also be faster when we use categorical variables instead of continuous ones. The process of converting a continuous variable into a categorical one is called binning. In simpler words, binning is a way to group a number of more or less continuous values into a smaller number of "bins". For example, if you have data about a group of people, you might want to arrange their ages into a smaller number of age intervals. The component below lets you bin a continuous variable into n bins. Note that it relies on R's cut() with a single number for breaks, which makes the bins equal in width (each covers the same span of values), not equal in number of observations.
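To make the two common binning styles concrete, here is a minimal R sketch on made-up ages. The `ages` vector and all variable names are illustrative assumptions, not part of the component. `cut()` with a single number for `breaks` gives equal-width intervals, while passing quantile cut points gives bins with roughly equal numbers of observations.

```r
# Illustrative data only: 12 hypothetical ages
ages <- c(18, 19, 21, 22, 25, 31, 38, 44, 52, 60, 71, 90)
n_bins <- 4

# Equal-width bins: the range of ages is split into 4 intervals of the same width
equal_width <- cut(ages, breaks = n_bins, include.lowest = TRUE)

# Equal-frequency bins: cut points are the 0%, 25%, 50%, 75% and 100% quantiles
q_breaks <- quantile(ages, probs = seq(0, 1, length.out = n_bins + 1))
equal_freq <- cut(ages, breaks = q_breaks, include.lowest = TRUE)

# Number of observations that land in each bin under each approach
width_counts <- table(equal_width)
freq_counts <- table(equal_freq)
print(width_counts)
print(freq_counts)
```

On this data the equal-width bins hold 6, 3, 2 and 1 ages, while each quantile-based bin holds 3. Quantile cut points can coincide when the data has many ties, in which case `cut()` will reject the duplicated breaks, so treat this as a sketch rather than a drop-in replacement.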

I have used sample data from a credit card client. The client has assigned scores to customers based on credit limit and now wants to count the customers in each score range. Based on the ranges, I went ahead and created a mosaic plot. We can further use this data in our predictive algorithms to predict the score range of a new customer from other variables. You can modify the code as needed.

Setting up the component:

Column to be Categorized: The continuous variable that you want to convert to a categorical variable. It must be numeric.

Number of Categories: The number of categories that will be created for the continuous variable above.
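The two settings above could be checked before any binning happens. The helper below is a hypothetical sketch (`check_bin_inputs` is not part of the component, and how SAP PA actually passes parameters to R is not shown here); it assumes the number of categories may arrive as text, which is why it is converted first.

```r
# Hypothetical helper, not part of the original component
check_bin_inputs <- function(mydata, BinColumnStr, numBrks) {
  # The named column must exist in the input data
  if (!BinColumnStr %in% names(mydata)) {
    stop("Column not found: ", BinColumnStr)
  }
  # The column to be categorized needs to be numeric
  if (!is.numeric(mydata[[BinColumnStr]])) {
    stop("Column to be categorized must be numeric: ", BinColumnStr)
  }
  # The number of categories may be passed as text; convert and validate it
  n <- suppressWarnings(as.integer(numBrks))
  if (is.na(n) || n < 2) {
    stop("Number of categories must be a whole number of at least 2")
  }
  n
}

# Usage on made-up data
demo <- data.frame(Score = c(1.5, 2, 3), Region = c("A", "B", "A"))
n_categories <- check_bin_inputs(demo, "Score", "4")
print(n_categories)
```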

Output:

As seen below, the variable (Score) is now split into 4 categories, each covering a score range of equal width.

CODE:

mymain <- function(mydata, BinColumnStr, numBrks)
{
  ## Package required for creating the mosaic plot
  library(vcd)

  ## Capture the column that needs to be categorized
  mycolumn <- mydata[, BinColumnStr]

  ## Create the categories. With a single number for breaks, cut() splits
  ## the range of the column into that many equal-width intervals.
  mydata$Category <- cut(mycolumn, breaks = as.numeric(numBrks), include.lowest = TRUE)

  ## Tabulate the categories for the mosaic plot
  ## (the input data must contain the columns Count and Region)
  output1 <- xtabs(Count ~ Region + Category, data = mydata)

  ## Aggregate the count by Region and Category
  myaggregation <- aggregate(Count ~ Region + Category, data = mydata, FUN = sum)
  output <- data.frame(myaggregation$Region, myaggregation$Category, myaggregation$Count)

  ## Create the mosaic plot
  mosaic(output1, shade = TRUE)

  return(list(out = output))
}
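To see what the tabulation and aggregation steps produce, here is a small sketch on made-up data. The `toy` data frame and its values are assumptions for illustration only. Its column names mirror the ones the code above expects (Score, Region, Count), and the mosaic plot is left out so the example runs with base R alone.

```r
# Made-up data: column names mirror those the component expects
toy <- data.frame(
  Score  = c(300, 420, 510, 640, 700, 850),
  Region = c("East", "West", "East", "West", "East", "West"),
  Count  = c(10, 5, 8, 12, 7, 3)
)

# Same binning call as in the component, here with 3 categories
toy$Category <- cut(toy$Score, breaks = 3, include.lowest = TRUE)

# Region x Category table of summed counts (the input to the mosaic plot)
tab <- xtabs(Count ~ Region + Category, data = toy)

# Long-format totals, one row per Region/Category combination in the data
agg <- aggregate(Count ~ Region + Category, data = toy, FUN = sum)

print(tab)
print(agg)
```

The table `tab` is what `mosaic()` would draw, while `agg` holds the same totals in the row-per-combination layout that the component returns.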

Please leave a comment if there is anything I could add to this code.
