A user wants to calculate the Gini index for each row of a data set. Each row is a customer and each column holds one month of session data, so the goal is to add a column containing each customer's Gini index across the 12 months. Below is the code he implemented:

Asked by NikitaGavde in Data Science on Nov 17, 2019
Answered by Nikita Gavde

Gini_index <- apply(DT_file[,c('sessions_201607_pct','sessions_201608_pct', 'sessions_201609_pct','sessions_201610_pct','sessions_201611_pct','sessions_201612_pct','sessions_201701_pct','sessions_201702_pct','sessions_201703_pct','sessions_201704_pct','sessions_201705_pct','sessions_201706_pct')], 1, gini)

He gets the following error:

Error in match.fun(FUN) : object 'gini' not found

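The error means R cannot find a function called gini: nothing in the session defines one under that name. The ineq package provides ineq(), which returns the Gini coefficient when called with type = 'Gini'. To compute the coefficient for each column: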

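library(ineq)

# Gini coefficient of every column except the first (the customer ID), rounded to 4 decimals
coeff = sapply(your_data[,-1], function(x) round(ineq(x, type = 'Gini'), 4))

# Build the data frame directly; cbind() would turn the numeric coefficients into character strings
data_coeff = data.frame(Coeff = coeff, Colnames = colnames(your_data[,-1]), row.names = NULL)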

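We can also do this by row, which is what the question asks for. Transpose the data so that each customer becomes a column, then repeat the same calculation: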

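# Transpose so that each customer becomes a column (the first column of your_data holds the customer ID)
your_new_data = as.data.frame(t(your_data[,-1]))

colnames(your_new_data) = your_data[,1]

# Gini coefficient of each customer's monthly values, rounded to 4 decimals
ind = sapply(your_new_data, function(x) round(ineq(x, type = 'Gini'), 4))

# Build the data frame directly so that Coeff stays numeric
data_coeff = data.frame(Coeff = ind, customer = your_data[,1], row.names = NULL)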

Now we can add the coefficients to the original data frame. This assumes the first column of your_data is named customer:

your_data_final = merge(your_data,data_coeff, by = "customer" )
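
If you would rather keep the original apply(..., 1, gini) call, another option is to define gini yourself. The following is only a minimal sketch under a few assumptions: the gini() helper is hypothetical (it is not from the original post or from any package) and computes the plain sample Gini coefficient without a small-sample correction; DT_file here is a toy data.frame with four months standing in for the real data; and the monthly columns are assumed to follow the sessions_YYYYMM_pct naming seen in the question. If DT_file is a data.table rather than a data.frame, the column selection would need adjusting.

```r
# Hypothetical helper (not from the original post): sample Gini coefficient
# of a numeric vector, without the small-sample correction.
gini <- function(x) {
  x <- sort(as.numeric(x[!is.na(x)]))          # drop NAs, sort ascending
  n <- length(x)
  if (n == 0 || sum(x) == 0) return(NA_real_)  # undefined for empty or all-zero input
  (2 * sum(x * seq_len(n)) / sum(x) - (n + 1)) / n
}

# Toy data standing in for DT_file: one row per customer, one column per month
DT_file <- data.frame(
  customer            = c("A", "B", "C"),
  sessions_201607_pct = c(0.25, 0.00, 0.10),
  sessions_201608_pct = c(0.25, 0.00, 0.20),
  sessions_201609_pct = c(0.25, 0.00, 0.30),
  sessions_201610_pct = c(0.25, 1.00, 0.40)
)

# Pick the monthly columns by name pattern instead of typing each one
session_cols <- grep("^sessions_[0-9]{6}_pct$", names(DT_file), value = TRUE)

# Apply gini() to every row (MARGIN = 1) and store the result as a new column
DT_file$Gini_index <- apply(as.matrix(DT_file[, session_cols]), 1, gini)

print(DT_file[, c("customer", "Gini_index")])
```

On the toy data, customer A (equal sessions every month) gets a coefficient of 0, customer B (all sessions in one month) gets 0.75, and customer C falls in between at about 0.25. Because the new column is assigned directly, no transpose or merge step is needed.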


