1 Neural Network Models

This section illustrates neural network models and how to measure their in-sample and out-of-sample performance. The packages neuralnet and nnet are used for this purpose.

neuralnet package

The arguments:

hidden: a vector of integers specifying the number of hidden neurons (vertices) in each layer.
rep: the number of repetitions for the neural network’s training.
startweights: a vector containing starting values for the weights. If supplied, the weights are not randomly initialized.
linear.output: TRUE for a continuous response, FALSE for a categorical response.
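
As a minimal sketch of how these arguments fit together, the following fits a small network on simulated data. The data frame `toy`, its columns, and the object names are made up for illustration; `startweights` is left unset, so the weights are initialized randomly.

```r
library(neuralnet)

set.seed(1)
# Hypothetical toy data: two predictors in [0, 1] and a continuous response
toy <- data.frame(x1 = runif(60), x2 = runif(60))
toy$y <- 0.3 * toy$x1 + 0.5 * toy$x2 + rnorm(60, sd = 0.02)

# Two hidden layers (3 and 2 neurons), 3 independent training runs,
# linear output because the response is continuous
nn_toy <- neuralnet(y ~ x1 + x2, data = toy, hidden = c(3, 2),
                    rep = 3, linear.output = TRUE)

# Pick the repetition with the smallest training error and predict with it
best_rep <- which.min(nn_toy$result.matrix["error", ])
pred_toy <- predict(nn_toy, toy, rep = best_rep)
head(pred_toy)
```

Using `rep` greater than 1 is a cheap way to guard against a single unlucky random initialization.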

nnet package

The arguments:

size: number of units in the hidden layer.
maxit: maximum number of iterations. Default 100.
decay: parameter for weight decay. Default 0.
linout: TRUE for a continuous response, FALSE for a categorical response (default).
weights: (case) weights for each example; defaults to 1 if missing.
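
The rest of this section uses neuralnet, so here is a minimal sketch of the nnet interface for comparison. It uses the built-in `mtcars` data purely for illustration; the chosen columns, `size = 3`, `decay = 0.01`, and `maxit = 500` are arbitrary assumptions, not tuned values.

```r
library(nnet)

set.seed(1)
# Min-max scale a few columns of the built-in mtcars data to [0, 1]
mt <- mtcars[, c("mpg", "wt", "hp", "disp")]
mins <- apply(mt, 2, min)
maxs <- apply(mt, 2, max)
mt_scaled <- as.data.frame(scale(mt, center = mins, scale = maxs - mins))

# Single hidden layer with 3 units; linout = TRUE for a continuous response
fit_nnet <- nnet(mpg ~ wt + hp + disp, data = mt_scaled,
                 size = 3, decay = 0.01, maxit = 500,
                 linout = TRUE, trace = FALSE)

pred_nnet <- predict(fit_nnet, mt_scaled)      # n x 1 matrix of fitted values
mse_nnet <- mean((mt_scaled$mpg - pred_nnet)^2) # in-sample MSE on the scaled response
mse_nnet
```

Unlike neuralnet, nnet supports only one hidden layer, which is why `size` is a single number rather than a vector.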

1.1 Regression

For regression problems, we use neuralnet and set linear.output = TRUE when training the model. In practice, normalizing or standardizing the predictors and the response variable is recommended before training a neural network. Otherwise, the training algorithm may fail to converge, as in the following case:

nn <- neuralnet(f, data = train_Boston, hidden = c(5), linear.output = TRUE)
# Algorithm did not converge in 1 of 1 repetition(s) within the stepmax.

I chose the min-max method, which scales the data to the interval \([0,1]\). Another online reference: [1]

library(MASS)
data("Boston")

maxs <- apply(Boston, 2, max) 
mins <- apply(Boston, 2, min)

scaled <- as.data.frame(scale(Boston, center = mins, scale = maxs - mins))
index <- sample(1:nrow(Boston),round(0.9*nrow(Boston)))

train_Boston <- scaled[index,]
test_Boston <- scaled[-index,]

Fit the neural network model and plot it:

library(neuralnet)
f <- as.formula("medv ~ .")

# Alternatively, build the formula programmatically so that it does not have to be updated by hand:
# resp_name <- names(train_Boston)
# f <- as.formula(paste("medv ~", paste(resp_name[!resp_name %in% "medv"], collapse = " + ")))

nn <- neuralnet(f, data = train_Boston, hidden = c(5, 3), linear.output = TRUE)
plot(nn)
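
Another way to build the formula from column names is base R's `reformulate()`. This is a small sketch on a made-up data frame `df` that stands in for the training data; only the column names matter here.

```r
# Hypothetical data frame standing in for the training data
df <- data.frame(medv = 1:3, crim = 4:6, rm = 7:9, lstat = 10:12)

# Every column except the response becomes a predictor
predictors <- setdiff(names(df), "medv")
f_alt <- reformulate(predictors, response = "medv")
f_alt
```

The result is an ordinary formula with the response on the left and all other columns on the right, so it can be passed to neuralnet in place of a hand-written one.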

Calculate the MSPE of the above neural network model:

pr_nn <- compute(nn, test_Boston[,1:13])

# transform the predictions and the observed responses back to the original scale
pr_nn_org <- pr_nn$net.result*(max(Boston$medv)-min(Boston$medv))+min(Boston$medv)
test_r <- (test_Boston$medv)*(max(Boston$medv)-min(Boston$medv))+min(Boston$medv)

# MSPE of testing set
MSPE_nn <- sum((test_r - pr_nn_org)^2)/nrow(test_Boston)
MSPE_nn

## [1] 7.435858
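
Because the min-max back-transformation is affine, the MSPE on the original scale equals the MSPE on the scaled response multiplied by the squared range of the response. The sketch below checks this on made-up numbers; the helper `mspe()`, the values, and the assumed range (5 to 50) are all illustrative.

```r
# Hypothetical helper: mean squared prediction error
mspe <- function(observed, predicted) mean((observed - predicted)^2)

# Made-up values on the [0, 1] scale and an assumed response range
y_scaled    <- c(0.20, 0.50, 0.90)
yhat_scaled <- c(0.25, 0.40, 0.80)
y_min <- 5
y_max <- 50

# Back-transform both to the original scale
y_org    <- y_scaled    * (y_max - y_min) + y_min
yhat_org <- yhat_scaled * (y_max - y_min) + y_min

mspe_scaled <- mspe(y_scaled, yhat_scaled)
mspe_org    <- mspe(y_org, yhat_org)

# The two differ exactly by the squared range of the response
all.equal(mspe_org, mspe_scaled * (y_max - y_min)^2)
```

This is why an MSPE computed on scaled data looks tiny: it must be rescaled before it is compared with models fitted on the original response.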

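Remark: In practice, the testing set may not be available at training time. In that case, compute the scaling parameters (minimum and maximum) from the training set only, apply them to any new data, and use the same training-set parameters when transforming predictions back to the original scale.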
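
A minimal sketch of the training-set-only variant is given below. It uses simulated data, and all object names (`raw`, `mins_tr`, `maxs_tr`, and so on) are hypothetical.

```r
set.seed(1)
# Hypothetical raw data and a 90/10 split
raw <- data.frame(x = rnorm(100, 10, 3), y = rnorm(100, 50, 8))
idx <- sample(nrow(raw), round(0.9 * nrow(raw)))
train_raw <- raw[idx, ]
test_raw  <- raw[-idx, ]

# Scaling parameters come from the training rows only
mins_tr <- apply(train_raw, 2, min)
maxs_tr <- apply(train_raw, 2, max)

train_sc <- as.data.frame(scale(train_raw, center = mins_tr, scale = maxs_tr - mins_tr))
test_sc  <- as.data.frame(scale(test_raw,  center = mins_tr, scale = maxs_tr - mins_tr))

range(train_sc$y)   # spans 0 to 1 by construction
range(test_sc$y)    # may fall outside [0, 1]

# Recover the original scale with the training min and max
y_back <- test_sc$y * (maxs_tr["y"] - mins_tr["y"]) + mins_tr["y"]
all.equal(unname(y_back), test_raw$y)
```

Test values outside \([0,1]\) are expected with this approach and are usually harmless, but they are a reminder that the network is extrapolating for those rows.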

1.2 Classification on Bankruptcy dataset

For classification problems, we use neuralnet and set linear.output = FALSE when training the model. As before, it is common practice to scale or standardize the predictor variables.

Bank_data_scaled <- Bank_data <- 
  read.csv(file = "https://xiaoruizhu.github.io/Data-Mining-R/lecture/data/bankruptcy.csv", header=T)
# summary(Bank_data)
library(MASS)
maxs <- apply(Bank_data[,-c(1:3)], 2, max)
mins <- apply(Bank_data[,-c(1:3)], 2, min)
Bank_data_scaled[,-c(1:3)] <- as.data.frame(scale(Bank_data[,-c(1:3)], center = mins, scale = maxs - mins))

sample_index <- sample(nrow(Bank_data_scaled),nrow(Bank_data_scaled)*0.70)
Bank_train <- Bank_data_scaled[sample_index,]
Bank_test <- Bank_data_scaled[-sample_index,]

library(neuralnet)
f <- as.formula("DLRSN ~ R1 + R2 + R3 + R4 + R5 + R6 + R7 + R8 + R9 + R10")
# You may need to specify the formula explicitly.
Bank_nn <- neuralnet(f, data = Bank_train, hidden = c(3), algorithm = 'rprop+', linear.output = FALSE, likelihood = TRUE)
plot(Bank_nn)

In-sample fit performance

pcut_nn <- 1/36
prob_nn_in <- predict(Bank_nn, Bank_train, type="response")
pred_nn_in <- (prob_nn_in>=pcut_nn)*1
table(Bank_train$DLRSN, pred_nn_in, dnn=c("Observed","Predicted"))

##         Predicted
## Observed    0    1
##        0 1725 1530
##        1   24  526
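
The confusion matrix can be summarized by the misclassification rate, the false positive rate, and the false negative rate. The sketch below re-enters the in-sample counts printed above by hand; the object names are illustrative.

```r
# In-sample confusion matrix copied from the output above (filled column by column)
cm <- matrix(c(1725, 24, 1530, 526), nrow = 2,
             dimnames = list(Observed = c("0", "1"), Predicted = c("0", "1")))

mr  <- (cm["0", "1"] + cm["1", "0"]) / sum(cm)   # overall misclassification rate
fpr <- cm["0", "1"] / sum(cm["0", ])             # false positive rate
fnr <- cm["1", "0"] / sum(cm["1", ])             # false negative rate
round(c(MR = mr, FPR = fpr, FNR = fnr), 4)
```

With such a low cutoff, only about 4% of the observed bankruptcies are missed, at the price of flagging roughly 47% of the non-bankrupt cases.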

In-sample ROC Curve

library(ROCR)
pred <- prediction(prob_nn_in, Bank_train$DLRSN)
perf <- performance(pred, "tpr", "fpr")
plot(perf, colorize=TRUE)

#Get the AUC
unlist(slot(performance(pred, "auc"), "y.values"))

## [1] 0.9045591
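
As a cross-check that does not depend on ROCR, the AUC can also be computed from ranks, since it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The helper `auc_rank()` and the data below are made up for illustration.

```r
# Hypothetical helper: AUC via the rank-sum (Mann-Whitney) identity
auc_rank <- function(prob, label) {
  prob <- as.vector(prob)        # predictions from neuralnet come as a one-column matrix
  r  <- rank(prob)               # average ranks handle ties
  n1 <- sum(label == 1)
  n0 <- sum(label == 0)
  (sum(r[label == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Made-up scores and labels: 8 of the 9 positive/negative pairs are ordered correctly
prob  <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.2)
label <- c(1,   1,   0,   1,   0,   0)
auc_rank(prob, label)
```

Applied to the predicted probabilities and the observed DLRSN values, this should agree with the ROCR value up to floating-point error.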

Model AIC and BIC

Bank_nn$result.matrix[4:5,]

##      aic      bic 
## 347.2087 578.2393
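
To see where these two numbers come from, the sketch below assumes that the criteria are computed as AIC = 2 * error + 2 * k and BIC = 2 * error + log(n) * k, where k is the number of weights and n the number of training rows. This is an assumption about the package's internals, but it reproduces the reported BIC from the reported AIC to the printed precision.

```r
# Numbers taken from the outputs above
aic_reported <- 347.2087
bic_reported <- 578.2393
n_obs <- 1725 + 1530 + 24 + 526          # training rows, from the in-sample table

# Weights in a 10-3-1 network, including one bias per neuron
k <- (10 + 1) * 3 + (3 + 1) * 1          # 37

# Assumed relations: AIC = 2 * error + 2 * k, BIC = 2 * error + log(n) * k
err <- (aic_reported - 2 * k) / 2        # implied training error
bic_check <- 2 * err + log(n_obs) * k
c(error = err, bic_check = bic_check)
```

Note that the training call does not set err.fct, whose default is "sse", so the error term here is presumably a sum-of-squares quantity rather than a negative log-likelihood. Setting err.fct = "ce" would likely be needed for these criteria to be likelihood-based in the usual sense.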

Out-of-sample fit performance

prob_nn_out <- predict(Bank_nn, Bank_test, type="response")
pred_nn_out <- (prob_nn_out>=pcut_nn)*1
table(Bank_test$DLRSN, pred_nn_out, dnn=c("Observed","Predicted"))

##         Predicted
## Observed   0   1
##        0 729 676
##        1  15 211
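
The cutoff 1/36 is what one obtains if a false negative is weighted 35 times as heavily as a false positive: predicting 1 is then cheaper in expectation whenever the predicted probability is at least 1/(1 + 35). The text does not state this weighting, so the 35:1 ratio, the function `cost_asym()`, and the data below are assumptions for illustration.

```r
# Hypothetical asymmetric cost: a missed bankruptcy (false negative) is
# assumed to cost 35 times as much as a false alarm (false positive)
cost_asym <- function(obs, pred, w_fn = 35, w_fp = 1) {
  mean(w_fn * (obs == 1 & pred == 0) + w_fp * (obs == 0 & pred == 1))
}

# Made-up observed labels and predicted probabilities
obs  <- c(0, 0, 0, 1, 1, 0)
prob <- c(0.01, 0.10, 0.40, 0.02, 0.70, 0.03)

pcut <- 1 / (1 + 35)                 # equals 1/36
pred <- as.numeric(prob >= pcut)
cost_asym(obs, pred)                 # one false negative and three false positives
```

Reporting such a cost alongside the confusion matrix makes the trade-off behind the low cutoff explicit.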

Out-of-sample ROC Curve

pred <- prediction(prob_nn_out, Bank_test$DLRSN)
perf <- performance(pred, "tpr", "fpr")
plot(perf, colorize=TRUE)

#Get the AUC
unlist(slot(performance(pred, "auc"), "y.values"))

## [1] 0.8717853
