Data Mining in R

This set of learning materials for undergraduate and graduate data mining classes is currently maintained by Xiaorui Zhu. Many of the materials come from Dr. Yan Yu’s previous class notes. Thanks to the previous Ph.D. students in the Lindner College of Business for their contributions, and to Dr. Brittany Green for recording the videos.

Lecture and Lab Notes

Introduction to Data Mining and R

Lab Notes Video Exercise
1.A Introduction to Data Mining    
1.B Introduction to R   1.B
1.C Advanced techniques: functions and loops   1.C
1.D Introduction to RMarkdown (optional)    

Exploratory Data Analysis

Lab Notes Video Exercise
2.A Explore and describe dataset   2.A
2.B Exploratory data analysis by visualization   2.B
2.C tidyverse: R packages for EDA (optional)    

Linear Regression, Prediction, and Variable Selection

Lab Notes Video Exercise
3.A Linear regression and prediction   3.A
3.B Subset variable selection   3.B
3.C LASSO variable selection   3.C
3.D Monte Carlo simulation    

Logistic Regression

Lab Notes Video Exercise
4.A Logistic regression and prediction   4.A
4.B Logistic regression and variable selection   4.B
4.C Logistic regression for binary classification   4.C
4.D Logistic regression and ROC   4.D

Cross Validation

Lab Notes Video Exercise
5.A Cross validation   5.A
5.B Cross validation (Logit model)   5.B

Tree Models

Lab Notes Video Exercise
6.A Regression Trees   6.A
6.B Classification Trees   6.B

Advanced Tree Models: Bagging, Random Forests, and Boosting Trees

Lab Notes Video Exercise
7.A Bagging trees    
7.B Random forests   7.B
7.C Boosting trees   7.C

Nonlinearity, Generalized Additive Models (GAM), and Nonparametric Smoothing

Lab Notes Video Exercise
8.A Univariate Nonparametric Smoothing    
8.B Generalized additive model (GAM)   8.B

Neural Network, LDA, and SVM

Lab Notes Video Exercise
9.A Neural network models   9.A
9.B (Optional) Discriminant analysis    
9.C (Optional) Support vector machine (SVM)    

Unsupervised Learning: Clustering

Lab Notes Video Exercise
10 Clustering   10

Unsupervised Learning: Association Rules

Lab Notes Video Exercise
11 Association Rules   11

Other Topics 1: Basic Text Mining

Basic Text Mining