Download the customer data and read it into R:
customer <- read.csv(file = "https://xiaoruizhu.github.io/Data-Mining-R/lecture/data/CustomerData.csv")
- How many rows and columns does the dataset have?
- Print the first few rows of the dataset.
- Obtain the summary statistics (min, median, max, mean, and standard deviation) for Age, EducationYears, HHIncome, and CreditDebt.
- Obtain the mean of HHIncome by MaritalStatus.
- Obtain a pivot table of LoanDefault vs. JobCategory. Which job category has the highest loan default rate, and which has the lowest?
- Obtain a dataset "iris_select" that drops the first and second columns of the iris data by using dataname[, "variable_index"].
- Create a new variable Sepal_LW equal to the ratio of sepal length to sepal width (without using mutate()).
- How can you select only those variables that contain missing values?
- Randomly sample a training data set that contains 80% of the original data points.
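
One possible way to approach the exercises above in base R is sketched below. It is not the only solution, and it rests on several assumptions. The customer columns are assumed to be named exactly as written in the exercises. The coding of LoanDefault is not known here, so the sketch only prints row proportions and leaves reading off the default column to you. The last four exercises are assumed to refer to R's built-in `iris` data. The names `five_stats`, `iris_new`, `iris_na`, `vars_with_na`, `train`, and `test`, the seed value, and the two artificially inserted missing values are purely illustrative. The download is wrapped in `tryCatch()` so that the rest of the script still runs if the file cannot be reached.

```r
# --- Customer data (URL and column names are taken from the exercises) ---
url <- "https://xiaoruizhu.github.io/Data-Mining-R/lecture/data/CustomerData.csv"
# Assumption: skip this part if the file cannot be downloaded.
customer <- tryCatch(suppressWarnings(read.csv(file = url)),
                     error = function(e) NULL)

if (!is.null(customer)) {
  print(dim(customer))    # number of rows, then number of columns
  print(head(customer))   # first six rows

  # Min, median, max, mean and standard deviation for four variables
  num_vars <- c("Age", "EducationYears", "HHIncome", "CreditDebt")
  five_stats <- function(x) {
    c(Min = min(x, na.rm = TRUE), Median = median(x, na.rm = TRUE),
      Max = max(x, na.rm = TRUE), Mean = mean(x, na.rm = TRUE),
      Std = sd(x, na.rm = TRUE))
  }
  print(sapply(customer[num_vars], five_stats))

  # Mean household income within each marital status group
  print(tapply(customer$HHIncome, customer$MaritalStatus, mean, na.rm = TRUE))

  # Cross-tabulation as row proportions: each job category sums to 1
  default_tab <- table(customer$JobCategory, customer$LoanDefault)
  print(prop.table(default_tab, margin = 1))
}

# --- iris exercises (assumed to use the built-in iris data) ---
# Negative indices drop columns 1 and 2 (Sepal.Length, Sepal.Width)
iris_select <- iris[, -c(1, 2)]

# New variable via plain $ assignment instead of mutate()
iris_new <- iris
iris_new$Sepal_LW <- iris_new$Sepal.Length / iris_new$Sepal.Width

# iris has no missing values, so two NAs are inserted here for illustration
iris_na <- iris
iris_na[c(3, 10), "Petal.Width"] <- NA
has_na <- colSums(is.na(iris_na)) > 0           # TRUE for columns with any NA
vars_with_na <- iris_na[, has_na, drop = FALSE]  # keep only those columns

# 80% random training sample; the remaining rows form the test set
set.seed(1)  # arbitrary seed, only for reproducibility
n <- nrow(iris)
train_idx <- sample(n, size = floor(0.8 * n))
train <- iris[train_idx, ]
test <- iris[-train_idx, ]
```

The `drop = FALSE` argument keeps the result a data frame even when only one column contains missing values. Sampling row indices rather than rows makes it easy to obtain the complementary test set with a negative index.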