Xiaorui (Jeremy) Zhu

Shee-ow-ree Joo

Assistant Professor of Business Analytics
Towson University

About Me

I'm Xiaorui (Jeremy) Zhu, an Assistant Professor in the Department of Business Analytics and Technology Management at Towson University. I obtained my Ph.D. in Business Analytics from the University of Cincinnati, Lindner College of Business. Before that, I received a master's degree in Finance from Penn State University and a master's degree in Applied Statistics from Beijing University of Technology.

My research interests include high-dimensional statistics, categorical data analysis, machine learning, default prediction and sentiment analysis in finance and information systems, and creativity. Specifically, I study variable selection methods and statistical inference in high-dimensional statistics. One of my works proposes sparsified simultaneous confidence intervals (SSCI) for the high-dimensional linear regression model. In finance, I work on bankruptcy, delisting problems, and stock return prediction using sentiment analysis. For the bankruptcy problem, I am interested in predicting bankruptcy and in how bankruptcy risk is associated with equity returns. One of my projects focuses on bankruptcy and delistings due to other failure reasons, two closely related yet sharply different distress events; in it, we identify separate models for bankruptcy risk and other-failure risk. In another published work, I proposed an adaptive method to estimate the coefficients of the GARCH model with heavy-tailed innovations. I'm also interested in how creativity can stimulate the development of machine learning algorithms and artificial intelligence.

I believe staying Hungry, Foolish, and Creative is the secret to success. I dream of being a statistician engineering a machine intelligence that is the "epitome" of the descendants of human intelligence.

Curriculum Vitae       GitHub

Research

Publications

Working Paper

Research In Progress

  • Using Surrogate \(R^2\) to Assess Logistic Models in Bankruptcy Analysis: Explainability, Predictability, and Comparability,
    with Dungang Liu, ongoing.
  • Additive Logistic Model with Macroeconomic Covariates for Corporate Bankruptcy Prediction,
    with Yan Yu and Shaonan Tian, ongoing.

Presentations

  • "Assessing Partial Association between Ordinal Variables Using PAsso R Package for American National Election and U.S. Bankruptcy",
    Invited talk, NESS 37th New England Statistics Symposium, Boston, MA, May 2023.
  • "Using surrogate R-squared to assess logistic models in bankruptcy analysis: explainability, predictability, and comparability",
    2023 Symposium on Data Science and Statistics, St. Louis, Missouri, May 2023.
  • "A new goodness-of-fit measure for categorical data goodness-of-fit analysis: surrogate \(R^2\)",
    BFF8, Cincinnati, OH, May 2023.
  • "Measuring goodness-of-fit for logit model and its application to U.S. and Polish bankruptcy prediction",
    ICSA Applied Statistics Symposium, Gainesville, FL, June 2022.
  • "Zooming in on Distress Anomaly: Bankruptcy vs. Other Failures",
    INFORMS Annual Meeting, Anaheim, CA, Oct 2021.
  • "Surrogate \(R^2\) for Probit Models",
    INFORMS Annual Meeting, Virtual, Nov 2020.
  • "Demystifying Differences between Bankruptcy and Other Failures in Terms of Major Drivers and Risk-Return Relationship",
    Session Chair, INFORMS Annual Meeting, Seattle, WA, Oct 2019.
  • "Conducting Inference of Coefficients and Model for High-Dimensional Sparse Linear Model",
    INFORMS 3rd Workshop on Data Science (Peer-Reviewed), Seattle, WA, Oct 2019.
  • "Simultaneous Confidence Intervals Using Entire Solution Paths", Poster
    The Fourth Workshop on Higher-Order Asymptotics and Post-Selection Inference (WHOA-PSI), St. Louis, MO, August 2019.
  • "Simultaneous Confidence Intervals Using Entire Solution Paths", Slides
    ASA 2019 Joint Statistical Meeting, Denver, CO, July, 2019.
  • "What Drives Bankruptcy and Other Failures? An Analysis with Variable Selection",
    INFORMS Annual Meeting, Data Science Workshop (Peer-Reviewed), Phoenix, AZ, November 2018.
  • "Additive Logistic Model with Macroeconomic Covariates for Corporate Bankruptcy Prediction",
    ASA 2017 Joint Statistical Meeting, Baltimore, MD, August, 2017.
  • "Amplifying Creativity in Big Data Era: Relationship between creativity and failure",
    The 7th International Forum on Statistics, Beijing, China, 2016.

Software

I'm interested in making research results more intuitive and understandable. Therefore, I use Shiny apps to tell interesting data stories interactively. Here are some of the latest Shiny apps and R packages I have created. Everyone interested in my research or software is welcome to contact me.

SurrogateRsq (R Package)

An R package for the Surrogate \(R^2\) measure for categorical data analysis. It can generate a point or interval measure of the surrogate \(R^2\), and a ranking measure of each variable's contribution.

PAsso (R Package)

An R package implementing the unified framework for assessing Partial Association between ordinal variables after adjusting for a set of covariates.

SSCI (R Package)

An R Package for constructing the sparsified simultaneous confidence intervals (SSCI).

SPSP (R Package)

An R Package for Selecting the relevant predictors by Partitioning the Solution Paths of the Penalized Likelihood Approach.

Trading Strategies (App)

It is an online platform to display interactive trading strategies.

Stock Display (App)

It is a Shiny app that displays time-series data and forecasts for stocks.

Bankruptcy Prediction (App)

It is a live app showing the probability of bankruptcy of public companies in the U.S.

Teaching

Data Mining for Business Analytics (Towson)
Instructor

This course emphasizes hands-on data analysis experience using recent advances in data mining and machine learning for business analytics with the statistical software R. Topics include modern data wrangling techniques, data visualization, linear regression, logistic regression, variable selection, model evaluation, K-nearest neighbors, classification and regression trees (CART), etc.

Syllabus

Business Analytics (Towson)
Instructor

This course focuses on using standard business analytic models to summarize and analyze data, build models, and drive impact through quantitative decision-making.

Syllabus

Data Wrangling with R (UC)
Instructor

Data Wrangling with R! This course provides an intensive, hands-on introduction to Data Wrangling with the R programming language. You will learn the fundamental skills required to acquire, munge, transform, manipulate, and visualize data in a computing environment that fosters reproducibility.

Syllabus       Lab Notes

Forecasting and Time Series Methods (UC)
Instructor

This course covers time series analysis, emphasizing the appropriate models for estimation, testing, and forecasting. Topics include the univariate Box-Jenkins approach to fitting and forecasting time series; ARIMA models; stationarity and nonstationarity; diagnostics for time series models; point and interval forecasts; seasonal time series models; and volatility modeling with ARCH, GARCH, and other methods. R Shiny app development is also covered to help students build prototypes of their models and ideas.

Undergraduate       Graduate

Business Analytics (UC)
Instructor

This course develops fundamental knowledge and skills for applying statistics to business decision-making. Topics include descriptive statistics, probability distributions, sampling, confidence intervals, hypothesis testing, and computer software for statistical applications. (2018 Spring & Fall)

Syllabus

Data Mining I & II (UC)
Instructor

The statistical methods in these two courses include Linear Regression, Generalized Linear Models (e.g., Logistic Regression), Variable Selection, Cross Validation, K-nearest neighbors, Classification and Regression Trees (CART), Bagging, Boosting, Random Forests, Generalized Additive Models (GAM for Nonlinearity), Nonparametric Smoothing, Neural Networks, Clustering (K-means clustering, Support Vector Machine), Principal Component Analysis, Association Rules, and Text Mining.

Syllabus       Lab Notes

Secret to Success!

Staying Hungry, Foolish, and Creative!
