Xiaorui Zhu

About Me

I'm Xiaorui (Jeremy) Zhu, an Assistant Professor in the Department of Business Analytics and Technology Management at Towson University. I obtained my Ph.D. in Business Analytics from the Lindner College of Business at the University of Cincinnati. Before that, I received a master's degree in Finance from Penn State University and a master's degree in Applied Statistics from Beijing University of Technology.

My research interests include high-dimensional statistics, categorical data analysis, machine learning, default prediction and sentiment analysis in finance and information systems, and creativity. Specifically, I study variable selection methods and statistical inference in high-dimensional statistics. One of my works proposes sparsified simultaneous confidence intervals (SSCI) for the high-dimensional linear regression model. In finance, I work on bankruptcy, delisting problems, and stock return prediction using sentiment analysis. For the bankruptcy problem, I am interested in predicting bankruptcy and in how bankruptcy risk is associated with equity returns. One of my projects focuses on bankruptcy and delistings due to other failure reasons, two closely related yet sharply different distress events. We identify two different models, one for bankruptcy risk and one for other-failure risk. In another published work, I proposed an adaptive method to estimate the coefficients of the GARCH model with heavy-tailed innovations. I'm also interested in how creativity can stimulate the development of machine learning algorithms and artificial intelligence.

I believe staying Hungry, Foolish, and Creative is the secret to success. I dream of being a statistician engineering a machine intelligence that is the "epitome" of the descendants of human intelligence.

Curriculum Vitae Github

Research

Publications

Impact of the COVID-19 Pandemic on the Stock Market and Investor Online Word of Mouth,
Xiaorui Zhu, Shaobo Li, Karthik Srinivasan, Michael Lash (2021), Decision Support Systems, 2024.
Corporate Probability of Default: A Single-Index Hazard Model Approach,
Shaobo Li, Yan Yu, Shaonan Tian, Xiaorui Zhu, Heng Lian (2022),
Journal of Business & Economic Statistics, DOI: 10.1080/07350015.2022.2120484
The effects of NASDAQ delisting on firm performance,
Mingsheng Li, Karen Liu, Xiaorui Zhu (2023), Research in International Business and Finance, 2024.
A New Goodness-of-fit Measure for Probit Models: Surrogate \(R^2\),
Dungang Liu, Xiaorui Zhu, Brandon Greenwell, Zewei Lin (2023),
British Journal of Mathematical and Statistical Psychology, 76, 192–210.
PAsso: an R Package for Assessing Partial Association between Ordinal Variables,
Shaobo Li, Xiaorui Zhu, Yuejie Chen, Dungang Liu (2021), The R Journal, 13, no. 2: 135.
Exploring the Diversity of Creative Prototyping in a Global Online Learning Environment,
Kathryn Jablokow, Xiaorui Zhu, and Jack Matson. (2020).
International Journal of Design Creativity and Innovation, 1-23.
Adaptive Quasi-maximum Likelihood Estimation of GARCH Models with Student's t Likelihood,
Xiaorui Zhu, and Li Xie. (2016).
Communications in Statistics-Theory and Methods 45, no. 20 (2016): 6102-6111.
Stimulating Creativity in Online Learning Environments through Intelligent Fast Failure (IFF),
Kathryn Jablokow, Xiaorui Zhu, Jack Matson, and Akshay Kakde. (2016).
ASEE 2016 Annual Conference & Exposition, Conference Proceedings (Vol. 2016)

Working Paper

Sparsified Simultaneous Confidence Intervals for High-Dimensional Linear Models,
Xiaorui Zhu, Yichen Qin, and Peng Wang (2020), invited revision.
(Relevant package: "SPSP")
SurrogateRsq: an R package for categorical data goodness-of-fit analysis using the surrogate \(R^2\),
Xiaorui Zhu, Zewei Lin, Dungang Liu, and Brandon Greenwell (2023), invited revision.
Zooming in on Distress Risk: Bankruptcy vs. Other Failures,
Yuhang Xing, Yan Yu, Xiaorui Zhu (2021, alphabetical authorship, authors contributed equally), under review.

Research In Progress

Using Surrogate \(R^2\) to Assess Logistic Models in Bankruptcy Analysis: Explainability, Predictability, and Comparability,
with Dungang Liu, ongoing.
Additive Logistic Model with Macroeconomic Covariates for Corporate Bankruptcy Prediction,
with Yan Yu and Shaonan Tian, ongoing.

Presentations

"Assessing Partial Association between Ordinal Variables Using PAsso R Package for American National Election and U.S. Bankruptcy",
Invited talk, NESS 37th New England Statistics Symposium, Boston, MA, May 2023.
"Using surrogate R-squared to assess logistic models in bankruptcy analysis: explainability, predictability, and comparability",
2023 Symposium on Data Science and Statistics, St. Louis, Missouri, May 2023.
"A new goodness-of-fit measure for categorical data goodness-of-fit analysis: surrogate \(R^2\)",
BFF8, Cincinnati, OH, May 2023.
"Measuring goodness-of-fit for logit model and its application to U.S. and Polish bankruptcy prediction",
ICSA Applied Statistics Symposium, Gainesville, FL, June 2022.
"Zooming in on Distress Anomaly: Bankruptcy vs. Other Failures",
INFORMS Annual Meeting, Anaheim, CA, Oct 2021.
"Surrogate \(R^2\) for Probit Models",
INFORMS Annual Meeting, Virtual, Nov 2020.
"Demystifying Differences between Bankruptcy and Other Failures in Terms of Major Drivers and Risk-Return Relationship",
Session Chair, INFORMS Annual Meeting, Seattle, WA, Oct 2019.
"Conducting Inference of Coefficients and Model for High-Dimensional Sparse Linear Model",
INFORMS 3rd Workshop on Data Science (Peer-Reviewed), Seattle, WA, Oct 2019.
"Simultaneous Confidence Intervals Using Entire Solution Paths", Poster
The Fourth Workshop on Higher-Order Asymptotics and Post-Selection Inference (WHOA-PSI), St. Louis, MO, August 2019.
"Simultaneous Confidence Intervals Using Entire Solution Paths", Slides
ASA 2019 Joint Statistical Meeting, Denver, CO, July, 2019.
"What Drives Bankruptcy and Other Failures? An Analysis with Variable Selection",
INFORMS Annual Meeting, Data Science Workshop (Peer-Reviewed), Phoenix, AZ, November 2018.
"Additive Logistic Model with Macroeconomic Covariates for Corporate Bankruptcy Prediction",
ASA 2017 Joint Statistical Meeting, Baltimore, MD, August, 2017.
"Amplifying Creativity in Big Data Era: Relationship between creativity and failure",
The 7th International Forum on Statistics, Beijing, China, 2016.

Software

I'm interested in making research results more intuitive and understandable, so I build Shiny apps that tell interesting data stories interactively. Here are some of the latest Shiny apps and R packages I have created. Everyone interested in my research or software is welcome to contact me.

SurrogateRsq (R Package)

An R package for the Surrogate \(R^2\) measure for categorical data analysis. It can generate a point or interval measure of the surrogate \(R^2\), and a ranking measure of each variable's contribution.

PAsso (R Package)

An R package implementing the unified framework for assessing Partial Association between ordinal variables after adjusting for a set of covariates.

SSCI (R Package)

An R Package for constructing the sparsified simultaneous confidence intervals (SSCI).

SPSP (R Package)

An R Package for Selecting the relevant predictors by Partitioning the Solution Paths of the Penalized Likelihood Approach.

Trading Strategies (App)

An online platform that displays interactive trading strategies.

Stock Display (App)

A Shiny app that displays time-series data and forecasts for stocks.

Bankruptcy Prediction (App)

A live app showing the probability of bankruptcy for public companies in the U.S.

Teaching

Data Mining for Business Analytics (Towson)
Instructor

This course emphasizes hands-on data analysis experience using recent advances in data mining and machine learning for business analytics with the statistical software R. Topics include modern data wrangling techniques, data visualization, linear regression, logistic regression, variable selection, model evaluation, K-nearest neighbors, classification and regression trees (CART), etc.

Syllabus

Business Analytics (Towson)
Instructor

This course focuses on using standard business analytic models to summarize and analyze data, build models, and drive impact through quantitative decision-making.

Syllabus

Data Wrangling with R (UC)
Instructor

Data Wrangling with R! This course provides an intensive, hands-on introduction to Data Wrangling with the R programming language. You will learn the fundamental skills required to acquire, munge, transform, manipulate, and visualize data in a computing environment that fosters reproducibility.

Syllabus Lab Notes

Forecasting and Time Series Methods (UC)
Instructor

This course covers time series analysis, emphasizing the appropriate models for estimation, testing, and forecasting. Topics include the univariate Box-Jenkins methodology for fitting and forecasting time series; ARIMA models; stationarity and nonstationarity; diagnosing time series models; point and interval forecasts; seasonal time series models; and modeling volatility with ARCH, GARCH, and other methods. R Shiny app development is also covered to help students gain skills in prototyping their models and ideas.

Undergraduate Graduate

Business Analytics (UC)
Instructor

This course develops fundamental knowledge and skills for applying statistics to business decision-making. Topics include descriptive statistics, probability distributions, sampling, confidence intervals, hypothesis testing, and computer software for statistical applications. (2018 Spring & Fall)

Syllabus

Data Mining I & II (UC)
Instructor

The statistical methods in these two courses include Linear Regression, Generalized Linear Models (e.g., Logistic Regression), Variable Selection, Cross Validation, K-nearest Neighbors, Classification and Regression Trees (CART), Bagging, Boosting, Random Forests, Generalized Additive Models (GAM, for nonlinearity), Nonparametric Smoothing, Neural Networks, Support Vector Machines, Clustering (K-means), Principal Component Analysis, Association Rules, and Text Mining.

Syllabus Lab Notes
