An implementation of the feature Selection procedure by Partitioning the entire Solution Paths (namely SPSP), which identifies the relevant features from the entire solution paths rather than from a single tuning parameter. By utilizing the entire solution paths, this procedure can achieve better selection accuracy than the common approach of choosing a single tuning parameter based on existing criteria such as cross-validation (CV), generalized CV, AIC, BIC, and EBIC (Liu and Wang, 2018). It is more stable and accurate (lower false positive and false negative rates) than other variable selection approaches. In addition, it can be flexibly coupled with the solution paths of the Lasso, adaptive Lasso, ridge regression, and other penalized estimators.
Details
This package includes two main functions, SPSP and SPSP_step, together with several functions that obtain the solution paths and can be supplied through the fitfun.SP argument. The SPSP function allows users to specify the penalized likelihood approach that generates the solution paths for the SPSP procedure; it then automatically partitions the entire solution paths. Its key idea is to classify variables as relevant or irrelevant at each tuning parameter and then to select all of the variables that have been classified as relevant at least once. The SPSP_step function applies only the partitioning step and requires the solution paths as its input. In addition, the package provides several functions for obtaining the solution paths, which can be used as the input to the fitfun.SP argument.
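The following is a minimal usage sketch under stated assumptions: the calls to SPSP, SPSP_step, and the fitfun.SP argument come from the description above, while the remaining argument names (family, BETA) and the helper lasso.glmnet are assumptions about the package interface and may differ from the actual defaults.

library(SPSP)
library(glmnet)

set.seed(1)
n <- 100; p <- 20
x <- matrix(rnorm(n * p), n, p)
beta <- c(rep(2, 3), rep(0, p - 3))        # only the first 3 features are relevant
y <- drop(x %*% beta + rnorm(n))

# SPSP(): generate the solution paths (here via the Lasso) and partition them.
# 'lasso.glmnet' is assumed to be one of the bundled solution-path helpers.
fit <- SPSP(x = x, y = y, family = "gaussian", fitfun.SP = lasso.glmnet)
str(fit)                                    # inspect the selected features

# SPSP_step(): apply only the partitioning step to a precomputed solution path.
# The coefficient-matrix argument name (BETA) is an assumption; it is taken
# here as the p-by-nlambda matrix of Lasso coefficients without the intercept.
lasso_fit <- glmnet(x, y)
BETA <- as.matrix(coef(lasso_fit))[-1, ]
sel <- SPSP_step(x = x, y = y, BETA = BETA)
str(sel)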
References
Liu, Y., & Wang, P. (2018). Selection by partitioning the solution paths. Electronic Journal of Statistics, 12(1), 1988-2017. <10.1214/18-EJS1434>
Author
Xiaorui (Jeremy) Zhu, zhuxiaorui1989@gmail.com,
Yang Liu, yliu23@fhcrc.org,
Peng Wang, wangp9@ucmail.uc.edu