Nº 21-83: Structured Additive Regression and Tree Boosting
Structured additive regression (STAR) models are a rich class of regression models that include the generalized linear model (GLM) and the generalized additive model (GAM). STAR models can be fitted by Bayesian approaches, component-wise gradient boosting, penalized least-squares, and deep learning. Using feature interaction constraints, we show that such models can also be implemented with the gradient boosting powerhouses XGBoost and LightGBM, thereby benefiting from their excellent predictive capabilities. Furthermore, we show how STAR models can be used for supervised dimension reduction and explain under what circumstances the covariate effects of such models can be described in a transparent way. We illustrate the methodology with case studies on house price modeling, with very encouraging results regarding both interpretability and predictive performance.
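The key idea mentioned above, using feature interaction constraints to obtain an additive model from a tree booster, can be illustrated with a minimal sketch. The snippet below is not the paper's code; it uses simulated data and assumes the `interaction_constraints` parameter of XGBoost's scikit-learn interface, where placing each feature in its own group forces every tree to split on a single feature, so the fitted function becomes a sum of one-dimensional components, i.e. a GAM-like STAR model.

```python
# Minimal sketch (not from the paper): an additive model via XGBoost
# interaction constraints. Data and settings are illustrative only.
import numpy as np
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
n = 1000
X = rng.uniform(size=(n, 3))                                  # three numeric features
y = np.sin(2 * np.pi * X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.normal(size=n)

# One group per feature: trees may only split on one feature each,
# so the prediction is additive in the features (no interactions).
model = XGBRegressor(
    n_estimators=300,
    learning_rate=0.05,
    max_depth=3,
    interaction_constraints=[[0], [1], [2]],
)
model.fit(X, y)

# The component for a single feature can then be read off, e.g. via
# partial dependence, since the model contains no interaction effects.
```

LightGBM offers an analogous `interaction_constraints` parameter, so the same construction carries over; grouping several features together instead of isolating them yields structured interaction terms rather than a purely additive fit.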