May 2020

Machine Learning, the Treasury Yield Curve and Recession Forecasting

Michael Puglia and Adam Tucker


We use machine learning methods to examine the power of Treasury term spreads and other financial market and macroeconomic variables to forecast US recessions, vis-à-vis probit regression. In particular, we propose a novel strategy for conducting cross-validation on classifiers trained with low-frequency macro/financial panel data and compare the results to those obtained from standard k-folds cross-validation. Consistent with the existing literature, we find that, in the time series setting, forecast accuracy estimates derived from k-folds are biased optimistically, and that cross-validation strategies which eliminate data "peeking" produce lower, and perhaps more realistic, estimates of forecast accuracy. More strikingly, we also document a rank reversal in the forecast performance of probit, Random Forest, XGBoost, LightGBM, neural network and support-vector machine classifiers across the two cross-validation methodologies. That is, while k-folds cross-validation indicates that the forecast accuracy of tree methods dominates that of neural networks, which in turn dominates that of probit regression, the more conservative cross-validation strategy we propose indicates the exact opposite: probit regression should be preferred over machine learning methods, at least in the context of the present problem. This latter result stands in contrast to a growing body of literature demonstrating that machine learning methods outperform many alternative classification algorithms, and we discuss some possible reasons for our result. We also discuss techniques for conducting statistical inference on machine learning classifiers using Cochran's Q and McNemar's tests, and use the SHapley Additive exPlanations (SHAP) framework to decompose US recession forecasts and analyze feature importance across business cycles.
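To make the notion of data "peeking" concrete, the following minimal Python sketch contrasts standard k-folds splits with a generic expanding-window scheme on a short time-ordered sample. The expanding-window scheme is an illustrative stand-in for a cross-validation design that never trains on future observations; it is not the strategy proposed in the paper, which the abstract does not specify. The function names (`kfold_splits`, `expanding_window_splits`, `peeks_ahead`) and the sample size are hypothetical.

```python
def kfold_splits(n_obs, n_folds):
    """Standard k-folds on contiguous blocks of a time-ordered sample.

    Each block is held out once; every other observation, whether earlier
    or later in time, is used for training.
    """
    bounds = [i * n_obs // n_folds for i in range(n_folds + 1)]
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        test = list(range(lo, hi))
        train = [t for t in range(n_obs) if t < lo or t >= hi]
        yield train, test


def expanding_window_splits(n_obs, n_folds):
    """Expanding-window splits (an assumed stand-in, not the paper's method).

    The sample is cut into n_folds + 1 blocks; each test block is preceded
    by a training set made up only of earlier observations.
    """
    bounds = [i * n_obs // (n_folds + 1) for i in range(n_folds + 2)]
    for lo, hi in zip(bounds[1:-1], bounds[2:]):
        yield list(range(lo)), list(range(lo, hi))


def peeks_ahead(train, test):
    """True if any training observation is dated after the first test observation."""
    first_test = min(test)
    return any(t > first_test for t in train)


if __name__ == "__main__":
    n_obs = 12  # hypothetical sample of 12 time-ordered observations
    for name, splits in [
        ("k-folds", kfold_splits(n_obs, 4)),
        ("expanding window", expanding_window_splits(n_obs, 3)),
    ]:
        for train, test in splits:
            print(name, "test:", test, "peeks ahead:", peeks_ahead(train, test))
```

In this sketch, every k-folds split except the last trains on observations dated after the start of its test block, while the expanding-window splits never do. That is the sense in which accuracy estimates from k-folds can be optimistic in a time series setting.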


PDF: Full Paper

Last Update: May 20, 2020