September 2025

Parallel Trends Forest: Data-Driven Control Sample Selection in Difference-in-Differences

Yesol Huh and Matthew Vanderpool Kling

Abstract:

This paper introduces the parallel trends forest, a novel approach to constructing optimal control samples when using difference-in-differences (DiD) in relatively long panel data with little randomization in treatment assignment. Our method uses machine learning techniques to construct an optimal control sample that best meets the parallel trends assumption. We demonstrate that our approach outperforms existing methods, particularly with noisy, granular data. Applying the parallel trends forest to analyze the impact of post-trade transparency in corporate bond markets, we find that it produces more robust estimates than traditional two-way fixed effects models. Our results suggest that the effect of transparency on bond turnover is small and not statistically significant when allowing for constrained deviations from parallel trends. This method offers researchers a powerful tool for conducting more reliable DiD analyses in complex, real-world settings.
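
For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal sketch of a plain two-group, two-period DiD estimate. It is not the parallel trends forest and not the authors' code; the function name `did_estimate` and the toy data are hypothetical and chosen only for illustration. The estimate is credible only if the control group would have tracked the treated group absent treatment, which is the parallel trends assumption that control sample selection is meant to support.

```python
from statistics import mean


def did_estimate(rows):
    """Plain 2x2 difference-in-differences on group means.

    rows: list of (treated, post, outcome) tuples, where `treated` and
    `post` are booleans and `outcome` is a float. Illustrative only.
    """
    def cell_mean(treated, post):
        # Average outcome within one (group, period) cell.
        return mean(y for t, p, y in rows if t == treated and p == post)

    treated_change = cell_mean(True, True) - cell_mean(True, False)
    control_change = cell_mean(False, True) - cell_mean(False, False)
    # Under parallel trends, the control change proxies for the
    # counterfactual change of the treated group.
    return treated_change - control_change


# Hypothetical toy data: (treated, post, outcome)
rows = [
    (True, False, 10.0), (True, False, 12.0),
    (True, True, 15.0), (True, True, 17.0),
    (False, False, 8.0), (False, False, 10.0),
    (False, True, 10.0), (False, True, 12.0),
]
estimate = did_estimate(rows)
print(estimate)
```

In this toy example the treated group's mean rises by 5 and the control group's by 2, so the DiD estimate is 3. A poorly chosen control group would change the second difference and therefore the estimate, which is why the choice of control sample matters.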

Keywords: causal inference, difference-in-differences, parallel trends assumption, random forest

DOI: https://doi.org/10.17016/FEDS.2025.091


Disclaimer: The economic research that is linked from this page represents the views of the authors and does not indicate concurrence either by other members of the Board's staff or by the Board of Governors. The economic research and their conclusions are often preliminary and are circulated to stimulate discussion and critical comment. The Board values having a staff that conducts research on a wide range of economic topics and that explores a diverse array of perspectives on those topics. The resulting conversations in academia, the economic policy community, and the broader public are important to sharpening our collective thinking.

Last Update: September 29, 2025