Document Type
Article
Publication Date
6-26-2026
Abstract
We apply machine learning methods to predict Thoroughbred yearling auction prices at the Keeneland September Sale (2020–2024). Our sample includes 5,788 yearling prices with pedigree data. We fit both linear and tree-based models to log prices, tune model hyperparameters by cross-validation, and select Ridge regression (α = 1.451) as the primary model for interpretation given its stability and interpretability. The Ridge regression explains approximately 54% of out-of-sample variation (R² ≈ 0.5403). Sire and Dam Reputation emerge as the dominant predictors. The results provide pricing benchmarks and show how reputation and session structure shape Thoroughbred yearling auction prices.
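As an illustration of the workflow the abstract summarizes (a ridge penalty chosen by cross-validation, with fit judged by out-of-sample R² on log prices), the following is a minimal sketch and not the authors' code. It assumes NumPy, uses synthetic data in place of the Keeneland sample, and the three feature columns, the alpha grid, and the helper names `ridge_fit`, `r2_score`, and `cv_r2` are all hypothetical.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge fit with an unpenalized intercept.

    Minimizes ||y - b - X w||^2 + alpha * ||w||^2 by centering X and y first.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - x_mean @ w

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def cv_r2(X, y, alpha, k=5, seed=0):
    """Mean held-out R^2 across k shuffled folds for a given alpha."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        w, b = ridge_fit(X[train], y[train], alpha)
        scores.append(r2_score(y[test], X[test] @ w + b))
    return float(np.mean(scores))

# Synthetic stand-in data (NOT the paper's sample). The columns play the role of
# hypothetical standardized features: sire reputation, dam reputation, session.
rng = np.random.default_rng(42)
n = 500
X = rng.normal(size=(n, 3))
log_price = 11.0 + X @ np.array([0.8, 0.5, -0.3]) + rng.normal(scale=0.7, size=n)

# Hypothetical alpha grid; pick the value with the best cross-validated R^2.
alphas = [0.01, 0.1, 1.0, 10.0, 100.0]
cv_scores = {a: cv_r2(X, log_price, a) for a in alphas}
best_alpha = max(cv_scores, key=cv_scores.get)
print("selected alpha:", best_alpha)
print("cross-validated R^2:", round(cv_scores[best_alpha], 3))
```

Because the target is a log price, a prediction is converted back to a price scale with `np.exp`. The selected alpha and R² printed here depend entirely on the synthetic data and have no bearing on the reported α = 1.451 or R² ≈ 0.5403.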
Recommended Citation
Yang Y, Clarke J, Mandal T, Nguyen T, Pham T. Predicting Thoroughbred Yearling Auction Prices with Machine Learning: Evidence from the Keeneland September Sale. Journal of Agricultural and Applied Economics. Published online 2026:1-19. doi:10.1017/aae.2026.10050
Comments
JEL classifications: C45; C55; G12; Q19; L83
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.