Sinkhole formation poses a significant geohazard in karst regions, where unpredictable subsurface erosion often necessitates costly grouting for stabilization. Accurate estimation of grout volume remains a persistent challenge due to spatial variability, site-specific conditions, and the limitations of traditional empirical methods. This study introduces a novel machine learning-based regression model for grout volume prediction that integrates cone penetration test (CPT)-derived Sinkhole Resistance Ratio (SRR) values, spatial correlations between CPT and grouting points (GPs), and field-recorded grout volumes from six sinkhole sites in Florida. Three data transformation methods, the Proximal Allocation Method (PAM), the Equitable Distribution Method (EDM), and the Threshold-based Equitable Distribution Method (TEDM), were applied to distribute grout influence across CPTs, with TEDM demonstrating superior predictive performance. Synthetic data augmentation using spline methodology further improved model robustness. A high-degree polynomial regression model, optimized with ridge regularization, achieved high accuracy (R² = 0.95; PEV = 0.94) and significantly outperformed existing linear and logarithmic models. Results confirm that lower SRR values correlate with higher grout demand, and the proposed model reliably captures these nonlinear relationships. This research advances sinkhole remediation practice by providing a data-driven, accurate, and generalizable framework for grout volume estimation, enabling more efficient resource allocation and improved project outcomes.

• Development of an Advanced Grout Volume Prediction Model using Machine Learning
• Integration of the Sinkhole Resistance Ratio (SRR) for Grout Volume Estimation
• Comparison of Data Transformation Methods
• Development of a Polynomial Regression Model with Enhanced Predictive Accuracy
• Potential New Framework for Sinkhole Remediation
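
As a rough illustration of the ridge-regularized polynomial regression named above, the following minimal sketch fits grout volume against SRR using NumPy. It is not the study's implementation: the polynomial degree, the penalty strength `alpha`, the helper names (`fit_poly_ridge`, `predict`, `r_squared`), and the placeholder data (a decaying curve with noise, chosen only so that lower SRR maps to higher volume) are all assumptions made for demonstration.

```python
import numpy as np

def fit_poly_ridge(x, y, degree=4, alpha=1e-3):
    """Fit y ~ polynomial(x) with an L2 (ridge) penalty.

    Hypothetical helper: degree and alpha are illustrative defaults,
    not values reported by the study.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Standardize x so that high powers stay numerically well behaved.
    mu, sigma = x.mean(), x.std()
    z = (x - mu) / sigma
    # Design matrix with columns 1, z, z^2, ..., z^degree.
    X = np.vander(z, degree + 1, increasing=True)
    penalty = alpha * np.eye(degree + 1)
    penalty[0, 0] = 0.0  # leave the intercept unpenalized
    # Closed-form ridge solution: (X^T X + penalty) coef = X^T y.
    coef = np.linalg.solve(X.T @ X + penalty, X.T @ y)
    return coef, mu, sigma

def predict(model, x):
    """Evaluate a fitted model at new SRR values."""
    coef, mu, sigma = model
    z = (np.asarray(x, dtype=float) - mu) / sigma
    return np.vander(z, len(coef), increasing=True) @ coef

def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Placeholder data only (NOT field measurements): volume decays with SRR.
rng = np.random.default_rng(0)
srr = np.linspace(0.1, 1.0, 40)
volume = 50.0 * np.exp(-3.0 * srr) + rng.normal(0.0, 0.5, srr.size)

model = fit_poly_ridge(srr, volume, degree=4, alpha=1e-3)
print("R^2 on placeholder data:", r_squared(volume, predict(model, srr)))
```

Increasing `alpha` shrinks the higher-order coefficients toward zero, which is the mechanism that keeps a high-degree polynomial from overfitting a small data set; in practice the value would be chosen by cross-validation rather than fixed by hand.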