Machine Learning-Based Prediction of Coagulant Dosing in Drinking Water Treatment Plants Using Polynomial Regression with Lasso Regularization | Prof. Joonhong Park's Laboratory | Department of Mechanical Engineering, Dong-A University

Article · Gold open access · Citations: 0 · 2025

Machine Learning-Based Prediction of Coagulant Dosing in Drinking Water Treatment Plants Using Polynomial Regression with Lasso Regularization

Jusuk An, Joonhong Park, Seungjae Yeon, Changseog Oh, Bokjin Lee, Woo‐Sik Jung, Jeongmin Yun, Hyun Je Oh

Processes (IF 2.8)

Abstract

Coagulation is a critical unit process in drinking water treatment plants (DWTPs), where accurate dosing of coagulants such as polyaluminum chloride (PAC) and polyaluminum hydroxide chloride silicate (PACS) directly determines turbidity removal and operational stability. However, nonlinear interactions among water-quality variables complicate dosage prediction, and jar tests or operator heuristics cannot support real-time control. This study presents a scientifically interpretable and operationally transferable framework based on polynomial multiple linear regression (PMLR) with Lasso regularization, which was specifically developed for full-scale DWTP environments. While conventional PMLR rapidly overfits beyond polynomial degrees of 4–5, the Lasso-regularized model maintained stable generalization even at a degree of 10 by automatically pruning redundant terms and suppressing multicollinearity, thereby minimizing the need for manual hyperparameter tuning. Using 8303 hourly operational records from a full-scale DWTP in Korea, the Lasso-PMLR achieved R² = 0.951, RMSE = 0.120, and MAPE = 7.02%, outperforming traditional linear regression (R² = 0.896; MAPE = 8.64%). This proportional stability across increasing polynomial degrees, demonstrated directly using long-term real-world data, is particularly valuable for practical deployment because it ensures robustness without complex model-selection procedures. The transparent coefficient structure enables operators, who typically rely on jar tests, to understand and adjust dosing behavior, offering a field-ready and interpretable alternative to black-box models and supporting more efficient coagulant use, reduced sludge production, and sustainable automation in DWTP operation.
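The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of Lasso-regularized polynomial regression, not the authors' model. It assumes a single hypothetical scalar input (the study uses several water-quality variables), synthetic data, an arbitrary penalty `alpha`, and a plain cyclic coordinate-descent solver. The names `poly_features`, `fit_lasso_poly`, `predict`, and `scores` are illustrative.

```python
import math
import random


def poly_features(x, degree):
    """Expand one scalar input into [x, x**2, ..., x**degree]."""
    return [x ** d for d in range(1, degree + 1)]


def fit_lasso_poly(xs, ys, degree, alpha, sweeps=200):
    """Fit y ~ b0 + sum_j w_j * z_j by cyclic coordinate descent.

    z_j are standardized polynomial features; the objective is
    (1 / (2n)) * sum(residual**2) + alpha * sum(|w_j|).
    Assumes xs are not all identical (otherwise a std is zero).
    """
    n = len(xs)
    rows = [poly_features(x, degree) for x in xs]
    means = [sum(r[j] for r in rows) / n for j in range(degree)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
            for j in range(degree)]
    # Column-major standardized features (zero mean, unit variance).
    cols = [[(r[j] - means[j]) / stds[j] for r in rows] for j in range(degree)]
    b0 = sum(ys) / n
    w = [0.0] * degree
    resid = [y - b0 for y in ys]  # residual for the current coefficients
    for _ in range(sweeps):
        for j, col in enumerate(cols):
            # Covariance of feature j with the residual that excludes w_j.
            rho = sum(c * r for c, r in zip(col, resid)) / n + w[j]
            # Soft-thresholding: small correlations are set exactly to zero.
            new_w = math.copysign(max(abs(rho) - alpha, 0.0), rho)
            if new_w != w[j]:
                delta = new_w - w[j]
                resid = [r - delta * c for r, c in zip(resid, col)]
                w[j] = new_w
    return {"b0": b0, "w": w, "means": means, "stds": stds}


def predict(model, x):
    """Predict y for one scalar x using the fitted, standardized model."""
    feats = poly_features(x, len(model["w"]))
    z = [(f - m) / s for f, m, s in zip(feats, model["means"], model["stds"])]
    return model["b0"] + sum(wj * zj for wj, zj in zip(model["w"], z))


def scores(y_true, y_pred):
    """Return (R^2, RMSE, MAPE in percent); y_true must be nonzero for MAPE."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mape = 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mape


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-in data (not from the paper): one made-up input in
    # [0, 1] and a noisy quadratic target.
    xs = [i / 199 for i in range(200)]
    ys = [2.0 + 1.5 * x - 1.0 * x ** 2 + rng.gauss(0.0, 0.05) for x in xs]
    model = fit_lasso_poly(xs, ys, degree=10, alpha=0.01)
    kept = [d + 1 for d, wj in enumerate(model["w"]) if wj != 0.0]
    r2, rmse, mape = scores(ys, [predict(model, x) for x in xs])
    print("polynomial degrees kept:", kept)
    print(f"R2={r2:.3f} RMSE={rmse:.3f} MAPE={mape:.2f}%")
```

The soft-thresholding step is what sets coefficients exactly to zero, which is the pruning behavior the abstract describes. In practice a library solver and a held-out test split would normally replace this hand-written loop and the in-sample scores.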

Keywords

Polynomial regression, Linear regression, Multicollinearity, Polynomial, Lasso regularization, Regression, Water treatment

Full text

https://doi.org/10.3390/pr13123829