Title: Prediction of kinetic constants for the ultrasonic degradation of micro-pollutants using XGBoost and SHAP, with experimental verification
In recent years, emissions of micro-pollutants have increased continuously, making their degradation in aquatic systems a persistent concern. Among the available degradation methods, ultrasonic degradation in aquatic environments has attracted growing attention. However, because micro-pollutants are highly varied, follow complex degradation pathways, and differ widely in physicochemical properties, and because local conditions also affect the outcome, selecting an ultrasonic degradation method suited to a given site is difficult. To improve engineering efficiency and reduce trial-and-error costs, a simple and efficient method is needed for predicting the performance of ultrasonic water treatment systems under varied conditions. This study proposes an XGBoost-based model for predicting the kinetic constants of ultrasonic micro-pollutant degradation and for analyzing the features that govern them. All training and testing data were collected from articles on the ultrasonic purification of micro-pollutants published between 1994 and 2022. The input variables comprise ultrasound characteristics (frequency, power, power density, calorimetric power, and calorimetric power density), pollutant characteristics (concentration, molar mass, Henry's constant, and logKow), and experimental conditions (pH, reaction volume, and temperature). Before XGBoost training, the raw data were preprocessed within the program. The processed data were then used as input to XGBoost, and the model was trained iteratively to reach its optimal configuration. Training-to-testing ratios of 9:1, 8:2, 7:3, and 6:4 were examined; the 9:1 ratio gave the highest accuracy and was therefore adopted. SHAP values were used to explain the model features and to identify the key factors influencing the kinetic constants.
The model's prediction accuracy was further improved by using the SHAP values to clarify the model's characteristics and by re-dividing the ranges of the critical factors affecting the kinetic constants. After iterative prediction, the model's MAPE, R², MAE, MSE, and RMSE reached 0.056%, 0.932, 0.321, 0.221, and 0.470, respectively. Subsequently, six different pollutants were randomly selected for ultrasonic experiments to obtain kinetic constants. These experimental results were then compared with the machine learning predictions made under identical conditions, and the close agreement between the two verified the accuracy of the model. This work will be extended to explore the optimal reaction conditions for a specific reactor.
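For readers less familiar with the five error measures reported above, the following is a minimal sketch of how they are conventionally computed from paired measured and predicted kinetic constants. It assumes the standard textbook definitions, which may differ in detail from those used in this study. The function name `regression_metrics` and the three sample values are hypothetical and are not data from this work.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAPE (in percent), R2, MAE, MSE and RMSE for paired values.

    Assumes the conventional definitions. Every true value must be
    non-zero (for MAPE) and the true values must not all be equal (for R2).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n      # mean absolute error
    ss_res = sum(e * e for e in errors)        # residual sum of squares
    mse = ss_res / n                           # mean squared error
    rmse = math.sqrt(mse)                      # root mean squared error

    # Mean absolute percentage error, expressed in percent.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    # Coefficient of determination: 1 - SS_res / SS_tot.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return {"MAPE_percent": mape, "R2": r2, "MAE": mae, "MSE": mse, "RMSE": rmse}


# Hypothetical kinetic constants, for illustration only (not from this study).
k_measured = [1.0, 2.0, 4.0]
k_predicted = [1.1, 1.8, 4.2]

metrics = regression_metrics(k_measured, k_predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these definitions RMSE is the square root of MSE, which matches the reported pair of 0.221 and 0.470.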