CATABOL 301B
Endpoint
The CATABOL 301B biodegradation model simulates aerobic biodegradation under OECD 301B test conditions (Modified Sturm test). Simulation of catabolism is based on a single "preferred" pathway. The modeled endpoint is the percentage of theoretical CO2 release over 28 days (ThCO2, %).
Data
The training set of the model consists of proprietary (Procter and Gamble) data for 109 chemicals, each with an observed percentage of theoretical CO2 release over 28 days (ThCO2, %). It includes 83 readily biodegradable and 26 not readily biodegradable chemicals.
A second training database, containing catabolic pathways for more than 550 organic compounds, was used together with expert knowledge to determine the principal transformations and to train the system to simulate aerobic catabolism of the training chemicals. The documented pathways of microbial catabolism were collected from scientific papers, monographs and databases accessible over the Internet.
Model
The CATABOL 301B model consists of a metabolism simulator and an endpoint model. Microbial metabolism is simulated by a rule-based approach. The core parts of the simulator are a set of hierarchically organized transformations and a system of rules that control the application of these transformations. Recursive application of the transformations allows simulation of metabolism and generation of biodegradation pathways. The modeled endpoint (ThCO2, %) is calculated from the simulated catabolic tree, using the most probable biodegradation pathway and the material balance of the transformations used to build the tree.
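The recursive, priority-ordered application of transformations described above can be illustrated with a toy sketch. It is not the CATABOL simulator: the rule names, the string labels standing in for structures, and the `simulate` helper are all hypothetical, and real transformations act on molecular graphs rather than labels.

```python
# Toy, hypothetical transformation table ordered by priority (highest first).
# Each entry is (transformation name, substrate label, product label).
RULES = [
    ("ester hydrolysis", "ester", "alcohol"),
    ("alcohol oxidation", "alcohol", "aldehyde"),
    ("aldehyde oxidation", "aldehyde", "acid"),
    ("mineralization", "acid", "CO2"),
]


def simulate(structure, rules=RULES, max_steps=20):
    """Return the single 'preferred' pathway as a list of (rule, product).

    The first rule in the priority list that matches is applied, and the
    function then recurses on the product. The recursion stops when no
    rule matches or when max_steps is exhausted (a guard against cycles).
    """
    if max_steps == 0:
        return []
    for name, substrate, product in rules:
        if structure == substrate:
            return [(name, product)] + simulate(product, rules, max_steps - 1)
    return []


if __name__ == "__main__":
    for rule_name, product in simulate("alcohol"):
        print(rule_name, "->", product)
```

Because only the first matching rule is followed at each step, the sketch produces one pathway rather than a full catabolic tree; a tree would require branching on every matching rule.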
The development of the model consists of: (i) generation of metabolic maps for the training set chemicals using the microbial metabolism simulator; (ii) estimation of the probabilities of occurrence of the simulator transformations. Non-linear least squares fitting was used to parameterize the model:
$$RSS(P) = \sum_{i} \left[ ThCO2_i^{Obs} - ThCO2_i^{Calc}(P) \right]^2 \rightarrow \min_{P}$$
where RSS is the residual sum of squares, ThCO2Obs and ThCO2Calc are the observed and predicted percentages of theoretical CO2 release over 28 days for training chemical i, and P is the vector of estimated probabilities of transformations. Further details on the mathematical formalism of the model can be found in [1, 2].
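A minimal sketch of the parameterization step follows. It rests on stated assumptions: a pathway is taken to be a linear chain of steps, each step releases a known fraction of the theoretical CO2 (its material balance), and a step contributes only if all preceding transformations occurred. The function names and the coordinate grid search are hypothetical stand-ins; the grid search is a deliberately simple substitute for the non-linear least squares procedure, which is described in [1, 2].

```python
def thco2_calc(pathway, probs):
    """Predicted ThCO2 (%) for a linear pathway.

    pathway: ordered list of (transformation name, CO2 fraction released).
    probs:   dict mapping transformation name to probability of occurrence.
    """
    reached = 1.0  # probability that the pathway has progressed this far
    total = 0.0
    for name, co2_fraction in pathway:
        reached *= probs[name]
        total += reached * co2_fraction
    return 100.0 * total


def rss(training, probs):
    """Residual sum of squares over (pathway, observed ThCO2 %) pairs."""
    return sum((obs - thco2_calc(path, probs)) ** 2 for path, obs in training)


def fit(training, names, grid=101, sweeps=5):
    """Crude coordinate search over [0, 1] for each probability in turn."""
    probs = {name: 0.5 for name in names}
    candidates = [i / (grid - 1) for i in range(grid)]
    for _ in range(sweeps):
        for name in names:
            probs[name] = min(
                candidates, key=lambda p: rss(training, {**probs, name: p})
            )
    return probs


if __name__ == "__main__":
    # Made-up illustrative data, not taken from the training set.
    training = [([("t1", 1.0)], 80.0)]
    print(fit(training, ["t1"]))
```

In practice a gradient-based least squares routine would replace `fit`; the sketch only shows how RSS depends on the probability vector P.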
Domain
The stepwise approach [3] was used to define the applicability domain of the model. It consists of the following sub-domain levels:
- General parametric requirements - includes ranges of variation of log KOW and MW,
- Structural domain - based on atom-centered fragments (ACFs),
- Domain of simulator of metabolism - determines the reliability of the simulated metabolism.
A chemical is considered In Domain if its log KOW and MW are within the specified ranges, its ACFs are present in the training chemicals, and the simulator contains transformations for its full mineralization. The information implemented in the applicability domain is extracted from the correctly predicted training chemicals used to build the model; in this respect, the applicability domain effectively defines the interpolation space of the model.
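The three-level check above can be sketched as follows. The `Chemical` record, the `in_domain` function, and the numeric ranges in the usage example are hypothetical; the actual log KOW and MW ranges and the ACF set come from the correctly predicted training chemicals and are not given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chemical:
    log_kow: float
    mw: float
    acfs: frozenset          # atom-centered fragments found in the structure
    fully_mineralized: bool  # simulator has transformations to full mineralization


def in_domain(chem, log_kow_range, mw_range, training_acfs):
    """Evaluate each sub-domain level and the overall In Domain verdict."""
    parametric = (
        log_kow_range[0] <= chem.log_kow <= log_kow_range[1]
        and mw_range[0] <= chem.mw <= mw_range[1]
    )
    structural = chem.acfs <= frozenset(training_acfs)  # subset test
    metabolism = chem.fully_mineralized
    return {
        "parametric": parametric,
        "structural": structural,
        "metabolism": metabolism,
        "in_domain": parametric and structural and metabolism,
    }


if __name__ == "__main__":
    # Placeholder ranges and fragment labels, for illustration only.
    chem = Chemical(2.1, 180.0, frozenset({"C-OH", "C-C"}), True)
    print(in_domain(chem, (-1.0, 6.0), (50.0, 600.0), {"C-OH", "C-C", "C=O"}))
```

Returning the per-level results alongside the overall verdict mirrors the stepwise design: a chemical that falls out of domain can be traced to the level that failed.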
Performance
The goodness of fit, evaluated by the squared coefficient of correlation, is R2 = 0.73. The model correctly classified 88% of the experimentally readily biodegradable and 73% of the experimentally not readily biodegradable training chemicals.
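The two performance measures can be computed as in the sketch below. The 60% ThCO2 cut-off is an assumption taken from the OECD 301 pass level for CO2 evolution tests; the text above does not state which threshold the model uses, and the sketch ignores the 10-day window criterion. The function names and the numbers in the usage example are made up for illustration.

```python
def r_squared(obs, calc):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_calc = sum(calc) / n
    sxy = sum((o - mean_obs) * (c - mean_calc) for o, c in zip(obs, calc))
    sxx = sum((o - mean_obs) ** 2 for o in obs)
    syy = sum((c - mean_calc) ** 2 for c in calc)
    return sxy * sxy / (sxx * syy)


def class_rates(obs, calc, threshold=60.0):
    """Fractions of ready and not-ready chemicals that are classified correctly.

    A chemical counts as readily biodegradable when ThCO2 >= threshold.
    Both classes must be represented in obs.
    """
    ready = [c for o, c in zip(obs, calc) if o >= threshold]
    not_ready = [c for o, c in zip(obs, calc) if o < threshold]
    ready_ok = sum(1 for c in ready if c >= threshold)
    not_ready_ok = sum(1 for c in not_ready if c < threshold)
    return ready_ok / len(ready), not_ready_ok / len(not_ready)


if __name__ == "__main__":
    observed = [80.0, 70.0, 20.0, 10.0]   # made-up values
    predicted = [75.0, 50.0, 30.0, 65.0]  # made-up values
    print(r_squared(observed, predicted))
    print(class_rates(observed, predicted))
```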
Reporting
The model provides results for:
- ThCO2, %,
- Quantities of parent and biodegradation products, mol/mol parent,
- Applicability domain details.
References
1. S.D. Dimitrov, J.S. Jaworska, N. Nikolova, and O.G. Mekenyan, Probabilistic biodegradation modeling based on catabolic pathways, 9th International Workshop on Quantitative Structure-Activity Relationships in Environmental Sciences (QSAR'2000), Bourgas, 2000.
2. J. Jaworska, S. Dimitrov, N. Nikolova, and O. Mekenyan, Probabilistic assessment of biodegradability based on metabolic pathways: CATABOL system, SAR QSAR Environ. Res. 13 (2002), pp. 307-323.
3. S. Dimitrov, G. Dimitrova, T. Pavlov, D. Dimitrova, G. Patlewicz, J. Niemela, and O. Mekenyan, J. Chem. Inf. Model. 45 (2005), pp. 839-849.