Selection of Representative Constituents for Unknown, Variable, Complex, or Biological Origin Substance Assessment Based on Hierarchical Clustering
Environmental Toxicology and Chemistry (2021)
Many newly produced and registered substances are complex mixtures or substances of unknown or variable composition, complex reaction products, and biological materials (UVCBs). UVCBs often consist of a large number of constituents, some of which are difficult to identify, which complicates their (eco)toxicological assessment. In the present study, a series of examples illustrates different scenarios for selecting representative constituents via hierarchical clustering of UVCB constituents. Hierarchical clustering allows the individual chemicals to be grouped into small sets whose members are similar to each other with respect to more than one criterion. To this end, various similarity criteria and approaches for selecting representatives are developed and analyzed. Two types of selection are addressed: (1) selection of the most "conservative" constituents, which could also be used to support prioritization of UVCBs for evaluation, and (2) selection of a small set of representative chemicals that covers the structural and metabolic diversity of the whole target UVCB or mixture and can then be evaluated for environmental and (eco)toxicological properties. The first step is to generate all plausible UVCB or mixture constituents. It was found that the appropriate approach for selecting representative constituents depends on the target endpoint and on the physicochemical parameters affecting that endpoint.
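The two selection types can be illustrated with a minimal sketch. It is not the authors' implementation: the constituent names, descriptor values, and hazard scores are invented for illustration, and the choices of average linkage, Euclidean distance, a medoid as the "diversity" representative, and the highest hazard score as the "conservative" representative are all assumptions. Descriptors are assumed to be already scaled to comparable ranges.

```python
from itertools import combinations
from math import dist

# Hypothetical constituents with two pre-scaled descriptors each (illustrative values only).
names = ["A1", "A2", "A3", "B1", "B2", "B3"]
descriptors = [(0.30, 0.10), (0.32, 0.12), (0.31, 0.11),
               (0.60, 0.80), (0.63, 0.85), (0.61, 0.78)]
# Hypothetical endpoint-related score; higher is treated as more "conservative".
hazard_score = [0.2, 0.5, 0.3, 0.7, 0.9, 0.6]


def average_linkage(a, b, points):
    """Mean pairwise Euclidean distance between the members of clusters a and b."""
    total = sum(dist(points[i], points[j]) for i in a for j in b)
    return total / (len(a) * len(b))


def hierarchical_clusters(points, n_clusters):
    """Agglomerative clustering: merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(clusters[pair[0]], clusters[pair[1]], points),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b keeps a valid
        del clusters[b]
    return clusters


def medoid(cluster, points):
    """Member with the smallest summed distance to the other members of its cluster."""
    return min(cluster, key=lambda i: sum(dist(points[i], points[j]) for j in cluster))


def most_conservative(cluster, scores):
    """Member with the highest (hypothetical) hazard score in its cluster."""
    return max(cluster, key=lambda i: scores[i])


clusters = hierarchical_clusters(descriptors, n_clusters=2)
medoid_names = [names[medoid(c, descriptors)] for c in clusters]
conservative_names = [names[most_conservative(c, hazard_score)] for c in clusters]

print("clusters:", [[names[i] for i in c] for c in clusters])
print("diversity representatives (medoids):", medoid_names)
print("conservative representatives:", conservative_names)
```

The two selectors can pick different constituents from the same cluster, which is consistent with the observation that the appropriate selection approach depends on the endpoint of interest. In practice, the number of clusters and the similarity criteria would be chosen per endpoint rather than fixed as here.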