People are exposed to thousands of chemicals every day — through the products they use, the food they eat and the environments they live in — but only a fraction of those chemicals have been fully tested for safety.
Researchers at the Texas A&M College of Veterinary Medicine and Biomedical Sciences (VMBS) are turning to artificial intelligence to help close that gap, using new tools to predict chemical toxicity and determine how much those predictions can be trusted.
The work builds on a recent study published in Nature Communications that explores how artificial intelligence can predict chemical toxicity while also estimating how reliable those predictions are.
Dr. Weihsueh Chiu, a professor in VMBS’ Department of Veterinary Physiology and Pharmacology, is leading efforts to advance these tools and apply them to better understand chemical safety and risk.
"With the artificial intelligence tools we’re developing, we now have a way to estimate which exposure levels are unlikely to cause harm," Chiu said. "These tools could play a key role in regulatory decision-making, helping regulators identify which substances require further testing, stricter regulation or removal from the market."
A longstanding problem in toxicology
Traditionally, scientists have determined whether a chemical is safe through animal studies or human epidemiological research (studies that track how chemicals affect people over time), but both are time-consuming, expensive and limited in scope.
"With rodents, there’s not enough time or resources to test everything," Chiu said. "For human studies, people are already getting sick by the time those effects are identified."
This creates a massive data gap between the number of chemicals in commerce and those with reliable safety data — leaving many substances largely unstudied.
To address this, researchers have spent the past decade developing machine learning models — known as quantitative structure-activity relationship (QSAR) models — that use a chemical’s structure to estimate safe exposure levels.
But while these models can generate predictions, one major limitation has been transparency.
Many traditional systems operate as "black boxes," producing answers without explaining how they were reached — making them difficult for regulators and scientists to trust.
Chiu has previously helped address this issue through a two-stage machine learning framework designed to make predictions more interpretable.
Specifically, instead of relying on abstract molecular descriptors, the model uses familiar, real-world properties of a chemical, such as water solubility, biodegradability and toxicity indicators, to determine how these characteristics may influence the chemical's potential health effects.
This approach allows risk assessors to better understand why a prediction was made and not just what the prediction is.
The key innovation: Knowing what you don’t know
More recently, Chiu and collaborators expanded this work to include so-called "uncertainty-aware" machine learning, an approach that estimates how reliable each prediction is.
"We want these machine learning models to not only predict a number but also show how confident they are in that prediction," Chiu said. "That confidence depends on how much existing data the model has to draw from."
"Predictions are more reliable when similar chemicals have been studied and more uncertain when data is limited," he said. "This can help researchers identify which chemicals may require closer attention."
For example, two chemicals may appear equally toxic on paper, but if one prediction is far less certain, that chemical's worst-case risk could be much higher.
"Just because two chemicals have the same prediction doesn’t mean they carry the same worst-case risk," Chiu said.
To capture this, the models generate a range of possible outcomes for each prediction, showing how certain — or uncertain — the results are.
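To make the idea concrete, here is a minimal sketch of how two identical predictions can carry different worst-case risks. It is an illustration only, not the study's method: it assumes each prediction is a normal distribution on the log10 dose scale, and the function name `lower_bound`, the units and all numbers are hypothetical.

```python
from statistics import NormalDist


def lower_bound(median_log10_dose, sd_log10, quantile=0.05):
    """Return the dose at a lower quantile of the predicted range.

    Assumes (for illustration only) that the prediction is normally
    distributed on the log10 scale. Inputs are hypothetical.
    """
    dist = NormalDist(mu=median_log10_dose, sigma=sd_log10)
    return 10 ** dist.inv_cdf(quantile)


# Two hypothetical chemicals with the same central prediction
# (10**0 = 1 mg/kg-day) but different amounts of uncertainty.
chem_a = lower_bound(median_log10_dose=0.0, sd_log10=0.3)  # well-studied class
chem_b = lower_bound(median_log10_dose=0.0, sd_log10=1.0)  # data-poor class

print(f"Chemical A, 5th-percentile safe dose: {chem_a:.3f} mg/kg-day")
print(f"Chemical B, 5th-percentile safe dose: {chem_b:.3f} mg/kg-day")
```

Under these assumptions, the wider range for chemical B pushes its lower bound more than ten times below chemical A's, even though both have the same central estimate.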
When applied to more than 126,000 chemicals, these models revealed important patterns — not just in toxicity, but also in uncertainty.
Certain groups of chemicals — including metals, polychlorinated compounds and PFAS — showed higher levels of uncertainty, often due to limited data or complex chemical behavior that makes them harder to model.
"These insights can help us pinpoint where more research is needed and where to focus those efforts," Chiu said.
Rather than chasing the latest chemical of concern, this approach allows scientists to systematically identify where the biggest knowledge gaps exist across the entire chemical landscape.
From prediction to decision-making
For the researchers, using machine learning to flag safe or unsafe substances is only part of the solution; the uncertainty estimates also show when human expertise is still needed.
Chiu described this as a tiered approach — using AI for large-scale screening while reserving expert review for high-risk or highly uncertain cases.
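A tiered workflow of this kind could be sketched roughly as follows. The function `triage`, its cutoffs and the example values are hypothetical and chosen only to illustrate the logic; the article does not specify how the researchers define "high-risk" or "highly uncertain."

```python
def triage(median_dose, lower_dose, dose_cutoff=0.01, ratio_cutoff=100.0):
    """Route a chemical to automated screening or expert review.

    median_dose: central predicted safe dose (hypothetical units, mg/kg-day)
    lower_dose:  lower bound of the predicted range, same units
    Both cutoffs are illustrative assumptions, not values from the study.
    """
    if lower_dose < dose_cutoff:
        return "expert review: potentially high risk"
    if median_dose / lower_dose > ratio_cutoff:
        return "expert review: highly uncertain"
    return "automated screening"


# Hypothetical examples
print(triage(median_dose=1.0, lower_dose=0.3))      # narrow range
print(triage(median_dose=1.0, lower_dose=0.005))    # low worst-case dose
print(triage(median_dose=500.0, lower_dose=1.0))    # very wide range
```

The point of the design is that the uncertainty estimate, not just the central prediction, decides which chemicals reach a human reviewer.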
Although challenges remain — including limited data and reliance on previously conducted animal-based studies — the integration of AI marks a significant step forward.
As these tools continue to evolve, they could fundamentally change how scientists — and regulators — approach chemical safety, shifting from reactive testing to proactive prediction.

