Scientists come to the U.S. Department of Energy (DOE) national laboratories to solve big problems. Increasingly, these scientists are turning to artificial intelligence (AI) and machine learning (ML) to help them answer scientific questions. As AI and ML continue to scale and advance, so does the complexity of running them on supercomputers and distributed computing networks.
Scientists at the DOE’s Argonne National Laboratory are tackling this challenge by modeling, simulating, predicting and optimizing the performance of workflows. These workflows orchestrate and manage large computational and data science applications running on supercomputers connected by large data-transfer networks, such as those connecting the DOE’s national laboratories, user facilities and data storage centers across the country. A new project funded by the DOE, PosEiDon: Platform for Explainable Distributed Infrastructure, is turning to AI and ML to improve the performance of these workflows.
“By optimizing science workflows that run on distributed computing and data infrastructure, we will be able to accelerate scientific discovery,” said Prasanna Balaprakash, a computer science leader at Argonne whose research focuses on data-efficient machine learning methods for scientific applications. The results could speed up the discovery of new battery materials, aid in the exploration of the universe, advance the science of nuclear physics and improve climate simulations.