Updated 26/02/2025

interTwin @ KubeCon: Managing and testing containers for AI Digital Twins on a supercomputer


On April 4th, Diego Ciangottini (INFN) and Matteo Bunino (CERN) will present interTwin at KubeCon 2025 in London, UK.

From the event website:

CERN is advancing the development of AI-based digital twins in science through projects like interTwin, an EC-funded project to develop a digital twin engine for science. These digital twins rely on HPC resources for training multi-node, multi-GPU models using containerized workflows.
Developing such containers for HPC systems presents unique challenges, including accessing restricted HPC resources and integrating with HPC software stacks, while ensuring the interoperability between different container runtimes.
We introduce a CI/CD workflow that bridges cloud and HPC and enables automated testing of AI/ML containers on the same SLURM-managed clusters where they will be deployed. By integrating Dagger’s reproducible CI runtime with HPC offloading, this approach validates both the software in the containers and their compatibility with HPC environments. This ensures the seamless deployment of AI-based digital twins, addressing the critical need for robust testing in hybrid environments.
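
To make the idea of testing a container on the target SLURM cluster more concrete, here is a minimal Python sketch. It is not the workflow presented in the talk: it does not use Dagger or any offloading layer, and simply assumes the CI job can call `sbatch` directly and that Apptainer is available on the compute nodes. The function names, the image reference, and the test command are hypothetical placeholders.

```python
import shlex
import subprocess


def build_sbatch_script(image, test_cmd, nodes=1, gpus_per_node=1,
                        time_limit="00:15:00"):
    """Render a SLURM batch script that runs a test command inside a container.

    Assumption: Apptainer is installed on the compute nodes; `--nv` exposes
    the NVIDIA GPUs of the node to the container.
    """
    lines = [
        "#!/bin/bash",
        "#SBATCH --job-name=container-test",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --gpus-per-node={gpus_per_node}",
        f"#SBATCH --time={time_limit}",
        "set -euo pipefail",
        # test_cmd is inserted as-is so it can carry its own arguments.
        f"srun apptainer exec --nv {shlex.quote(image)} {test_cmd}",
    ]
    return "\n".join(lines) + "\n"


def parse_job_id(sbatch_output):
    """Extract the job ID from `sbatch --parsable` output ("id" or "id;cluster")."""
    return int(sbatch_output.strip().split(";")[0])


def submit(script):
    """Submit the script on stdin and return the SLURM job ID."""
    result = subprocess.run(
        ["sbatch", "--parsable"],
        input=script, text=True, capture_output=True, check=True,
    )
    return parse_job_id(result.stdout)


if __name__ == "__main__":
    # Hypothetical image and test command, for illustration only.
    script = build_sbatch_script("docker://example/image:latest", "pytest -q")
    print(script)
    # On a machine with SLURM available, one would then call: submit(script)
```

A CI pipeline built along these lines would submit the job, wait for it to finish (for example by polling `sacct`), and fail the build if the job does not complete successfully.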