
Murali Emani. Image: Argonne National Laboratory
Emani discusses his contributions to projects like the Trillion Parameter Consortium and AuroraGPT, and the role AI is playing in accelerating scientific discovery.
As a computer scientist in the Artificial Intelligence and Machine Learning (AI/ML) group at the Argonne Leadership Computing Facility (ALCF), Murali Emani works at the forefront of applying artificial intelligence (AI) to science. He co-leads the ALCF AI Testbed, where he evaluates novel AI accelerators and collaborates with researchers to integrate AI methods into complex scientific workflows.
Emani also contributes to broader efforts like the Trillion Parameter Consortium and the AuroraGPT project, where he helps shape strategies for training science-focused foundation models on state-of-the-art supercomputers such as ALCF’s Aurora exascale system. As a regular contributor to computing conferences, he’ll be presenting at ISC25 in June and remains deeply engaged in community-building through training programs and collaborative research with academia, national labs, and industry.
In this Q&A, Emani shares his perspective on AI hardware evaluation, multi-institutional research collaborations, and the growing role of AI in accelerating scientific discovery. The ALCF is a DOE Office of Science user facility at Argonne National Laboratory.
Q: Can you describe your role within the ALCF’s AI/ML group and the focus of your work?
I am a computer scientist in the AI/ML group, supporting ALCF users and computing resources through various research and development efforts. As a co-lead of the AI Testbed, I focus on performance evaluation of novel AI accelerators and collaborate closely with domain scientists to port and optimize their applications for these emerging platforms. I also worked on software stack evaluation for the Aurora supercomputer, engaging with Intel collaborators and leading several acceptance test applications to ensure the system meets the needs of our user community's scientific workloads. In addition, I lead multiple efforts aimed at advancing the state of the art in areas such as large language model (LLM) optimization and performance tuning at scale across diverse hardware platforms. I regularly share our work through training sessions, hackathons, Birds of a Feather events, and presentations at conferences like SC and ISC. I also mentor postdocs, staff, and summer students, supporting both their technical growth and professional development. My work is enriched by collaborations with outstanding researchers from universities, HPC centers, and industry.
Q: Can you describe your role with the ALCF AI Testbed and how these systems are supporting AI-driven science?
I have been the co-lead of the ALCF AI Testbed since its inception nearly four years ago. The AI Testbed hosts a range of novel AI accelerators that differ significantly from traditional
GPU-based systems—for instance, Cerebras features a wafer-scale engine with 40 GB of on-chip memory, while SambaNova offers accelerators with up to 1.5 TB of memory. My focus is to explore how the unique capabilities of these systems can advance scientific research. This involves close collaboration with hardware vendors to evaluate their platforms, provide technical feedback, and inform their roadmaps, as well as with domain scientists to understand how AI fits into their workflows and to assist in porting and optimizing models on these platforms. I also engage with emerging vendors at conferences like SC and ISC to explore potential additions to our Testbed portfolio. As part of the National Artificial Intelligence Research Resource (NAIRR) pilot, these resources are accessible to the broader academic community, and I actively support several NAIRR award recipients in leveraging our systems. In parallel, I lead a variety of training and outreach efforts for the AI Testbed, including multi-day vendor-led workshops, a session in the Argonne Training Program on Extreme-Scale Computing (ATPESC), and sessions at major conferences such as SC.
Q: What is your involvement in the Trillion Parameter Consortium?
I have been actively involved in the TPC since its inception and currently co-lead the Model Architecture and Performance Evaluation (MAPE) group, which includes members from DOE labs, academia, and industry. Our goal is to foster collaboration on topics such as efficient AI model training and inference strategies, architectural innovations, and optimization of mixture-of-experts models. A central focus of the group is sharing insights and best practices based on hands-on experience with diverse supercomputing platforms, including Aurora, Fugaku, and LUMI.
Q: Could you share the goals of the AuroraGPT project and your role in developing the model?
The AuroraGPT project aims to develop foundation models tailored for science, designed to support a wide range of AI applications across multiple scientific domains. These models are built to handle complex, multimodal data—including text, images, time series, and scientific code—enabling breakthroughs in areas such as materials science and biology. I am part of the "Models" group, where my primary focus is on optimizing large-scale training runs on the Aurora supercomputer. This includes improving performance, scalability, and resource utilization. I also collaborate closely with my peers to share best practices, troubleshoot challenges, and ensure efficient use of Aurora’s unique architecture for training foundation models.
Q: The “MProt-DPO” study, which you co-authored, was a finalist for the Gordon Bell Prize last year. What made this project stand out for you?
This project explores the use of Direct Preference Optimization (DPO) for protein design, achieving over 1 exaflop of sustained mixed-precision performance across five diverse supercomputers. It represents a major collaborative effort between domain scientists and our team, combining deep scientific expertise with advanced AI and performance engineering. This recognition is especially meaningful to me, as it highlights the potential of cutting-edge AI methods, including fine-tuning techniques, to tackle critical scientific challenges. Beyond the scientific outcomes, the project also involved developing a robust and scalable workflow that enabled efficient use of some of the world’s most powerful computing platforms. It demonstrates how the integration of AI methods with software and framework-level optimizations can significantly accelerate scientific applications at scale.
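For readers unfamiliar with DPO: it fine-tunes a model directly on pairs of preferred and dispreferred outputs, without training a separate reward model. The snippet below is a minimal, generic sketch of the standard DPO loss in PyTorch, included purely for illustration; it is not drawn from the MProt-DPO codebase, and the argument names and the beta value are assumptions.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Textbook DPO loss over a batch of preference pairs (illustrative only).

    Each argument holds the summed log-probabilities that the trainable policy
    (or the frozen reference model) assigns to the preferred ("chosen") or
    dispreferred ("rejected") sequence in each pair.
    """
    # How much more likely each sequence is under the policy being trained
    # than under the frozen reference model (in log space).
    chosen_logratios = policy_chosen_logps - ref_chosen_logps
    rejected_logratios = policy_rejected_logps - ref_rejected_logps

    # DPO pushes the policy to widen the margin between the preferred and
    # dispreferred sequence in each pair, scaled by the temperature beta.
    logits = beta * (chosen_logratios - rejected_logratios)
    return -F.logsigmoid(logits).mean()
```

Because the preference signal is folded directly into this loss, DPO avoids training and serving a separate reward model, which is part of what makes it attractive for large-scale runs; in a protein-design setting, the "chosen" and "rejected" entries would presumably be candidate sequences ranked by some fitness measure, with the exact pairing procedure described in the study itself.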
Q: You’ll be leading and participating in a few sessions at ISC25. What can attendees expect to learn?
At ISC25, I will be leading a tutorial on programming AI accelerators in the ALCF AI Testbed. The session will provide a detailed overview of the hardware and software characteristics of these emerging systems and guide attendees on how to get started with running various AI models. The goal is to help participants understand the unique capabilities of non-traditional (non-GPU-based) architectures and how they can be effectively leveraged for scientific applications. In addition, I will participate in two Birds of a Feather sessions:
“Leveraging Diverse AI Accelerators for Traditional (Non-AI) Applications” will explore the growing interest in repurposing AI hardware for broader workloads and discuss challenges related to programming models, compiler support, and software stack readiness.
“Challenges and Opportunities in Coupled HPC + AI Workflows” will address the integration of AI with traditional HPC workflows, highlighting current bottlenecks and emerging solutions.
Q: From your perspective, how is AI transforming scientific research, and what opportunities or challenges do you see ahead?
AI is revolutionizing scientific research by accelerating data analysis, enabling predictive modeling, and automating repetitive tasks—all of which speed the pace of discovery. Opportunities include uncovering hidden patterns in data and strengthening interdisciplinary collaboration. However, challenges remain in ensuring data quality, making AI models interpretable, and addressing ethical concerns around bias and accountability.