Declarative Analytics on Heterogeneous Exascale Systems

Ahmedur Rahman Shovon, University of Illinois Chicago
Seminar

The emergence of exascale computing, driven largely by GPU acceleration, has transformed high-performance computing (HPC). Declarative languages like Datalog are well placed to benefit from this shift, because their simple recursive rules can be compiled into GPU-optimized relational algebra operations. Unlike SQL, Datalog executes queries iteratively until a fixed point is reached, which makes it well suited to graph mining, deductive databases, and program analysis. Existing engines such as SLOG, LogicBlox, and Soufflé target multi-core architectures and lack support for multi-node, multi-GPU environments. Our research addresses this gap by developing the first multi-GPU, multi-node Datalog engine, combining CUDA for intra-node parallelism with MPI for inter-node communication. We introduce GPU-parallel implementations of relational joins, scalable recursive aggregation, and novel iterative all-to-all communication strategies. In evaluations on Argonne’s Polaris supercomputer, the engine achieved speedups of up to 32× over state-of-the-art distributed Datalog engines. These results point to possible extensions into domains such as topological data analysis and visual analytics, and they establish a foundation for declarative analytics on future HPC platforms.
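To make the fixed-point idea concrete, here is a minimal single-threaded Python sketch of semi-naive evaluation for the classic transitive-closure program. It is an illustration only and not the engine described in the talk: the function name `transitive_closure`, the `edge`/`path` relation names, and the hash-index layout are all assumptions chosen for this example. A distributed engine would partition these relations across GPUs and nodes, which this sketch does not attempt.

```python
def transitive_closure(edges):
    """Semi-naive evaluation of the Datalog program (illustrative only):

        path(x, y) :- edge(x, y).
        path(x, z) :- path(x, y), edge(y, z).
    """
    # Hypothetical index: edge keyed on its first column, so the join
    # path(x, y), edge(y, z) becomes a hash lookup on y.
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)

    path = set(edges)   # all facts derived so far
    delta = set(edges)  # facts that were new in the previous iteration

    # Iterate until no new facts appear, i.e. the fixed point is reached.
    while delta:
        # Join only the new facts against edge (the semi-naive step).
        derived = {(x, z) for (x, y) in delta for z in successors.get(y, ())}
        delta = derived - path  # keep only facts not seen before
        path |= delta
    return path


if __name__ == "__main__":
    chain = {(1, 2), (2, 3), (3, 4)}
    print(sorted(transitive_closure(chain)))
```

Each loop iteration is one join followed by a set difference and a union, which are the relational algebra operations the abstract refers to. Joining only `delta` rather than the full `path` relation avoids rederiving known facts on every pass.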