JSC Accelerating Devices Lab

Notes from the Accelerating Devices Lab, an ATML at Jülich Supercomputing Centre. Please handle with care.

Posts

Mar 21, 2025 by Adel
Accelerating Graph Similarity Calculations with FAS-GED
Poster in institute repository: https://doi.org/10.34734/FZJ-2024-06811 Graphs are powerful tools for representing real-world objects and relations in numerous domains, such as bioinformatics, pattern recognition, and computer vision. Quantifying their similarity or difference is crucial, but computationally expensive. The Graph Edit Distance (GED) is a popular metric that measures...
Jan 9, 2025 by Carolin
Wrapping up OpenGPT-X: Using Supercomputers to Train a European Language Model
Since early 2022, the Accelerating Devices Lab has been involved in the OpenGPT-X project. OpenGPT-X trains large language models to enable new data-driven business solutions and specifically address European needs. As of January 2025, the project has published its main results and is set to wrap up in early 2025....
Jan 7, 2025 by Chelsea
Advancing AI Benchmarking in HPC: Introducing CARAML and JPWR at SC24
From November 17th to 22nd, 2024, HPC professionals and researchers gathered in Atlanta, Georgia, for the Supercomputing Conference 2024. We presented a paper at the 2024 International Workshop on Performance, Portability, and Productivity in HPC where we introduced CARAML, a reproducible AI benchmarking framework, and jpwr, a custom energy assessment...
Dec 2, 2024 by Andreas
Paper: Application-Driven Exascale: The JUPITER Benchmark Suite
At the end of 2023, the contract was signed to build JUPITER, a new European supercomputer and the first to reach 1 ExaFLOP/s HPL performance. The winning bid was for a machine consisting of two parts (modules): JUPITER Booster, which uses 24 000 NVIDIA Grace-Hopper superchips to reach the ExaFLOP/s; and...
Jun 12, 2024 by Andreas
Paper: Many Cores, Many Models – GPU Programming Model vs. Vendor Compatibility Overview
In November of 2022, I created a table comparing GPU programming models and their support on GPUs of the three vendors (AMD, Intel, NVIDIA) for a talk. The audience liked it, so I beefed it up a little and posted it on this very blog. People still liked it, so...
Dec 12, 2023 by Chelsea
SC23 WHPC Workshop Paper: OpenGPT-X – Novel Architecture Exploration
The Supercomputing Conference 2023 took place in Denver, Colorado, from November 12th to 17th. For the Women in HPC workshop, we submitted a paper, which focused on benchmarking different accelerators for AI. The paper was accepted and I was invited to hold a lightning talk to show the work, spun...
Dec 5, 2023 by Stepan
Poster: GEMMERATOR – GEMM Kernel Generator
Poster in institute repository: http://dx.doi.org/10.34734/FZJ-2023-03437 During the RISC-V Summit Europe 2023 in Barcelona we presented our work generating highly optimized RISC-V and ARM GEMM microkernels for BLIS using a custom software tool. We presented results on the Fujitsu A64FX processor, the in-development RISC-V VEC processor from the EUPILOT project using...
Nov 29, 2023 by Jayesh
ISC23 Project Poster: SCALABLE: Scalable Lattice-Boltzmann Leaps to Exascale
Poster in institute repository: http://dx.doi.org/10.34734/FZJ-2023-04519 At the ISC High Performance Conference 2023 a little while ago, we presented a project poster on the SCALABLE project, embedded at the end of this post. The work was previously also presented as a paper at the Computing Frontiers 2023 conference. Overview SCALABLE...
Jul 18, 2023 by Chelsea
MPI as API: Using UCC's NCCL Backend for MPI's Allreduce
This post showcases how to use...
Jul 3, 2023 by Andreas
Helmholtz GPU Hackathon 2023
Together with our colleagues from Helmholtz-Zentrum Dresden-Rossendorf (HZDR) and in collaboration with HIDA and OpenHackathons, we hosted the Helmholtz GPU Hackathon 2023 in Jülich in May. I’ve blogged about the event for the Zweikommazwei blog of Forschungszentrum; it even includes some video statements we filmed with a few participants! It...
Jun 28, 2023 by Sebastian and Andreas
MSA Introduction Workshop
On May 29, we held a workshop about using the Modular Supercomputing Architecture (MSA) together with project partners from ParTec. The audience were collaborators from two projects funded in the...
May 26, 2023 by Chelsea
ISC23 Project Poster: OpenGPT-X – Training Large Language Models on HPC Systems
Poster publication: http://hdl.handle.net/2128/34532 The ISC High Performance Conference 2023 was held in Hamburg, Germany, from 21st May to 25th May. At the conference, we presented a project poster on the OpenGPT-X project, outlining the progress and initial exploration results. The poster was even featured in HPCWire’s May 24 recap of...
Nov 2, 2022 by Andreas
GPU Vendor/Programming Model Compatibility Table
For a recent talk at DKRZ in the scope of the natESM project, I created a table summarizing the current state of using a certain programming model on a GPU of a certain vendor, for C++ and Fortran. Since it led to quite a discussion in the session, I made...
Oct 10, 2022 by Andreas
Talk: Introduction to HPC
TL;DR: I held an HPC intro talk. Slides are below. In MAELSTROM, we connect three areas of science: 🌍Weather and climate simulation with 🤖Machine Learning methods and workflows using 📈HPC techniques and resources. Halfway into the project, we held a boot camp at JSC to teach this Venn diagram to...
Oct 6, 2022 by Carolin
Poster: OpenGPT-X - Training Large Language Models on HPC Systems
Poster publication: http://hdl.handle.net/2128/32006 The 14th JLESC workshop (JLESC: Joint Laboratory for Extreme-Scale Computing) was hosted by the National Center for Supercomputing Applications (NCSA) in Urbana, Illinois, from 28th September to 30th September. We had the opportunity to present the OpenGPT-X project in the form of a poster. On it, you can...
Oct 5, 2022 by Andreas
DOIng it Right! (DOIs for This Blog)
This blog is an experiment. We want to share bits and pieces of our work: the reports we write, the presentations we hold, the little discoveries we make, or even some first, water-testing investigations; and all the rest. It’s a documentation of what we do. Little bits of science,...
Aug 1, 2022 by Andreas
First Benchmarks with AMD Instinct MI250 GPUs at JSC
A few months ago, we extended the JURECA Evaluation Platform at JSC by two nodes with AMD Instinct MI250 GPUs (four GPUs each). The nodes are Gigabyte G262-ZO0 servers, each with a dual-socket AMD EPYC 7443 processor (24 cores per socket, SMT-2) and with four MI250 GPUs (128 GB...
Jul 21, 2022 by Albert
OOPS Version 1 Release
A few days ago, OPTIMA announced the release of deliverable 3.5, to which I contributed. This deliverable is part of a set of five deliverables under work package 3. But first, let’s talk about OPTIMA. OPTIMA is an EU-funded project whose goal is to prove that several HPC applications can...
Jul 13, 2022 by Carolin
A mathematician's introduction to transformers and large language models
This blog post is based on a presentation I held at the “New Trends in Computational Science in Engineering and Industrial Mathematics” workshop in Magdeburg on 01/07/2022. My goal is to give a brief introduction to the state of current large language models, the OpenGPT-X project, and the transformer...
Jun 29, 2022 by Andreas
10 Year Anniversary Workshop of NVIDIA Application Lab at Jülich
On June 21 and 22, we held a workshop looking back on the last ten years of our lab together with NVIDIA – the NVIDIA Application Lab at Jülich (or NVLab, as I sometimes abbreviate it). The material can be found in the agenda at Indico. We invited a set...
Jun 24, 2022 by Andreas
X-Dev Website at fz-juelich.de
At the end of May, Forschungszentrum Jülich got a new website. Finally, the group also has a dedicated page there, listing the things we do, the projects we are part of, and all the members in the group. And thanks to a photo shoot by Sebastian, we have great portraits...
Jun 24, 2022 by Stepan
MAELSTROM: First benchmarks at ISC22
On June 1, in the context of the AI and HPC for Weather and Climate Session at ISC2022, we presented the very first benchmark results for the MAELSTROM Machine Learning applications. The work presented highlights of some of the results obtained in the MAELSTROM Deliverable D3.4 and was shown following...
Jun 2, 2022 by Andreas
Multi-GPU Computing Tutorial at ISC22
On May 29, we held the first in-person tutorial since the start of the Covid pandemic. And while it was a little weird being back physically among people, it was also great at the same time. Gone are the challenges of teaching in video conferences, albeit also all the...
May 29, 2022 by Andreas
Hello, X-Dev Blog!
Welcome to the X-Dev Blog, the blog of the Accelerating Devices Lab at Jülich Supercomputing Centre of Forschungszentrum Jülich. We are an ATML, focusing on improving usage of GPUs and other accelerators, in close collaboration with our HPC users. We are taking part in different third-party-funded projects, in which we...