Notes from the Accelerating Devices Lab, an ATML at Jülich Supercomputing Centre. Please handle with care.
Posts
Paper: Application-Driven Exascale: The JUPITER Benchmark Suite
At the end of 2023, the contract was signed to build JUPITER, a new European supercomputer and the first to reach 1 ExaFLOP/s HPL performance. The winning bid was for a machine consisting of two parts (modules): JUPITER Booster, which uses 24 000 NVIDIA Grace-Hopper superchips to reach the ExaFLOP/s; and...
Paper: Many Cores, Many Models – GPU Programming Model vs. Vendor Compatibility Overview
In November of 2022, I created a table comparing GPU programming models and their support on GPUs of the three vendors (AMD, Intel, NVIDIA) for a talk. The audience liked it, so I beefed it up a little and posted it on this very blog. People still liked it, so...
SC23 WHPC Workshop Paper: OpenGPT-X – Novel Architecture Exploration
The Supercomputing Conference 2023 took place in Denver, Colorado, from November 12th to 17th. For the Women in HPC workshop, we submitted a paper that focused on benchmarking different accelerators for AI. The paper was accepted and I was invited to give a lightning talk to show the work, spun...
Poster: GEMMERATOR – GEMM Kernel Generator
Poster in institute repository: http://dx.doi.org/10.34734/FZJ-2023-03437 During the RISC-V Summit Europe 2023 in Barcelona, we presented our work generating highly optimized RISC-V and ARM GEMM microkernels for BLIS using a custom software tool. We presented results on the Fujitsu A64FX processor, the in-development RISC-V VEC processor from the EUPILOT project using...
ISC23 Project Poster: SCALABLE: Scalable Lattice-Boltzmann Leaps to Exascale
Poster in institute repository: http://dx.doi.org/10.34734/FZJ-2023-04519 At the ISC High Performance Conference 2023 a little while ago, we presented a project poster on the SCALABLE project, embedded at the end of this post. The work was also presented previously as a paper at the Computing Frontiers 2023 conference. Overview SCALABLE...
MPI as API: Using UCC's NCCL Backend for MPI's Allreduce
Environment Setup Enabling UCC in OpenMPI Enabling NCCL in UCC (Team Layer Selection) All The Variables Results 1. Plain OpenMPI 2. OpenMPI with UCC 3. OpenMPI with UCC+NCCL Scaling Plots Average Latency Bus Bandwidth Comparing MPI, UCC, UCC+NCCL Comparing UCC+NCCL, NCCL Summary Technical Details This post showcases how to use...
Helmholtz GPU Hackathon 2023
Together with our colleagues from Helmholtz-Zentrum Dresden-Rossendorf (HZDR) and in collaboration with HIDA and OpenHackathons, we hosted the Helmholtz GPU Hackathon 2023 in Jülich in May. I’ve blogged about the event for the Zweikommazwei blog of Forschungszentrum; it even includes some video statements we filmed with a few participants! It...
MSA Introduction Workshop
MSA Concept MSA Software Building Blocks Workshop Exercises 1: Hello World! 2. GPU Hello World! 3: CPU-GPU Ping Pong Slides On May 29, we held a workshop about using the Modular Supercomputing Architecture (MSA) together with project partners from ParTec. The audience were collaborators from two projects funded in the...
ISC23 Project Poster: OpenGPT-X – Training Large Language Models on HPC Systems
Poster publication: http://hdl.handle.net/2128/34532 The ISC High Performance Conference 2023 was held in Hamburg, Germany, from 21st May to 25th May. At the conference, we presented a project poster on the OpenGPT-X project, outlining the progress and initial exploration results. The poster was even featured in HPCWire’s May 24 recap of...
GPU Vendor/Programming Model Compatibility Table
For a recent talk at DKRZ in the scope of the natESM project, I created a table summarizing the current state of using a certain programming model on a GPU of a certain vendor, for C++ and Fortran. Since it led to quite a discussion in the session, I made...
Talk: Introduction to HPC
TL;DR: I gave an HPC intro talk. Slides are below. In MAELSTROM, we connect three areas of science: 🌍Weather and climate simulation with 🤖Machine Learning methods and workflows using 📈HPC techniques and resources. Halfway into the project, we held a boot camp at JSC to teach this Venn diagram to...
Poster: OpenGPT-X - Training Large Language Models on HPC Systems
Poster publication: http://hdl.handle.net/2128/32006 The 14th JLESC workshop (JLESC: Joint Laboratory for Extreme-Scale Computing) was hosted by the National Center for Supercomputing Applications (NCSA) in Urbana, Illinois, from 28th September to 30th September. We had the opportunity to present the OpenGPT-X project in the form of a poster. On it, you can...
DOIng it Right! (DOIs for This Blog)
This blog is an experiment. We want to share bits and pieces of our work; the reports we write, the presentations we hold, or the little discoveries we make, or even some first, water-testing investigations; and all the rest. It’s a documentation of what we do. Little bits of science,...
First Benchmarks with AMD Instinct MI250 GPUs at JSC
A few months ago, we extended the JURECA Evaluation Platform at JSC by two nodes with AMD Instinct MI250 GPUs (four GPUs each). The nodes are Gigabyte G262-ZO0 servers, each with a dual-socket AMD EPYC 7443 processor (24 cores per socket, SMT-2) and with four MI250 GPUs (128 GB...
OOPS Version 1 Release
A few days ago, OPTIMA announced the release of deliverable 3.5, to which I contributed. This deliverable is part of a set of five deliverables under work package 3. But first, let’s talk about OPTIMA. OPTIMA is an EU-funded project whose goal is to prove that several HPC applications can...
A mathematician's introduction to transformers and large language models
About This blog post is based on a presentation I gave at the “New Trends in Computational Science in Engineering and Industrial Mathematics” workshop in Magdeburg on 01/07/2022. My goal is to give a brief introduction to the state of current large language models, the OpenGPT-X project, and the transformer...
10 Year Anniversary Workshop of NVIDIA Application Lab at Jülich
On June 21 and 22, we held a workshop looking back on the last ten years of our lab together with NVIDIA – the NVIDIA Application Lab at Jülich (or NVLab, as I sometimes abbreviate it). The material can be found in the agenda at Indico. We invited a set...
X-Dev Website at fz-juelich.de
At the end of May, Forschungszentrum Jülich got a new website. Finally, the group also has a dedicated page there, listing the things we do, the projects we are part of, and all the members in the group. And thanks to a photo shoot by Sebastian, we have great portraits...
MAELSTROM: First benchmarks at ISC22
On June 1, in the context of the AI and HPC for Weather and Climate Session at ISC2022, we presented the very first benchmark results for the MAELSTROM Machine Learning applications. The work presented highlights some of the results obtained in the MAELSTROM Deliverable D3.4 and was shown following...
Multi-GPU Computing Tutorial at ISC22
On May 29, we held the first in-person tutorial since the start of the Covid pandemic. And while it was a little weird being back physically among people, it was also great at the same time. Gone are the challenges of teaching in video conferences, albeit also all the...
Hello, X-Dev Blog!
Welcome to the X-Dev Blog, the blog of the Accelerating Devices Lab at Jülich Supercomputing Centre of Forschungszentrum Jülich. We are an ATML, focusing on improving usage of GPUs and other accelerators, in close collaboration with our HPC users. We are taking part in different third-party-funded projects, in which we...