Wrapping up OpenGPT-X: Using Supercomputers to Train a European Language Model
Since early 2022, the Accelerated Devices lab has been involved in the OpenGPT-X project.1 OpenGPT-X trains large language models to enable new data-driven business solutions and specifically address European needs. As of January 2025, the project has published its main results and is set to wrap up in early 2025. In this blog post, we aim to share updates from recent events, highlight the project’s outcomes, and explain the critical role played by the HPC capabilities of JSC.
About OpenGPT-X
OpenGPT-X is a collaboration of ten consortium partners from industry, research, and the media. When the project began in the pre-ChatGPT era, language models had not yet been widely adopted or integrated into everyday life; researchers in the field, however, were already observing the impressive capabilities of models like GPT-3. OpenGPT-X covers the entire lifecycle of a language model: input data collection and curation, pre-training from scratch, fine-tuning, evaluation, and prototyping of applications. Many consortium partners supported these efforts with accompanying research activities.
Our role in the project
JSC played a key role in two central work packages of the OpenGPT-X project. Our lab led the work package responsible for providing the GPU-based computational infrastructure essential for training the models. For this purpose, JSC's flagship supercomputer, JUWELS Booster, was used to pre-train the language models, employing up to 128 nodes with a total of 512 NVIDIA A100 GPUs. Additional computational resources for tasks such as fine-tuning, experimentation, evaluation, and inference were provided by consortium partners IONOS and ZIH (TU Dresden).
In addition to providing technical support and ensuring smooth operations, we conducted research on benchmarking novel hardware architectures and optimization strategies. We also contributed co-design input to ensure that JUPITER, which is currently being deployed in Jülich and will soon be Europe's largest supercomputer, is well-suited for AI workloads.2
Our efforts were closely integrated with the Applied Machine Learning lab at JSC, which took a leading role in the core work package focused on developing large language models.
The OpenGPT-X Forum 2024
On November 4, 2024, the OpenGPT-X Forum at the Forum Digitale Technologien in Berlin brought project partners together to share their results with the general public. Our lab was represented by Chelsea and Caro, who showcased some of our contributions to the project. Chelsea presented her work on the CARAML benchmark suite3, which evaluates the training capabilities of various hardware accelerators in terms of throughput and energy efficiency. Find more information in her CARAML blog post. Caro delivered a talk on promising research directions related to efficient low-rank computations to reduce memory requirements in deep learning.
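To make the low-rank idea more concrete: the memory savings come from replacing a dense weight matrix with two much smaller factors. The snippet below is a generic illustration of this principle, not the specific method Caro presented; for a rank-r factorization, the parameter count drops from out·in to r·(out + in).

```python
# Illustrative sketch of a low-rank linear layer: the dense weight W (out x in)
# is replaced by two small factors U (out x r) and V (r x in).
# A generic example for illustration only.
import torch
import torch.nn as nn

class LowRankLinear(nn.Module):
    def __init__(self, in_features: int, out_features: int, rank: int):
        super().__init__()
        self.U = nn.Parameter(torch.randn(out_features, rank) * 0.02)
        self.V = nn.Parameter(torch.randn(rank, in_features) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Apply V first, then U, so the full W = U @ V is never materialized.
        return (x @ self.V.t()) @ self.U.t()

dense = nn.Linear(4096, 4096, bias=False)
low_rank = LowRankLinear(4096, 4096, rank=64)
print(sum(p.numel() for p in dense.parameters()))     # 16,777,216 parameters
print(sum(p.numel() for p in low_rank.parameters()))  #    524,288 parameters
```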
The event featured research presentations, a poster session, and live demonstrations, creating an engaging atmosphere and highlighting the project’s progress in developing multilingual AI “Made in Germany.”
The Teuken-7B Model
The publication of our language model Teuken-7B Instruct on November 26, 2024, marked a significant milestone for the project. Teuken-7B is a multilingual large language model with a European focus, covering all 24 official EU languages to reflect Europe's linguistic and cultural diversity. It employs a custom multilingual tokenizer4 for improved efficiency and was trained on more than 50% non-English data. The model's capabilities are demonstrated on the European LLM Leaderboard, an evaluation platform developed as part of the OpenGPT-X initiative.
Looking Ahead
The OpenGPT-X project provided an invaluable opportunity to explore the transformative potential of large language models. It offered us deep insights into every aspect of training neural networks on supercomputers, from solving unique technical challenges to integrating the resulting models into the broader LLM ecosystem. Armed with this expertise, we are excited to advance our research endeavors. Chelsea is leveraging her knowledge in her PhD research on Fourier Neural Operators for differential equation solvers. Meanwhile, Caro is eager to continue exploring low-rank methods to enhance the efficiency of deep learning. Both will incorporate hardware-specific features into their research.
As OpenGPT-X wraps up, we remain focused on applying its outcomes to future research, using the insights gained to tackle new challenges and advance AI development.
OpenGPT-X is funded by the Federal Ministry for Economic Affairs and Climate Action (BMWK) of Germany for the period 2022-2025. Compute time on the GCS Supercomputer JUWELS Booster at JSC is provided through the Gauss Centre for Supercomputing e.V.