TL;DR: I gave an HPC intro talk. Slides are below.

In MAELSTROM, we connect three areas of science: 🌍Weather and climate simulation with 🤖Machine Learning methods and workflows using 📈HPC techniques and resources. A few days ago, halfway into the project, we held a boot camp at JSC to teach this Venn diagram to a group of students. Some were ML experts, but had never used an HPC system. Others came from climate science, but had never applied ML methods to their problems. Using the applications of MAELSTROM as examples, participants of the boot camp could learn about all these cool things hands-on - at once. In addition, to give participants some context, lectures were held introducing weather and climate simulations, ML methods (especially focusing on large scales), and HPC. Guess what I presented? Right! HPC!

As I’d never had the opportunity to introduce the general field of HPC before (I usually just do the GPU stuff), I needed to create a presentation from scratch. It was quite some work, but I’m really happy with the result. There is much more to teach about HPC, but one can only do so much in 60 minutes.

As a hook, I tried using a definition of HPC I came up with: High Performance Computing is computing with a powerful machine using the available resources efficiently. It might be a little contrived for the talk at hand, but I wanted to focus both on the powerful machines themselves and on using them efficiently. The latter part is sometimes forgotten, but ever so important, especially in times of sky-rocketing energy prices. The slides start by comparing personal computers with HPC computers, getting interactive feedback from the audience along the way and assessing their experience with HPC. Then, I focus on a few historically important supercomputers, making my way to our JSC machines and finally to Frontier. The latter I use as an example to explain a little about GPUs. To focus on the software side of things (using resources efficiently), I came up with a weird, inverted pyramid of resource utilization: 1) exploit all capabilities of a processing entity, 2) parallelize, 3) distribute. For each point, the slides show an example of how to achieve it and the important technologies involved.

As usual, I made the slides with LaTeX Beamer, which I particularly enjoy when I’m able to use \foreach to create little boxes and repeating graphics – and there are plenty of those in this deck. TikZ is an amazing package which I use more and more of¹, to the detriment of typesetting durations… overlay, remember picture is basically in my muscle memory by now. For the first time, I also used tikzexternalize to save the diagram of an HPC node to a file and re-use it afterwards; LaTeX wouldn’t want to generate it 96 times (boooh), so I inserted a hidden slide before, generated the image with tikzexternalize there, and then re-used it with \includegraphics 96 times – with \foreach, of course.
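A minimal sketch of that externalize-then-reuse trick (the output prefix, the file name `hpc-node`, and the placeholder rectangle are my assumptions, not the actual slide source):

```latex
% Preamble: render each TikZ picture to a standalone PDF once
\usepackage{tikz}
\usetikzlibrary{external}
\tikzexternalize[prefix=figures/]

% On a hidden slide: draw the node diagram a single time
\tikzsetnextfilename{hpc-node}% output lands in figures/hpc-node.pdf
\begin{tikzpicture}
  \draw (0,0) rectangle (2,1) node[midway] {CPU}; % stand-in for the real diagram
\end{tikzpicture}

% Later: place the pre-rendered PDF 96 times; no TikZ re-run needed
\begin{tikzpicture}
  \foreach \x in {0,...,11}
    \foreach \y in {0,...,7}
      \node at (\x,0.6*\y) {\includegraphics[width=6mm]{figures/hpc-node}};
\end{tikzpicture}
```

One caveat of this approach: with externalization active, the 12×8 grid picture itself also gets externalized; wrapping it in \tikzexternaldisable … \tikzexternalenable avoids that if it becomes a problem.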

Find the slides embedded below² and in referable form as hdl.handle.net/2128/32001 at our library.

1. It makes placing things free-floating on a slide so much easier.

2. This is actually a minified version of the slides, using low-res versions of the images; to reproduce it, add this function to your shell configuration: minify-pdf () { in="$1"; out="${1%.*}"; gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/prepress -dNOPAUSE -dQUIET -dBATCH -sOutputFile="${out}--minified.pdf" "$in"; }