Resources

This page is meant to bridge the gap between your skills and those required to thrive in the course. At its core, it’s a glorified collection of links; the most helpful way to use it is to find what you’re confused about and check out all the links in the relevant paragraphs.

The course relies heavily on Victor Eijkhout’s Art of HPC (especially volume 4), which goes into greater depth than we do in this class; if the content here isn’t enough for you, that’s a great place to look for more.

The Supercomputer

Software on the supercomputer is generally accessed with modules.

Jobs are submitted via Slurm.

“Getting Started” playlist on BYUSupercomputing’s YouTube channel.

An overview of supercomputer architecture.

The Project

From phase 2 onward, you’ll want reference input and output files to test your work; you can find them in wavefiles.tar.gz. They can be downloaded and extracted with:

wget https://byuhpc.github.io/sci-comp-course/project/wavefiles.tar.gz
tar xf wavefiles.tar.gz

Within the resultant wavefiles/bin directory are three helpful binaries: wavesolve, waveshow, and wavediff. wavesolve is a reference solver that works with up to 8 dimensions. waveshow prints wave orthotope files in human-readable form. wavediff checks whether two wave orthotope files represent the same wave orthotope. Call each with --help as the only argument to learn how to use them.

You’ll also find many wave orthotope files from 1 to 8 dimensions (in case you’re doing the extra credit), stored in the directories 1D, 2D, etc. Files with “in” in their names are input files, initialized and with simulation time zero; the corresponding “out” files are correct output files.

As an example, 2d-small-out.wo is the same wave orthotope as the one represented by 2d-small-in.wo, but after solving. You could test your implementation using these two files by running your solver on the input file:

./wavesolve_serial wavefiles/2D/2d-small-in.wo my-2d-small-out.wo

…then using the included wavediff binary to ensure that your output file is correct:

wavefiles/bin/wavediff wavefiles/2D/2d-small-out.wo my-2d-small-out.wo

You could also use WaveSim.jl to ensure that your output file is correct:

using WaveSim
correct = WaveOrthotope(wavefiles(2, :small, :in))
mine = WaveOrthotope("my-2d-small-out.wo")
@assert isapprox(correct, mine)

Using WaveSim.jl allows you to look at wave orthotopes interactively.

Programming

You’re expected to come into the class with either some C++ experience or the ability to pick up languages quickly, so we don’t teach programming in general or C++ specifically.

Unless you have an established workflow for programming on the supercomputer, we strongly recommend setting up VS Code for remote editing. You’ll find the C++ and Julia extensions helpful. The enlightened will love the Vim extension.

Since this is a class about high performance computing, you’ll do some optimization. An efficient algorithm is of course vital for speed, but data locality is just as critical and not talked about enough. For example, linked lists are theoretically faster than arrays for some operations (such as inserting in the middle of a sequence), but their nodes are scattered through memory, which defeats caching and prefetching, so they’re almost never actually the right choice. Vector instructions enable significant speedups now that clock speeds have plateaued.

Keep in mind that C++ indexes from 0 and Julia (mostly) indexes from 1, although you usually shouldn’t need to worry about that in Julia.

C++

C++ is more expressive than C while retaining most of its benefits, but it is complex and carries a lot of baggage from its long history. It’s easy enough to get a grasp of the basics, but mastery and learning good style require more time and effort than in most languages you’re likely to encounter. Bjarne Stroustrup’s “The C++ Programming Language” is the definitive guide to using C++, but it’s probably overkill to buy it just for this class when acceptable online tutorials like the one offered by W3Schools are available. Most online tutorials (including the W3Schools one), though, should be taken with a grain of salt: they often encourage bad style, so it’s best to be guided by our example code and expert opinion when possible.

Compilation

C++ is a compiled language: before the code can run, the human-readable code (a text file) must be translated to an executable (a binary machine code file). This is accomplished with a compiler like GCC or Clang (part of the LLVM project). If you don’t yet have access to the supercomputer and don’t have a compiler installed locally, you can try OnlineGDB’s C++ compiler; to use C++20, select it from the dropdown at the upper right. On the supercomputer, writing, compiling, and running a hello world program in C++ looks something like:

cat > hello.cpp << EOF
#include <iostream>
int main() {
	std::cout << "Hello, world!" << std::endl;
}
EOF
module load gcc/14.1
g++ -g -std=c++20 -o hello hello.cpp
./hello

Typical Knowledge Gaps

Many students have been taught to use C++ like C, so they aren’t familiar with any of the features that make C++ more bearable to use for non-trivial projects. In particular:

They don’t use (and in some cases haven’t even heard of) auto as liberally as they should.
They aren’t familiar with templates, which are vital for writing flexible code. In particular, most of the extra credit for the project is extremely challenging without template metaprogramming.
They’ve never used closures, which are essential for the more functional programming C++ is evolving toward.
They fail to use standard algorithms when they should, resulting in excess boilerplate and duplication of effort.

One aspect of C++ that trips up many students is the fact that ‘&’ is used for so many things: to obtain pointers, declare and pass arguments by reference, perform logical and bitwise AND operations, and more. It’s in the running for the most cursed character in any programming language.

Debugging, Profiling, and Optimization

GDB is ubiquitous for debugging C++ programs. If you prefer a graphical debugger, you can integrate GDB into VS Code. If you do so, you’ll probably want to modify tasks.json by changing command to the result of module load gcc/14.1 && which g++ and adding -std=c++20 to args. Valgrind is essential for tracking down memory problems. For debugging MPI applications, I haven’t found anything better than using tmpi with gdb (module load reptyr tmpi), although I hear that MPI debugging is possible within VS Code now.

There are many tools available for profiling in C++; perf is a good, simple choice in combination with Valgrind. Profiling and optimization in C++ are hard, and this class will be the start of a long journey.

Catch2 is a ubiquitous C++ testing framework that integrates with CMake. module load catch2 makes it available on the supercomputer.

Julia

Julia is like Python on steroids: it’s more expressive, it’s faster, it was designed from the ground up for HPC, and it has a powerful REPL. It has its quirks and isn’t the right tool for everything, but it’s probably the best widely-used language available to write programs for supercomputers. The example code is a good start for the Julia phases of the project.

Revise makes development in Julia (especially packages) a breeze. Pluto notebooks are nice if you’re used to a Jupyter-like interface.

Profiling, optimization, and debugging (both in VS Code and in the REPL) in Julia are much easier than in C++. ProfileView and BenchmarkTools in particular are very helpful for performance. Unit testing is easy, especially for packages.