Madgwick.xyz

January 12, 2021 (2021-01-12)

Compiling a kernel to use ROCm without DKMS on Debian

At the time of writing, installing AMD’s ROCm [1] generally means installing a ROCm-specific version of the ‘amdgpu’ kernel module (via dkms) alongside the userland utilities. For this to install and work correctly, a supported OS (and thus a supported kernel version) needs to be used.

This is fine if you’re running Ubuntu LTS or RHEL with the default kernel. If you are running something more exotic it likely won’t work. I used to run Ubuntu but now prefer to use Debian, which ROCm doesn’t support. Ubuntu is based on Debian so getting things working is easy enough.

Using ROCm with an ‘upstream’ kernel

The ROCm documentation mentions that so-called ‘upstream’ kernels can be used instead of installing the ROCm-bundled dkms module [2]. So why the quote marks? A while ago, Ubuntu LTS shipped a kernel whose amdgpu was too old to support ROCm well. Newer kernels did support it, and so these were ‘upstream’ of Ubuntu’s LTS kernel. This is no longer the case with Ubuntu 20.04, making the term ‘upstream’ mostly irrelevant. On Ubuntu the dkms module is still required, because the amdgpu module (and kernel config) included by default doesn’t contain the ‘HSA kernel driver’ (aka ‘amdkfd’) that ROCm requires.

So, starting with a newish kernel (I used 5.4.86), how do we get ROCm working? The documentation simply states to install the userland utilities and set a udev rule. There’s one step missing. First you must ensure the kernel you’re using has a version of amdgpu compiled with the ‘HSA kernel driver’ [3] enabled. On Debian the default kernel configs have this disabled. It can be enabled by navigating to ‘Device Drivers > Graphics support > HSA kernel driver for AMD GPU devices’ in make menuconfig and switching it on.
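A quick way to tell whether a given kernel already has the driver is to look for the matching symbol in its config file. The following is a minimal sketch, assuming the Kconfig symbol behind that menu entry is CONFIG_HSA_AMD and that the config is installed as /boot/config-<release>, as Debian does. The helper name hsa_enabled is made up for illustration.

```shell
#!/bin/sh
# Succeed if a kernel config file enables the amdgpu HSA driver (amdkfd).
# hsa_enabled is a hypothetical helper name; CONFIG_HSA_AMD is the Kconfig
# symbol assumed to sit behind the menuconfig entry.
hsa_enabled() {
    config_file="$1"
    grep -q '^CONFIG_HSA_AMD=y$' "$config_file"
}

# Debian installs each kernel's config as /boot/config-<release>.
current_config="/boot/config-$(uname -r)"
if [ -r "$current_config" ] && hsa_enabled "$current_config"; then
    echo "HSA kernel driver enabled in running kernel"
else
    echo "HSA kernel driver not enabled (or config not found)"
fi
```

A disabled option appears in the config as ‘# CONFIG_HSA_AMD is not set’, which the pattern deliberately does not match.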

Quick steps for upgrading from vanilla Debian kernel to newer kernel and enabling amdgpu HSA driver
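One way these steps might look is sketched below. It is a sketch under assumptions, not necessarily the exact commands used here: the build-dependency package names, the kernel.org download URL, the use of scripts/config in place of make menuconfig, and the function name build_hsa_kernel are all assumptions. Clearing SYSTEM_TRUSTED_KEYS is a common workaround when reusing a Debian config on a vanilla tree, since that option points at a certificate file the vanilla source does not ship.

```shell
#!/bin/sh
# Sketch: build kernel 5.4.86 as Debian packages with the HSA driver enabled.
# build_hsa_kernel is a hypothetical name. The body runs in a subshell, so
# 'set -e' and 'cd' do not leak into the calling shell, and nothing happens
# until the function is actually called.
build_hsa_kernel() (
    set -e
    version="5.4.86"

    # Build dependencies (assumed package names for Debian Buster).
    sudo apt-get install -y build-essential bc bison flex kmod cpio rsync \
        libssl-dev libelf-dev wget xz-utils

    # Fetch and unpack the kernel source (assumed kernel.org URL layout).
    wget "https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-${version}.tar.xz"
    tar -xf "linux-${version}.tar.xz"
    cd "linux-${version}"

    # Start from the config of the currently running Debian kernel.
    cp "/boot/config-$(uname -r)" .config

    # Enable the HSA kernel driver (non-interactive equivalent of menuconfig).
    scripts/config --enable HSA_AMD
    # Debian's config references a certificate file absent from vanilla trees.
    scripts/config --set-str SYSTEM_TRUSTED_KEYS ""

    # Take the default for every option that is new since the old config.
    make olddefconfig

    # Build .deb packages; they are written to the parent directory.
    make -j"$(nproc)" bindeb-pkg
)
```

After calling build_hsa_kernel, the resulting linux-image and linux-headers packages can be installed with dpkg -i from the directory the function was run in, followed by a reboot into the new kernel.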

Once you’re running a kernel with the HSA kernel driver compiled into amdgpu, you can use the ROCm userland. If you’re using Debian, it’s worth noting that on ‘Buster’ I wasn’t able to install the ‘rocm-dev’ package set (which is intended for Ubuntu), because one of the less important packages depended on a package missing from Debian. I was instead able to install ‘comgr hip-base hip-rocclr hsakmt-roct llvm-amdgpu rocm-cmake rocm-device-libs rocm-smi-lib64 rocm-smi rocm-utils hsa-rocr-dev hsakmt-roct-dev’, which got me everything I needed to compile and run HIP code. I also needed to export LD_LIBRARY_PATH=/opt/rocm-4.0.0/opencl/lib/ to fix a problem with OpenCL programs being unable to find libraries. This is in addition to adding /opt/rocm-4.0.0/bin/ to PATH.
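The two environment changes can be written as below. This is a small sketch using the ROCm 4.0.0 paths from this post. The ROCM_ROOT variable is only a convenience introduced here, and where the lines are placed (for example a shell profile) is up to you.

```shell
# ROCm 4.0.0 install prefix, as used in the paths above.
# ROCM_ROOT is just a local convenience variable, not something ROCm requires.
ROCM_ROOT="/opt/rocm-4.0.0"

# Put the ROCm binaries on PATH.
export PATH="${ROCM_ROOT}/bin:${PATH}"

# Let OpenCL programs find the ROCm OpenCL libraries, keeping any existing value.
export LD_LIBRARY_PATH="${ROCM_ROOT}/opencl/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
```

The ${LD_LIBRARY_PATH:+:...} expansion appends the previous value only if one was set, which avoids leaving a trailing colon (an empty entry that the loader treats as the current directory).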

References