Run Ollama using podman with amdgpu on Ubuntu 24.04


I have some free time this afternoon, so I decided to try running Ollama on podman. Ollama provides a Docker image on Docker Hub at https://hub.docker.com/r/ollama/ollama, so I launched a container following the instructions on the Docker Hub page.

podman run -d -v ollama:/root/.ollama -p 11434:11434 \
    --name ollama ollama/ollama
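
Once the container is up, a quick way to confirm the server is responding is to query the HTTP API on the published port:

curl http://localhost:11434/
# the server replies with "Ollama is running"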

My laptop (a ThinkPad T14 Gen 3 AMD) has an AMD GPU, so I reran Ollama with GPU support.

podman run -d --device /dev/kfd --device /dev/dri \
    -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama:rocm
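
Before touching the container, it is worth checking the device ownership on the host side; on Ubuntu, /dev/kfd and the render nodes are typically owned by the render group (exact ownership can vary by distribution):

ls -l /dev/kfd /dev/dri/renderD*
# typically: crw-rw---- 1 root render ... /dev/kfd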

However, the log messages shown by podman logs ollama indicate that Ollama cannot access the /dev/kfd and /dev/dri devices. Google suggests the issue is that my local account does not have permission to access /dev/kfd [1][2]. So I added my account to the render group and restarted my machine. After the restart, my account has access to /dev/kfd, as shown by the output of rocminfo.

sudo usermod -aG render $USER
sudo reboot
sudo apt install rocminfo
rocminfo
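
After the reboot, it is worth double-checking that the group change actually took effect before blaming anything else:

id -nG | tr ' ' '\n' | grep render    # should print "render"
rocminfo | grep -i gfx                # the GPU agent's name reports its gfx target (gfx1035 here)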

As I am running podman rootless (as an unprivileged user), the container's root is mapped to my current $USER account. /dev/kfd and /dev/dri therefore show up as nobody/nogroup inside the container, which should be fine as long as the container's root also keeps membership of the host group that owns those devices.
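
You can inspect the rootless user-namespace mapping behind this with podman unshare (the ranges come from /etc/subuid and /etc/subgid, and the values below are just what a typical setup looks like):

podman unshare cat /proc/self/uid_map
#          0       1000          1     <- container root maps to my host UID
#          1     100000      65536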

podman exec ollama ls -al /dev
total 4
drwxr-xr-x 6 root   root       380 Sep 22 08:30 .
dr-xr-xr-x 1 root   root      4096 Sep 22 08:30 ..
lrwxrwxrwx 1 root   root        11 Sep 22 08:30 core -> /proc/kcore
drwxr-xr-x 2 root   root        80 Sep 22 08:30 dri
lrwxrwxrwx 1 root   root        13 Sep 22 08:30 fd -> /proc/self/fd
crw-rw-rw- 1 nobody nogroup   1, 7 Sep 22 07:50 full
crw-rw---- 1 nobody nogroup 234, 0 Sep 22 07:50 kfd

I reran the podman container with the following command:

podman run --pull newer --detach -e OLLAMA_DEBUG=1 \
    --security-opt label=type:container_runtime_t --replace --group-add keep-groups \
    --device /dev/kfd --device /dev/dri -v ollama:/root/.ollama \
    -p 11434:11434 --name ollama ollama/ollama:rocm
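
The --group-add keep-groups option is the important part here: it tells podman to let the container process keep my host supplementary groups, including render, so it can open /dev/kfd even though the device shows up as nobody:nogroup (note that keep-groups requires the crun OCI runtime). A quick way to verify from inside the container:

podman exec ollama sh -c 'test -r /dev/kfd && echo "/dev/kfd is accessible"'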

Ollama starts, and from the logs I can see that it is able to detect my GPU; however, Ollama says it cannot utilize it.

time=2024-09-22T08:46:46.238Z level=DEBUG source=amd_linux.go:218 msg="mapping amdgpu to drm sysfs nodes" amdgpu=/sys/class/kfd/kfd/topology/nodes/1/properties vendor=4098 device=5761 unique_id=0
time=2024-09-22T08:46:46.238Z level=DEBUG source=amd_linux.go:252 msg=matched amdgpu=/sys/class/kfd/kfd/topology/nodes/1/properties drm=/sys/class/drm/card1/device
time=2024-09-22T08:46:46.238Z level=DEBUG source=amd_linux.go:284 msg="amdgpu memory" gpu=0 total="1.0 GiB"
time=2024-09-22T08:46:46.238Z level=DEBUG source=amd_linux.go:285 msg="amdgpu memory" gpu=0 available="531.6 MiB"
time=2024-09-22T08:46:46.238Z level=DEBUG source=amd_common.go:18 msg="evaluating potential rocm lib dir /usr/lib/ollama"
time=2024-09-22T08:46:46.238Z level=DEBUG source=amd_common.go:61 msg="detected ROCM next to ollama executable /usr/lib/ollama"
time=2024-09-22T08:46:46.242Z level=DEBUG source=amd_linux.go:337 msg="rocm supported GPUs" types="[gfx1030 gfx1100 gfx1101 gfx1102 gfx900 gfx906 gfx908 gfx90a gfx940 gfx941 gfx942]"
time=2024-09-22T08:46:46.242Z level=WARN source=amd_linux.go:341 msg="amdgpu is not supported" gpu=0 gpu_type=gfx1035 library=/usr/lib/ollama supported_types="[gfx1030 gfx1100 gfx1101 gfx1102 gfx900 gfx906 gfx908 gfx90a gfx940 gfx941 gfx942]"
time=2024-09-22T08:46:46.242Z level=WARN source=amd_linux.go:343 msg="See https://github.com/ollama/ollama/blob/main/docs/gpu.md#overrides for HSA_OVERRIDE_GFX_VERSION usage"

So the ROCm library bundled with Ollama supports a number of GPU types [gfx1030 gfx1100 gfx1101 gfx1102 gfx900 gfx906 gfx908 gfx90a gfx940 gfx941 gfx942], and my GPU, gfx1035, is not on the list. According to the Ollama GPU documentation [3], we can force the library to use a similar target by setting the environment variable HSA_OVERRIDE_GFX_VERSION="10.3.0". The version string "10.3.0" is used because the nearest supported target to my GPU is gfx1030. I relaunched my Ollama container with this variable added.

podman run --pull newer --detach -e OLLAMA_DEBUG=1 -e HSA_OVERRIDE_GFX_VERSION="10.3.0" \
    --security-opt label=type:container_runtime_t --replace --group-add keep-groups \
    --device /dev/kfd --device /dev/dri -v ollama:/root/.ollama \
    -p 11434:11434 --name ollama ollama/ollama:rocm
podman logs ollama
time=2024-09-22T08:53:32.277Z level=DEBUG source=amd_linux.go:284 msg="amdgpu memory" gpu=0 total="1.0 GiB"
time=2024-09-22T08:53:32.277Z level=DEBUG source=amd_linux.go:285 msg="amdgpu memory" gpu=0 available="492.1 MiB"
time=2024-09-22T08:53:32.277Z level=DEBUG source=amd_common.go:18 msg="evaluating potential rocm lib dir /usr/lib/ollama"
time=2024-09-22T08:53:32.277Z level=DEBUG source=amd_common.go:61 msg="detected ROCM next to ollama executable /usr/lib/ollama"
time=2024-09-22T08:53:32.277Z level=INFO source=amd_linux.go:349 msg="skipping rocm gfx compatibility check" HSA_OVERRIDE_GFX_VERSION=10.3.0
time=2024-09-22T08:53:32.277Z level=INFO source=types.go:107 msg="inference compute" id=0 library=rocm variant="" compute=gfx1035 driver=0.0 name=1002:1681 total="1.0 GiB" available="492.1 MiB"
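
To sanity-check that inference really runs on the GPU, run a small model and look at where Ollama places it (llama3.2 is just an example; any model small enough for the available VRAM will do):

podman exec -it ollama ollama run llama3.2 "hello"
podman exec ollama ollama ps   # the PROCESSOR column reports how much of the model sits on the GPU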

The output shows my GPU is being used. What a pity that the hardware is there, but the driver provided by the hardware company does not officially support it…

References

  1. https://github.com/ROCm/ROCm/issues/1798
  2. https://github.com/ROCm/ROCm/issues/1211
  3. https://github.com/ollama/ollama/blob/main/docs/gpu.md#overrides