A question about running LLMs with an AMD card

Gunpachi@lemmings.world · 10 months ago

taladar@sh.itjust.works · 10 months ago

To do general purpose GPU calculations on AMD hardware you need a GPU that is supported by ROCm (AMD’s equivalent to CUDA). Most of the gaming GPUs are not.

There is a list here, but be aware that it covers the latest ROCm version; some tools might still use older versions with different supported devices.

https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html#supported-gpus

madnificent@lemmy.world · 10 months ago

Has that changed recently? I’ve run ROCm successfully on an RX6800. I seem to recall the card was supported, but the host OS (Arch) was not.

taladar@sh.itjust.works · 10 months ago

When I tried it maybe a year or so ago, there were four supported chipsets in that version of ROCm (5.4.2, I think), but I don’t remember which card models those were, since they were only specified by their internal chip names. Mine (5700XT) wasn’t supported at the time.

turbodrooler@lemmy.world · 10 months ago

No, GFX1030 is still supported.

turbodrooler@lemmy.world · 10 months ago

This link is misleading. For example, the Radeon RX6800 IS supported because it uses the same chip, GFX1030, as one of the Radeon Pros. Many others are too, though support does not go very far back.
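Since the support list is keyed by chip name rather than card model, it can help to see which gfx name your own card reports. Here is a minimal Python sketch, assuming the `rocminfo` tool from a ROCm install is on your PATH and that it prints names of the form `gfx1030`; the function names are made up for illustration, and the regex is a guess at the naming pattern rather than anything official.

```python
import re
import shutil
import subprocess

# Assumption: chip names look like "gfx" followed by 3-4 hex digits (gfx1030, gfx90a).
GFX_PATTERN = re.compile(r"\bgfx[0-9a-f]{3,4}\b")


def find_gfx_targets(text):
    """Return the sorted, de-duplicated gfx names found in a block of text."""
    return sorted(set(GFX_PATTERN.findall(text)))


def local_gfx_targets():
    """Run rocminfo if available; return [] when it is missing or fails."""
    exe = shutil.which("rocminfo")
    if exe is None:
        return []
    try:
        result = subprocess.run([exe], capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.SubprocessError):
        return []
    return find_gfx_targets(result.stdout)


if __name__ == "__main__":
    print(local_gfx_targets())
```

You can then compare whatever it prints against the chip names in the ROCm docs for the version your tool actually uses.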

exu@feditown.com · 10 months ago

Llama.cpp supports OpenCL as well, and it performs better than ROCm in my limited experience. That should work on basically any GPU.

madnificent@lemmy.world · 10 months ago

Latest ollama has support for AMD GPUs. I had to compile from source to make it pick up the GPU on my system.

turbodrooler@lemmy.world · 10 months ago

Look into llamafile. It makes things so easy.

Falcon@lemmy.world · 9 months ago

Basically, ROCm and CUDA allow one to do math on the GPU. Most linear algebra operations (e.g. those behind LLMs, NNs, and ML generally) can be parallelized over a GPU, which is much more performant than a CPU.
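To illustrate why this kind of math parallelizes so well, here is a toy sketch in plain Python: each row of a matrix product can be computed without knowing anything about the other rows. The function names are invented for the example, and Python threads will not make this faster; it only shows the independence that a GPU exploits with thousands of cores.

```python
from concurrent.futures import ThreadPoolExecutor


def row_times_matrix(row, matrix):
    """Multiply one row vector by a matrix given as a list of rows."""
    cols = len(matrix[0])
    return [sum(row[k] * matrix[k][j] for k in range(len(row))) for j in range(cols)]


def matmul_by_rows(a, b):
    """Compute a @ b, handing each output row to a separate worker task."""
    with ThreadPoolExecutor() as pool:
        # pool.map preserves input order, so rows come back in the right place.
        return list(pool.map(lambda row: row_times_matrix(row, b), a))


if __name__ == "__main__":
    print(matmul_by_rows([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```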

To perform calculations on the GPU, one needs some sort of interface to their programming language of choice. NVIDIA has CUDA, which is written in C++ with bindings to Python (PyTorch, TensorFlow, etc.) and Julia (Flux, etc.).

ROCm is AMD’s solution; its bindings are young and not widely implemented.

My advice: play around with Flux ROCm and PyTorch ROCm just to get an idea. Suffice it to say, when I started doing RL and LLMs more seriously I gave up my Colab and sold my AMDs to fund a 3060.
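If you do try PyTorch on ROCm, a quick first check is whether your install sees the GPU at all. A minimal sketch, assuming a standard PyTorch install: ROCm builds reuse the `torch.cuda` API, and `torch.version.hip` is set on ROCm builds and `None` on CUDA builds. The helper name `describe_torch_backend` is just for illustration.

```python
def describe_torch_backend():
    """Report which compute backend the installed PyTorch can use, if any."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    # ROCm builds expose the GPU through the same torch.cuda interface.
    if not torch.cuda.is_available():
        return "cpu only"
    # torch.version.hip is a version string on ROCm builds, None otherwise.
    if getattr(torch.version, "hip", None):
        return "rocm"
    return "cuda"


if __name__ == "__main__":
    print(describe_torch_backend())
```

If this reports "cpu only" on an AMD card, the usual suspects are a CPU-only wheel or a card outside the supported chip list for that ROCm version.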

turkishdelight@lemmy.ml · 8 months ago

llama.cpp (and ollama) has AMD support through ROCm and also now Vulkan.

moreeni@lemm.ee · 10 months ago

Local LLMs usually run using CPU only

rodbiren@midwest.social · 10 months ago

You absolutely can use GPU. Try the auto installer and select AMD when the option shows up. The documentation should have more info.

https://github.com/oobabooga/text-generation-webui

moreeni@lemm.ee · 10 months ago

Interesting. Thanks for sharing!

Gunpachi@lemmings.world · 10 months ago

I want to use software like Stable Diffusion and oobabooga. Don’t they utilize the GPU?

moreeni@lemm.ee · 10 months ago

Oobabooga is just a web UI for local text-gen LLMs, which usually only utilise the CPU.

SD, on the other hand, does utilise the GPU. I haven’t tried it, but I don’t see why it wouldn’t run with the open source AMD drivers; they are good. Just try it yourself: everything is free, and in the worst case you would only lose some time.

sapetoku@sh.itjust.works · 10 months ago

All the ones I’ve used so far are able to use the GPU, but it has to be enabled in the app settings. I mostly use LM Studio and it flies on my NVIDIA 3060. It doesn’t seem to have options for AMD GPUs though, unless I’m mistaken.