Question regarding ROCm and removable GPUs

Hi everyone,

I have a Framework 16 laptop with the dGPU extension, which means it has both a Radeon 780M iGPU and a Radeon RX 7700S dGPU.

I noticed that after I detach the RX 7700S from the host for use in a VM (vfio-pci), ROCm breaks entirely: ROCm applications no longer detect any device, not even the iGPU. Re-attaching the 7700S does not resolve the problem.

> $ rocminfo 
> ROCk module is loaded
> Unable to open /dev/kfd read-write: Invalid argument
> waltercool is member of render group
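
To narrow down a failure like the one above, a small check along these lines might help show whether the KFD device node and its topology entries are still visible after the dGPU is detached. This is only a sketch: the topology path under /sys/class/kfd is an assumption about a typical amdgpu/KFD setup, and the names KFD_DEV, KFD_TOPOLOGY and kfd_report are made up for illustration.

```shell
#!/bin/sh
# Sketch: report whether the KFD device node and topology nodes are visible.
# Both paths are assumptions and can be overridden from the environment.
KFD_DEV="${KFD_DEV:-/dev/kfd}"
KFD_TOPOLOGY="${KFD_TOPOLOGY:-/sys/class/kfd/kfd/topology/nodes}"

kfd_report() {
    # Check that the device node exists at all.
    if [ -e "$KFD_DEV" ]; then
        echo "device node present: $KFD_DEV"
    else
        echo "device node missing: $KFD_DEV"
    fi

    # Count the topology node directories that KFD currently exposes.
    count=0
    for node in "$KFD_TOPOLOGY"/*; do
        [ -d "$node" ] || continue
        count=$((count + 1))
    done
    echo "topology nodes: $count"
}

kfd_report
```

Running it once before detaching the dGPU and once after re-attaching it would show whether the node count changes between the two states.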

With LMStudio, for example, both the ROCm and OpenCL backends work fine by default. After I detach the dGPU (and later re-attach it to the host), only OpenCL works.

Applications like ollama or ComfyUI fail after the dGPU is detached, and using "amdgpu_gpu_recover" does not resolve the issue.

Any ideas on how to recover KFD/ROCm functionality after I detach my GPU?

Q: How do you detach your GPU?
A: /sys/bus/pci/devices/${GPU_VIDEO/GPU_AUDIO}/driver/unbind or /sys/module/amdgpu/drivers/pci\:amdgpu/unbind

Q: How do you reattach your GPU?
A: Remove the device (/sys/bus/pci/devices/${GPU_VIDEO/GPU_AUDIO}/remove), then do a PCI rescan.
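
For reference, the detach and reattach steps above can be sketched as shell functions. This is a rough outline, not the exact commands I run: the function names and the SYSFS_ROOT override are made up for illustration, the PCI address in the example is a placeholder, and using /sys/bus/pci/rescan for the rescan step is an assumption about the standard sysfs interface.

```shell
#!/bin/sh
# Sketch of the unbind / remove / rescan sequence.
# SYSFS_ROOT is overridable so the functions can be tried against a scratch
# directory instead of the real sysfs.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Unbind a PCI function from its current driver by writing its address
# to the driver's unbind file.
gpu_unbind() {
    dev="$1"
    echo "$dev" > "$SYSFS_ROOT/bus/pci/devices/$dev/driver/unbind"
}

# Remove the PCI function from the bus.
gpu_remove() {
    dev="$1"
    echo 1 > "$SYSFS_ROOT/bus/pci/devices/$dev/remove"
}

# Ask the kernel to rescan the PCI bus so removed devices are rediscovered.
pci_rescan() {
    echo 1 > "$SYSFS_ROOT/bus/pci/rescan"
}

# Example (requires root; the address is a placeholder):
#   gpu_unbind 0000:03:00.0
#   gpu_remove 0000:03:00.0
#   pci_rescan
```

The same functions would be called once for the video function and once for the audio function of the GPU.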

Kind regards.

--
WalterCool

Sent with Proton Mail secure email.



