[PATCH] drm/radeon: deprecate and remove KFD interface

On Wed, Nov 29, 2017 at 4:54 PM, Michel Dänzer <michel at daenzer.net> wrote:
> On 2017-11-29 03:40 PM, Oded Gabbay wrote:
>> On Wed, Nov 29, 2017 at 2:31 PM, Oded Gabbay <oded.gabbay at gmail.com> wrote:
>>> On Wed, Nov 29, 2017 at 1:16 PM, Michel Dänzer <michel at daenzer.net> wrote:
>>>> On 2017-11-01 09:31 AM, Oded Gabbay wrote:
>>>>> ok, taken to -next.
>>>>
>>>> This change broke the radeon driver on my Kaveri laptop. The gdm login
>>>> screen works, but logging into the GNOME on Xorg session quickly results
>>>> in a GPU hang and associated badness, see the attached dmesg.
>>>>
>>>> Reverting this change on top of drm-next makes it work again.
>>>>
>>>> On a hunch, I've tried reverting commits 62a7b7fbd08e ("drm/radeon:
>>>> reduce number of free VMIDs and pipes in KV") and 28b57b856b63
>>>> ("drm/radeon/cik: Don't touch int of pipes 1-7"), but no luck.
>>>>
>>>> Any ideas for what else is missing?
>>>>
>>>> Note that the amdkfd driver isn't actually active anyway, because I'm
>>>> disabling the IOMMU. Is it possible that it's still doing or triggering
>>>> some needed HW setup before it bails in that case?
>>>>
>>>>
>>>> P.S. Assuming we can fix this without reverting, maybe we could also
>>>> remove rdev->grbm_idx_mutex again?
>>>>
>>>> --
>>>> Earthling Michel Dänzer               |               http://www.amd.com
>>>> Libre software enthusiast             |             Mesa and X developer
>>>
>>> Hi Michel,
>>> Even without IOMMU, amdkfd will initialize the module and internal
>>> structures per device, up to the point where it tries to register a
>>> callback with the iommu driver.
>>> If IOMMU is disabled, it fails at that point with the following error
>>> message (in dmesg): "error getting iommu info. is the iommu enabled?"
>>>
>>> Having said that, it doesn't initialize anything in the device H/W
>>> itself, so I find this very weird.
>>>
>>> I looked at the patch itself again and I don't see anything suspicious.
>>>
>>> I'll try to resurrect my Kaveri machine to check this, but it will
>>> take some time.
>>>
>>> Oded
>>
>> Any chance that the increase of VMIDs from 8 to 16 somehow (although I
>> don't know how) caused this problem?
>> The desktop GUI also didn't work for me, but when I changed the VMID
>> number back to 8 (in cik.c) the GUI worked again.
>>
>> Michel, could you try this as well?
>
> Yeah, that also occurred to me in the meantime, and I can confirm your
> findings.
>
> My guess right now is that it's related to cik_pcie_init_compute_vmid.
>
>
> --
> Earthling Michel Dänzer               |               http://www.amd.com
> Libre software enthusiast             |             Mesa and X developer

Yeah, that seems reasonable.
That initialization was part of kfd but then we moved it to radeon.
Now it collides with radeon's initialization.
I removed it completely and returned the number of VMIDs to 16 and the
GUI is working.
I'll send a patch.
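
To illustrate the kind of collision I mean, here is a toy model in plain C. It is not kernel code: the struct, the function names, the assumption that the compute init covers VMIDs 8..15, and the assumption that it runs after the graphics init are all made up for the sketch. It only shows why a leftover compute-VMID init is harmless when radeon owns 8 VMIDs and destructive when it owns all 16.

```c
/* Toy model only; none of these names exist in the radeon driver. */
#define TOTAL_VMIDS        16
#define FIRST_COMPUTE_VMID 8   /* assumption: compute init covers 8..15 */

enum vmid_owner { OWNER_NONE = 0, OWNER_GFX, OWNER_COMPUTE };

struct toy_gpu {
    enum vmid_owner owner[TOTAL_VMIDS]; /* who last programmed each VMID */
};

/* Graphics-side setup: claims VMIDs 0..num_gfx_vmids-1. */
static void toy_gfx_init(struct toy_gpu *gpu, int num_gfx_vmids)
{
    for (int i = 0; i < num_gfx_vmids && i < TOTAL_VMIDS; i++)
        gpu->owner[i] = OWNER_GFX;
}

/* Leftover compute-side setup: unconditionally reprograms the upper range. */
static void toy_compute_vmid_init(struct toy_gpu *gpu)
{
    for (int i = FIRST_COMPUTE_VMID; i < TOTAL_VMIDS; i++)
        gpu->owner[i] = OWNER_COMPUTE;
}

/* Counts graphics VMIDs whose setup was overwritten by someone else. */
static int toy_count_clobbered(const struct toy_gpu *gpu, int num_gfx_vmids)
{
    int clobbered = 0;

    for (int i = 0; i < num_gfx_vmids && i < TOTAL_VMIDS; i++)
        if (gpu->owner[i] != OWNER_GFX)
            clobbered++;
    return clobbered;
}
```

With 8 graphics VMIDs the two ranges are disjoint and nothing is clobbered. With 16, every VMID in the upper half is overwritten unless the compute init is dropped.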

Oded


[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux