On Wed, Sep 16, 2020 at 3:04 AM Christoph Hellwig <hch@xxxxxx> wrote:
>
> On Tue, Sep 15, 2020 at 02:46:07PM -0400, Alex Deucher wrote:
> > This change breaks tons of systems.
>
> Did you do at least some basic root causing on why? Do GPUs get
> fed address they can't deal with? Any examples?
>
> Bug 1 doesn't seem to contain any analysis and was reported against
> a very old kernel that had all kind of fixes since.
>
> Bug 2 seems to imply a drm kthread is accessing some structure it
> shouldn't, which would imply a mismatch between pools used by radeon
> now and those actually provided by the core. Something that should
> be pretty to trivial to fix for someone understanding the whole ttm
> pool maze.
>
> Bug 3: same as 1, but an even older kernel.
>
> Bug 4: looks like 1 and 3, and actually verified to work properly
> in 5.9-rc. Did you try to get the other reporters test this as well?

It would appear that the change in 5.9 to disable AGP on radeon fixed
the issue. I'm following up on the other tickets to see if I can get
confirmation. On another thread[1], the user was able to avoid the
issue by disabling HIMEM. Looks like some issue with HIMEM and/or
AGP.

Alex

[1] https://lkml.org/lkml/2019/12/14/263