On Wed, Sep 16, 2020 at 06:16:25PM -0400, Alex Deucher wrote:
> On Wed, Sep 16, 2020 at 3:04 AM Christoph Hellwig <hch@xxxxxx> wrote:
> >
> > On Tue, Sep 15, 2020 at 02:46:07PM -0400, Alex Deucher wrote:
> > > This change breaks tons of systems.
> >
> > Did you do at least some basic root causing on why? Do GPUs get
> > fed address they can't deal with? Any examples?
> >
> > Bug 1 doesn't seem to contain any analysis and was reported against
> > a very old kernel that had all kind of fixes since.
> >
> > Bug 2 seems to imply a drm kthread is accessing some structure it
> > shouldn't, which would imply a mismatch between pools used by radeon
> > now and those actually provided by the core. Something that should
> > be pretty to trivial to fix for someone understanding the whole ttm
> > pool maze.
> >
> > Bug 3: same as 1, but an even older kernel.
> >
> > Bug 4: looks like 1 and 3, and actually verified to work properly
> > in 5.9-rc. Did you try to get the other reporters test this as well?
>
> It would appear that the change in 5.9 to disable AGP on radeon fixed
> the issue. I'm following up on the other tickets to see if I can get
> confirmation. On another thread[1], the user was able to avoid the
> issue by disabling HIMEM. Looks like some issue with HIMEM and/or
> AGP.

Thanks. I'll try to spend some time to figure out what could be
highmem related. I'd much rather get this fixed properly.