On Tue, May 15, 2018 at 9:39 PM, Thomas Martitz <kugel@xxxxxxxxxxx> wrote:
> On 15.05.2018 at 17:10, Rafael J. Wysocki wrote:
>
>> Before applying anything like this I need to understand the failure in
>> the first place.
>>
>> Since direct_complete is optional anyway and it should be safe to call
>> pm_runtime_suspended() at any time, I'm suspecting a bug in a driver
>> that is exposed by the commit turned up by bisection.
>>
>> Where's the code I need to look at?
>>
>
> Hello,
>
> thanks for looking into this. To answer the question I need your expertise.
> I couldn't find out which device causes it. Since the freeze only happens if
> amdgpu is loaded, I naturally thought the problematic device(s) would be
> bound to amdgpu. So I assumed the amdgpu-bound devices would be affected by
> your change, i.e. not have direct_complete set anymore. But I could not
> confirm this during testing (I basically restricted my patch to
> devices which pass !strcmp(dev->driver->name, "amdgpu"), and the system
> still froze upon resume).
>
> Then I printed a list of the devices which pass "dev->direct_complete &&
> !pm_runtime_suspended(dev)" (with this patch applied), and the list was
> *very* long (maybe 100+). I can do this again if you find the list
> useful.
>
> The sheer number of devices which pass "dev->direct_complete &&
> !pm_runtime_suspended(dev)" made me think that your change should be
> reconsidered, hence my patch.
>
> So, please give me advice on how I can point you to the code that you'd like
> to look at. I assume the key information is which device(s) pass
> "dev->direct_complete && !pm_runtime_suspended(dev)" AND cause the freeze on
> !dev->direct_complete? I'm afraid that will take a very long time to find
> out.

Let's continue this in the Bugzilla entry at
https://bugzilla.kernel.org/show_bug.cgi?id=199693

Thanks,
Rafael