On Thu, Nov 29, 2018 at 11:53:53AM +0100, Karol Herbst wrote:
> On Thu, Nov 29, 2018 at 2:29 AM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
> >
> > On Thu, Nov 29, 2018 at 12:21:31AM +0100, Karol Herbst wrote:
> > > this was already debugged and there is no point in searching inside
> > > the Firmware. It's not a firmware bug or anything.
> > >
> > > The proper fix is to do something inside Nouveau so that we don't
> > > upset the device and being able to runtime resume it again.
> > >
> > > The initial thing we do inside Nouveau to cause those issues is to run
> > > that so called "DEVINIT" script inside the vbios to initialize the
> > > GPU, problem is, it changes something on the PCIe configuration so
> > > that the GPU isn't able to runtime resume anymore. I am in contact
> > > with Nvidia about that issue and hopefully we get the proper answers.
> > > When I was digging into that myself I was able to make the situation
> > > more stable by setting the PCIE link speed to the boot defaults, but
> > > that was still pretty unstable.
> > >
> > > Anyway, because the binary driver fails here as well (through
> > > bumblebee and so on) there isn't much of reverse engineering we can do
> > > besides guessing and trying it on literally every hardware until it
> > > works.
> > >
> > > We also have an upstream bug for this issue:
> > > https://bugzilla.kernel.org/show_bug.cgi?id=156341
> >
> > If you like I can probably dump the pcie registers on card
> > and/or the pcie port under windows. The card works there :)
> > Let me know.
> >
> > --
> > MST
>
> the problem is, we would need to know the registers right before
> suspending the GPU. If someone would be able to trace all PCIe
> register read and writes for the entire suspending/resume process,
> that would be very helpful.

Well I can pass the card to a VM, and trace it on the hypervisor,
that isn't a problem.
A tricky thing is the ACPI tables: I would need to somehow know which ones
are relevant in order to pass them to the guest ... ideas on that?

--
MST
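
For comparing register dumps taken in different environments, a minimal
sketch of decoding the PCIe link fields from a raw config-space dump could
look like the following. It is only an illustration, not something taken
from Nouveau or from any existing tracing tool: the function names
find_capability and link_state are made up here, and it assumes the dump
covers at least the first 256 bytes and that the device exposes a PCI
Express capability with a Link Control 2 register. The offsets are the
standard PCI capability-list and PCI Express capability layout.

```python
import struct
from typing import Optional

PCI_STATUS = 0x06            # 16-bit Status register
PCI_STATUS_CAP_LIST = 0x10   # Status bit 4: capability list is present
PCI_CAP_PTR = 0x34           # offset of the first capability
PCI_CAP_ID_EXP = 0x10        # capability ID of the PCI Express capability
EXP_LNKSTA = 0x12            # Link Status, relative to the capability
EXP_LNKCTL2 = 0x30           # Link Control 2, relative to the capability


def find_capability(cfg: bytes, cap_id: int) -> Optional[int]:
    """Walk the capability list and return the offset of cap_id, if any."""
    if len(cfg) < 0x40:
        return None
    (status,) = struct.unpack_from("<H", cfg, PCI_STATUS)
    if not status & PCI_STATUS_CAP_LIST:
        return None
    ptr = cfg[PCI_CAP_PTR] & 0xFC
    seen = set()  # guards against a looping list in a corrupt dump
    while ptr >= 0x40 and ptr not in seen and ptr + 2 <= len(cfg):
        seen.add(ptr)
        if cfg[ptr] == cap_id:
            return ptr
        ptr = cfg[ptr + 1] & 0xFC
    return None


def link_state(cfg: bytes) -> Optional[dict]:
    """Return raw link speed/width codes, or None if they are not in the dump."""
    cap = find_capability(cfg, PCI_CAP_ID_EXP)
    if cap is None or cap + EXP_LNKCTL2 + 2 > len(cfg):
        return None
    (lnksta,) = struct.unpack_from("<H", cfg, cap + EXP_LNKSTA)
    (lnkctl2,) = struct.unpack_from("<H", cfg, cap + EXP_LNKCTL2)
    return {
        # speed codes: 1 = 2.5 GT/s, 2 = 5 GT/s, 3 = 8 GT/s
        "current_speed": lnksta & 0xF,
        "width": (lnksta >> 4) & 0x3F,
        "target_speed": lnkctl2 & 0xF,
    }
```

On Linux the input bytes could come from the device's sysfs config file
(/sys/bus/pci/devices/<address>/config, read as root so that more than the
first 64 bytes are returned). Running link_state on a dump taken at boot and
on one taken right before suspend would show whether the link speed fields
differ between the two.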