Re: [PATCH] pci/quirks: Add quirk to reset nvgpu at boot for the Lenovo ThinkPad P50

Lyude Paul <lyude@xxxxxxxxxx> · Tue, 19 Mar 2019 16:56:03 -0400

Bump, sorry to bug you but is there any update on this or any information you
still need from me which would help get this upstream?

On Wed, 2019-03-13 at 18:25 -0400, Lyude Paul wrote:
> [note to David Ober: you -should- be able to reply to this, hopefully, but I
> haven't actually tested that so results may vary]
> 
> Hi again! Sorry I didn't fully answer all of the questions you originally
> asked in this email, I had to get in contact with Lenovo to make sure that
> it
> was OK for me to disclose more details on this bug (and I had PTO scheduled
> immediately after I asked). I've added David Ober from Lenovo to this thread
> as well. So now that I've got Lenovo's approval I can answer those
> questions,
> and give some better answers for the others! (see below)
> 
> 
> On Fri, 2019-02-15 at 16:17 -0500, Lyude Paul wrote:
> > On Thu, 2019-02-14 at 18:43 -0600, Bjorn Helgaas wrote:
> > > Hi Lyude,
> > > 
> > > On Tue, Feb 12, 2019 at 05:02:30PM -0500, Lyude Paul wrote:
> > > > On a very specific subset of ThinkPad P50 SKUs, particularly ones that
> > > > come with a Quadro M1000M chip instead of the M2000M variant, the BIOS
> > > > seems to have a very nasty habit of not always resetting the secondary
> > > > Nvidia GPU between full reboots if the laptop is configured in Hybrid
> > > > Graphics mode. The reason for this happening is unknown, but the
> > > > following steps and possibly a good bit of patience will reproduce the
> > > > issue:
> > > > 
> > > > 1. Boot up the laptop normally in Hybrid graphics mode
> > > > 2. Make sure nouveau is loaded and that the GPU is awake
> > > > 2. Allow the nvidia GPU to runtime suspend itself after being idle
> > > > 3. Reboot the machine, the more sudden the better (e.g sysrq-b may
> > > > help)
> > > > 4. If nouveau loads up properly, reboot the machine again and go back
> > > > to
> > > > step 2 until you reproduce the issue
> > > > 
> > > > This results in some very strange behavior: the GPU will
> > > > quite literally be left in exactly the same state it was in when the
> > > > previously booted kernel started the reboot. This has all sorts of bad
> > > > sideaffects: for starters, this completely breaks nouveau starting
> > > > with
> > > > a
> > > > mysterious EVO channel failure that happens well before we've actually
> > > > used the EVO channel for anything:
> > > 
> > > There are a lot of moving parts here that are probably obvious to you
> > > but not to me.  I need help untangling this a bit so I'm comfortable
> > > that we got to the root cause and that we're doing something logical
> > > as opposed to something that just happens to make things work.  I
> > > really don't know enough to even ask the right questions...
> > 
> > I completely understand! I'm pretty sure I'd be just as skeptical if I was
> > in
> > your position reviewing a patch like this :P
> > 
> > > Is there a bug report for this?  Bugzilla.kernel.org would be ideal,
> > > including "lspci -vvxxx" and dmidecode for the system.
> > > 
> > Not yet, but there has been discussion about this between nouveau
> > developers
> > on our IRC channel.
> I lied: yes there actually is a bug report for this, but it's currently on
> the
> Red Hat bugzilla. I can get more information from it if you need (with
> lenovo's approval of course).
> 
> > > Is this running a current BIOS?  The date in your log below looks
> > > pretty recent, so I assume it is current.
> > 
> > Yes, this is the most up to date BIOS available for this system.
> 
> And additionally: I've been working with Lenovo on this issue for a couple
> of
> months now, and we've gone through dozens of different trial BIOSes with no
> success thus far. However, Lenovo is currently working on trying to add this
> workaround into their BIOS but I've been told that this change is going to
> take a decent amount of time since they need to test it across multiple
> operating systems. I'd be happy to come back and add a conditional later to
> turn this workaround off for later BIOS versions once Lenovo has released a
> proper fix.
> 
> With all of that being said, [how] do you think we should proceed?
> 
> > > I assume "hybrid graphics" means you have two GPUs.  Do you select
> > > hybrid graphics mode in the BIOS?
> > 
> > Yes, the P50 has two available modes in the BIOS: Dedicated (e.g. only
> > the nvidia GPU is used for everything), and Hybrid (i915 drives the
> > built-in display panel, nouveau drives everything else). This bug only
> > seems to occur in Hybrid mode.
> > 
> > > I assume when you say the Nvidia GPU doesn't get reset on a full
> > > reboot, you're talking about a "warm reboot", and that if you actually
> > > remove the power and do a cold reboot, there's no problem?
> > 
> > If you meant "unplugging the power adapter" when you said cutting the
> > power we don't need to go that far, but shutting down the machine and
> > restarting it by hand does avoid the problem yes.
> > 
> > > I assume Nvidia GPU being active means you are using the performance
> > > GPU.  Does that mean the integrated GPU is completely unused and Linux
> > > does nothing at all with it?  Is Linux doing any switching between
> > > them?  If so, how?  I am not 100% confident in the code I've seen that
> > > does the switching.
> > 
> > "Switching" isn't really the right word to describe it these days as the
> > process for how this is handled has changed quite a bit in the last few
> > years. As I mentioned above, the main intel GPU is used for driving the
> > built-in display. As for the nvidia GPU, it's used to drive any kind of
> > external displays connected to the P50 and can also be used to do
> > rendering through DRI PRIME. The connector hookups are mostly hardcoded
> > as there isn't really much of a mux, unlike the old days where we could
> > actually do things like switch which GPU was driving the internal
> > display on the fly with  vga-switcheroo using the _DSM methods provided
> > by ACPI.
> > 
> > Nowadays muxless laptops like the ThinkPad P50 don't make much use of
> > _DSM for power management and instead just runtime suspend the dedicated
> > nvidia GPU when it's not in use. We rely on the PCI subsystem to invoke
> > the _PR3 ACPI methods for suspending and resuming the GPU and don't
> > handle much of it ourselves anymore:
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/nouveau/nouveau_drm.c?h=v5.0-rc6#n860
> > 
> > (You can ignore the call to nouveau_switcheroo_optimus_dsm() there; it
> >  doesn't do anything on systems like the P50 where we detect _PR3)
> > 
> > Additionally-if you see any problems with how we're handling things
> > there we're all ears! We've been trying to fix as many issues with
> > runtime PM as we can find.
> > 
> > Something else to consider here too: I've only ever managed to reproduce
> > this issue on this specific P50 SKU. Since I work on Desktop engineering
> > at Red Hat we have access to a very large number of laptops with hybrid
> > GPU setups. Interestingly I haven't ever reproduced anything like this
> > on any of them, even P50 SKUs with the M2000M GPU instead of the M1000M
> > (which are almost identical, they're both GM107 chips) I can follow the
> > same steps that I followed to reproduce this bug and it never appears.
> > 
> > Another thing to note: this bug also doesn't appear to happen 100% of
> > the time. Some warm reboots the GPU will be reset by the BIOS perfectly
> > fine.
> > 
> > > > nouveau 0000:01:00.0: disp: chid 0 mthd 0000 data 00000400 00001000
> > > > 00000002
> > > > 
> > > > Later on, this causes us to timeout trying to bring up the GR ctx:
> > > > 
> > > > ------------[ cut here ]------------
> > > > nouveau 0000:01:00.0: timeout
> > > > WARNING: CPU: 0 PID: 12 at
> > > > drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.c:1547
> > > > gf100_grctx_generate+0x7b2/0x850 [nouveau]
> > > > Modules linked in: nouveau mxm_wmi i915 crc32c_intel ttm i2c_algo_bit
> > > > serio_raw drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops
> > > > xhci_pci drm xhci_hcd i2c_core wmi video
> > > > CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.0.0-rc5Lyude-Test+ #29
> > > > Hardware name: LENOVO 20EQS64N0B/20EQS64N0B, BIOS N1EET82W (1.55 )
> > > > 12/18/2018
> > > > Workqueue: events_long drm_dp_mst_link_probe_work [drm_kms_helper]
> > > > RIP: 0010:gf100_grctx_generate+0x7b2/0x850 [nouveau]
> > > > Code: 85 d2 75 04 48 8b 57 10 48 89 95 28 ff ff ff e8 b4 37 0e e1 48
> > > > 8b
> > > > 95 28 ff ff ff 48 c7 c7 b1 97 57 a0 48 89 c6 e8 5a 38 c0 e0 <0f> 0b e9
> > > > b9 fd ff ff 48 8b 85 60 ff ff ff 48 8b 40 10 48 8b 78 10
> > > > RSP: 0018:ffffc900000b77f0 EFLAGS: 00010286
> > > > RAX: 0000000000000000 RBX: ffff888871af8000 RCX: 0000000000000000
> > > > RDX: ffff88887f41dfe0 RSI: ffff88887f415698 RDI: ffff88887f415698
> > > > RBP: ffffc900000b78c8 R08: 0000000000000000 R09: 0000000000000000
> > > > R10: 0000000000000000 R11: 0000000000000000 R12: ffff888872118000
> > > > R13: 0000000000000000 R14: ffffffffa0551420 R15: ffffc900000b7818
> > > > FS:  0000000000000000(0000) GS:ffff88887f400000(0000)
> > > > knlGS:0000000000000000
> > > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > CR2: 00005644d0556ca8 CR3: 0000000002214006 CR4: 00000000003606f0
> > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > > Call Trace:
> > > >  gf100_gr_init_ctxctl+0x27b/0x2d0 [nouveau]
> > > >  gf100_gr_init+0x5bd/0x5e0 [nouveau]
> > > >  gf100_gr_init_+0x61/0x70 [nouveau]
> > > >  nvkm_gr_init+0x1d/0x20 [nouveau]
> > > >  nvkm_engine_init+0xcb/0x210 [nouveau]
> > > >  nvkm_subdev_init+0xd6/0x230 [nouveau]
> > > >  nvkm_engine_ref.part.0+0x52/0x70 [nouveau]
> > > >  nvkm_engine_ref+0x13/0x20 [nouveau]
> > > >  nvkm_ioctl_new+0x12c/0x260 [nouveau]
> > > >  ? nvkm_fifo_chan_child_del+0xa0/0xa0 [nouveau]
> > > >  ? gf100_gr_dtor+0xe0/0xe0 [nouveau]
> > > >  nvkm_ioctl+0xe2/0x180 [nouveau]
> > > >  nvkm_client_ioctl+0x12/0x20 [nouveau]
> > > >  nvif_object_ioctl+0x47/0x50 [nouveau]
> > > >  nvif_object_init+0xc8/0x120 [nouveau]
> > > >  nvc0_fbcon_accel_init+0x5c/0x960 [nouveau]
> > > >  nouveau_fbcon_create+0x5a5/0x5d0 [nouveau]
> > > >  ? drm_setup_crtcs+0x27b/0xcb0 [drm_kms_helper]
> > > >  ? __lock_is_held+0x5e/0xa0
> > > >  __drm_fb_helper_initial_config_and_unlock+0x27c/0x520
> > > > [drm_kms_helper]
> > > >  drm_fb_helper_hotplug_event.part.29+0xae/0xc0 [drm_kms_helper]
> > > >  drm_fb_helper_hotplug_event+0x1c/0x30 [drm_kms_helper]
> > > >  nouveau_fbcon_output_poll_changed+0xb8/0x110 [nouveau]
> > > >  drm_kms_helper_hotplug_event+0x2a/0x40 [drm_kms_helper]
> > > >  drm_dp_send_link_address+0x176/0x1c0 [drm_kms_helper]
> > > >  drm_dp_check_and_send_link_address+0xa0/0xb0 [drm_kms_helper]
> > > >  drm_dp_mst_link_probe_work+0xa4/0xc0 [drm_kms_helper]
> > > >  process_one_work+0x22f/0x5c0
> > > >  worker_thread+0x44/0x3a0
> > > >  kthread+0x12b/0x150
> > > >  ? wq_pool_ids_show+0x140/0x140
> > > >  ? kthread_create_on_node+0x60/0x60
> > > >  ret_from_fork+0x3a/0x50
> > > > irq event stamp: 22490
> > > > hardirqs last  enabled at (22489): [<ffffffff8113281d>]
> > > > console_unlock+0x44d/0x5f0
> > > > hardirqs last disabled at (22490): [<ffffffff81001c03>]
> > > > trace_hardirqs_off_thunk+0x1a/0x1c
> > > > softirqs last  enabled at (22486): [<ffffffff81c00330>]
> > > > __do_softirq+0x330/0x44d
> > > > softirqs last disabled at (22479): [<ffffffff810c3105>]
> > > > irq_exit+0xe5/0xf0
> > > > WARNING: CPU: 0 PID: 12 at
> > > > drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.c:1547
> > > > gf100_grctx_generate+0x7b2/0x850 [nouveau]
> > > > ---[ end trace bf0976ed88b122a8 ]---
> > > > nouveau 0000:01:00.0: gr: wait for idle timeout (en: 1, ctxsw: 0,
> > > > busy:
> > > > 1)
> > > > nouveau 0000:01:00.0: gr: wait for idle timeout (en: 1, ctxsw: 0,
> > > > busy:
> > > > 1)
> > > > nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000008000
> > > > engine
> > > > 00 [GR] client 15 [HUB/SCC_NB] reason c4 [] on channel -1 [0000000000
> > > > unknown]
> > > > 
> > > > From which the GPU never manages to recover. Booting without nouveau
> > > > loading causes issues as well, since the GPU starts sending spurious
> > > > interrupts that cause other device's IRQs to get disabled by the
> > > > kernel:
> > > > 
> > > > irq 16: nobody cared (try booting with the "irqpoll" option)
> > > > …
> > > > handlers:
> > > > [<000000007faa9e99>] i801_isr [i2c_i801]
> > > > Disabling IRQ #16
> > > > …
> > > > serio: RMI4 PS/2 pass-through port at rmi4-00.fn03
> > > > i801_smbus 0000:00:1f.4: Timeout waiting for interrupt!
> > > > i801_smbus 0000:00:1f.4: Transaction timeout
> > > > rmi4_f03 rmi4-00.fn03: rmi_f03_pt_write: Failed to write to F03 TX
> > > > register (-110).
> > > > i801_smbus 0000:00:1f.4: Timeout waiting for interrupt!
> > > > i801_smbus 0000:00:1f.4: Transaction timeout
> > > > rmi4_physical rmi4-00: rmi_driver_set_irq_bits: Failed to change
> > > > enabled
> > > > interrupts!
> > > > 
> > > > Which in turn causes the touchpad and sometimes even other things to
> > > > get
> > > > disabled.
> > > > 
> > > > Since the GPU staying on causes problems even without nouveau's
> > > > intervention, we can't fix this problem from nouveau itself. We have
> > > > to
> > > > fix it as early as possible in the boot sequence in order to make sure
> > > > that the GPU is in a clean state before it has a chance to spam us
> > > > with
> > > > interrupts and break things.
> > > 
> > > Was nouveau loaded *before* the reboot?  Or can you reproduce the
> > > spurious interrupts even if you do a cold reboot without nouveau, then
> > > a warm reboot again without nouveau?
> > 
> > Nouveau was loaded before the reboot, yes, and as far as I'm aware this
> > bug won't show up following the steps you described here. We wouldn't
> > really notice the interrupts if this bug happened without nouveau loaded
> > (since the BIOS doesn't arm them). However, I can confirm it's being
> > reset each on each warm reboot when nouveau's completely disabled by
> > sticking values into some MMIO registers we know the VBIOS doesn't touch
> > followed by rebooting in order to see if they retain their value (this
> > is how I discovered that the GPU wasn't being power cycled in the first
> > place, if the values I stick into said registers gets cleared after
> > reboot then the GPU was reset, otherwise it wasn't).
> > 
> > 
> > > > So to do this, we add a new pci quirk using
> > > > DECLARE_PCI_FIXUP_CLASS_FINAL that will be invoked before the PCI
> > > > probe
> > > > at boot finishes. From there, we check to make sure that this is
> > > > indeed
> > > > the specific P50 variant of this GPU. We also make sure that the GPU
> > > > PCI
> > > > device is advertising NoReset- in order to prevent us from trying to
> > > > reset the GPU when the machine is in Dedicated graphics mode (where
> > > > the
> > > > GPU being initialized by the BIOS is normal and expected). Finally, we
> > > > try mapping the MMIO space for the GPU which should only work if the
> > > > GPU
> > > > is actually active in D0 mode. We can then read the magic 0x2240c
> > > > register on the GPU, which will have bit 1 set if the GPU's firmware
> > > > has
> > > > already been posted during a previous boot. Once we've confirmed all
> > > > of
> > > > this, we reset the PCI device and re-disable it - bringing the GPU
> > > > back
> > > > into a healthy state.
> > > > 
> > > > Signed-off-by: Lyude Paul <lyude@xxxxxxxxxx>
> > > > Cc: nouveau@xxxxxxxxxxxxxxxxxxxxx
> > > > Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx
> > > > Cc: Karol Herbst <kherbst@xxxxxxxxxx>
> > > > Cc: Ben Skeggs <skeggsb@xxxxxxxxx>
> > > > Cc: stable@xxxxxxxxxxxxxxx
> > > > ---
> > > >  drivers/pci/quirks.c | 65
> > > > ++++++++++++++++++++++++++++++++++++++++++++
> > > >  1 file changed, 65 insertions(+)
> > > > 
> > > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > > > index b0a413f3f7ca..948492fda8bf 100644
> > > > --- a/drivers/pci/quirks.c
> > > > +++ b/drivers/pci/quirks.c
> > > > @@ -5117,3 +5117,68 @@ SWITCHTEC_QUIRK(0x8573);  /* PFXI 48XG3 */
> > > >  SWITCHTEC_QUIRK(0x8574);  /* PFXI 64XG3 */
> > > >  SWITCHTEC_QUIRK(0x8575);  /* PFXI 80XG3 */
> > > >  SWITCHTEC_QUIRK(0x8576);  /* PFXI 96XG3 */
> > > > +
> > > > +/*
> > > > + * On certain Lenovo Thinkpad P50 SKUs, specifically those with a
> > > > Nvidia
> > > > + * Quadro M1000M, the BIOS will occasionally make the mistake of not
> > > > resetting
> > > > + * the nvidia GPU between reboots if the system is configured to use
> > > > hybrid
> > > > + * graphics mode. This results in the GPU being left in whatever
> > > > state
> > > > it
> > > > was
> > > > + * in during the previous boot which causes spurious interrupts from
> > > > the
> > > > GPU,
> > > > + * which in turn cause us to disable the wrong IRQs and end up
> > > > breaking
> > > > the
> > > > + * touchpad. Unsurprisingly, this also completely breaks nouveau.
> > > > + *
> > > > + * Luckily, it seems a simple reset of the PCI device for the nvidia
> > > > GPU
> > > > + * manages to bring the GPU back into a clean state and fix all of
> > > > these
> > > > + * issues. Additionally since the GPU will report NoReset+ when the
> > > > machine is
> > > > + * configured in Dedicated display mode, we don't need to worry about
> > > > + * accidentally resetting the GPU when it's supposed to already be
> > > > + * initialized.
> > > > + */
> > > > +static void
> > > > +quirk_lenovo_thinkpad_p50_nvgpu_survives_reboot(struct pci_dev *pdev)
> > > > +{
> > > > +	void __iomem *map;
> > > > +	int ret;
> > > > +
> > > > +	if (pdev->subsystem_vendor != PCI_VENDOR_ID_LENOVO ||
> > > > +	    pdev->subsystem_device != 0x222e ||
> > > > +	    !pdev->reset_fn)
> > > > +		return;
> > > > +
> > > > +	/*
> > > > +	 * If we can't enable the device's mmio space, it's probably
> > > > not even
> > > > +	 * initialized. This is fine, and means we can just skip the
> > > > quirk
> > > > +	 * entirely.
> > > > +	 */
> > > > +	if (pci_enable_device_mem(pdev)) {
> > > > +		pci_dbg(pdev, "Can't enable device mem, no reset
> > > > needed\n");
> > > > +		return;
> > > > +	}
> > > > +
> > > > +	/* Taken from drivers/gpu/drm/nouveau/engine/device/base.c */
> > > > +	map = ioremap(pci_resource_start(pdev, 0), 0x102000);
> > > > +	if (!map) {
> > > > +		pci_err(pdev, "Can't map MMIO space, this is probably
> > > > very
> > > > bad\n");
> > > > +		goto out_disable;
> > > > +	}
> > > > +
> > > > +	/*
> > > > +	 * Be extra careful, and make sure that the GPU firmware is
> > > > posted
> > > > +	 * before trying a reset
> > > > +	 */
> > > > +	if (ioread32(map + 0x2240c) & 0x2) {
> > > > +		pci_info(pdev,
> > > > +			 FW_BUG "GPU left initialized by EFI,
> > > > resetting\n");
> > > > +		ret = pci_reset_function(pdev);
> > > > +		if (ret < 0)
> > > > +			pci_err(pdev, "Failed to reset GPU: %d\n",
> > > > ret);
> > > > +	}
> > > > +
> > > > +	iounmap(map);
> > > > +out_disable:
> > > > +	pci_disable_device(pdev);
> > > > +}
> > > > +
> > > > +DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, 0x13b1,
> > > > +			      PCI_CLASS_DISPLAY_VGA, 8,
> > > > +			      quirk_lenovo_thinkpad_p50_nvgpu_survives
> > > > _reboot)
> > > > ;
> > > > -- 
> > > > 2.20.1
> > > > 
-- 
Cheers,
	Lyude Paul