Patch "PCI/PM: Extend D3hot delay for NVIDIA HDA controllers" has been added to the 6.3-stable tree

This is a note to let you know that I've just added the patch titled

    PCI/PM: Extend D3hot delay for NVIDIA HDA controllers

to the 6.3-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     pci-pm-extend-d3hot-delay-for-nvidia-hda-controllers.patch
and it can be found in the queue-6.3 subdirectory.

If you, or anyone else, feel it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 1d85844e8c2e76234ca974b7feddce3b149cf5f6
Author: Alex Williamson <alex.williamson@xxxxxxxxxx>
Date:   Thu Apr 13 13:40:42 2023 -0600

    PCI/PM: Extend D3hot delay for NVIDIA HDA controllers
    
    [ Upstream commit a5a6dd2624698b6e3045c3a1450874d8c790d5d9 ]
    
    Assignment of NVIDIA Ampere-based GPUs has seen a regression since the
    below referenced commit, where the reduced D3hot transition delay appears
    to introduce a small window where a D3hot->D0 transition followed by a bus
    reset can wedge the device.  The entire device is subsequently unavailable,
    returning -1 on config space read and is unrecoverable without a host
    reset.
    
    This has been observed with RTX A2000 and A5000 GPU and audio functions
    assigned to a Windows VM, where shutdown of the VM places the devices in
    D3hot prior to vfio-pci performing a bus reset when userspace releases the
    devices.  The issue has roughly a 2-3% chance of occurring per shutdown.
    
    Restoring the HDA controller d3hot_delay to the effective value before the
    below commit has been shown to resolve the issue.  NVIDIA confirms this
    change should be safe for all of their HDA controllers.
    
    Fixes: 3e347969a577 ("PCI/PM: Reduce D3hot delay with usleep_range()")
    Link: https://lore.kernel.org/r/20230413194042.605768-1-alex.williamson@xxxxxxxxxx
    Reported-by: Zhiyi Guo <zhguo@xxxxxxxxxx>
    Signed-off-by: Alex Williamson <alex.williamson@xxxxxxxxxx>
    Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
    Reviewed-by: Tarun Gupta <targupta@xxxxxxxxxx>
    Cc: Abhishek Sahu <abhsahu@xxxxxxxxxx>
    Cc: Tarun Gupta <targupta@xxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 44cab813bf951..f4e2a88729fd1 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -1939,6 +1939,19 @@ static void quirk_radeon_pm(struct pci_dev *dev)
 }
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x6741, quirk_radeon_pm);
 
+/*
+ * NVIDIA Ampere-based HDA controllers can wedge the whole device if a bus
+ * reset is performed too soon after transition to D0, extend d3hot_delay
+ * to previous effective default for all NVIDIA HDA controllers.
+ */
+static void quirk_nvidia_hda_pm(struct pci_dev *dev)
+{
+	quirk_d3hot_delay(dev, 20);
+}
+DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
+			      PCI_CLASS_MULTIMEDIA_HD_AUDIO, 8,
+			      quirk_nvidia_hda_pm);
+
 /*
  * Ryzen5/7 XHCI controllers fail upon resume from runtime suspend or s2idle.
  * https://bugzilla.kernel.org/show_bug.cgi?id=205587
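
For readers unfamiliar with the PCI fixup machinery, the following is a rough userspace sketch of what the quirk above is expected to do. It is not kernel code. All sketch_* names and the struct are hypothetical stand-ins; the numeric constants are believed to mirror PCI_VENDOR_ID_NVIDIA, PCI_CLASS_MULTIMEDIA_HD_AUDIO and the 10 ms PCI_PM_D3HOT_WAIT default, and the "only ever raise the delay" behaviour is an assumption about how quirk_d3hot_delay() works. The class shift of 8 drops the programming-interface byte so that only base class and subclass are compared.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match the kernel's pci_ids.h / pci.h values. */
#define SKETCH_VENDOR_ID_NVIDIA 0x10de
#define SKETCH_CLASS_HD_AUDIO   0x0403 /* base class 0x04, subclass 0x03 */
#define SKETCH_D3HOT_WAIT_MS    10     /* assumed default d3hot_delay */

/* Hypothetical, stripped-down stand-in for struct pci_dev. */
struct sketch_pci_dev {
	uint16_t vendor;
	uint32_t class_code;      /* 24 bits: base class, subclass, prog-if */
	unsigned int d3hot_delay; /* milliseconds */
};

/*
 * Models the match done for a class fixup declared with a shift of 8:
 * vendor must be NVIDIA, device ID is a wildcard, and the class code
 * with the prog-if byte shifted away must equal the HD audio class.
 */
static bool sketch_fixup_matches(const struct sketch_pci_dev *dev)
{
	return dev->vendor == SKETCH_VENDOR_ID_NVIDIA &&
	       (dev->class_code >> 8) == SKETCH_CLASS_HD_AUDIO;
}

/* Assumed behaviour: raise the delay, never lower an existing larger one. */
static void sketch_d3hot_delay(struct sketch_pci_dev *dev, unsigned int delay)
{
	if (dev->d3hot_delay >= delay)
		return;
	dev->d3hot_delay = delay;
}

/* Counterpart of quirk_nvidia_hda_pm(): apply 20 ms to matching devices. */
static void sketch_nvidia_hda_pm(struct sketch_pci_dev *dev)
{
	if (sketch_fixup_matches(dev))
		sketch_d3hot_delay(dev, 20);
}
```

Under these assumptions, an NVIDIA function with class code 0x040300 starting at the 10 ms default ends up with a 20 ms delay, while the GPU function itself (class 0x030000) and HD audio controllers from other vendors are left untouched.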


