Re: [PATCH] drm/amd: Require CONFIG_HOTPLUG_PCI_PCIE for BOCO

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Dec 11, 2024 at 3:19 PM Mario Limonciello <superm1@xxxxxxxxxx> wrote:
>
> On 12/11/2024 14:08, Alex Deucher wrote:
> > On Wed, Dec 11, 2024 at 10:56 AM Mario Limonciello <superm1@xxxxxxxxxx> wrote:
> >>
> >> From: Mario Limonciello <mario.limonciello@xxxxxxx>
> >>
> >> If the kernel hasn't been compiled with PCIe hotplug support this
> >> can lead to problems with dGPUs that use BOCO because they effectively
> >> drop off the bus.
> >>
> >> To prevent issues, disable BOCO support when compiled without PCIe hotplug.
> >>
> >> Reported-by: Gabriel Marcano <gabemarcano@xxxxxxxxx>
> >> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1707#note_2696862
> >> Signed-off-by: Mario Limonciello <mario.limonciello@xxxxxxx>
> >
> > Acked-by: Alex Deucher <alexander.deucher@xxxxxxx>
>
> Thx.
>
> >
> > Seems like this should affect any device which supports d3cold.  Maybe
> > we want something more general as well?
>
> Any specific ideas?  One of these two hunks I think make sense, leaning
> upon the second one more strongly.

Actually, I wonder if the affected hardware pre-dates d3cold and uses
the old proprietary AMD ATPX interface to control dGPU power.  In that
case, the d3cold is managed by the driver rather than the PCI/ACPI
subsystems.  IIRC, there was a workaround in the PCIe hotplug code to
avoid calling the pci remove function when the driver powered down the
GPU via ATPX (or the nvidia equivalent).  If so, this check should go
in amdgpu_device_supports_px() instead.

Alex

>
>
>
>                                    diff --git a/drivers/pci/pci.c
> b/drivers/pci/pci.c
> index 0b29ec6e8e5e2..01691f1d26fbe 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -2751,6 +2751,10 @@ int pci_prepare_to_sleep(struct pci_dev *dev)
>          if (target_state == PCI_POWER_ERROR)
>                  return -EIO;
>
> +       if (!IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE) &&
> +           target_state == PCI_D3cold)
> +               return -EBUSY;
> +
>          pci_enable_wake(dev, target_state, wakeup);
>
>          error = pci_set_power_state(dev, target_state);
> @@ -2797,6 +2801,10 @@ int pci_finish_runtime_suspend(struct pci_dev *dev)
>          if (target_state == PCI_POWER_ERROR)
>                  return -EIO;
>
> +       if (!IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE) &&
> +            target_state == PCI_D3cold)
> +               return -EBUSY;
> +
>          __pci_enable_wake(dev, target_state, pci_dev_run_wake(dev));
>
>          error = pci_set_power_state(dev, target_state);
> >
> > Alex
> >
> >> ---
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +++
> >>   1 file changed, 3 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> index 764d41434a82f..7db796ebb967e 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> @@ -419,6 +419,9 @@ bool amdgpu_device_supports_boco(struct drm_device *dev)
> >>   {
> >>          struct amdgpu_device *adev = drm_to_adev(dev);
> >>
> >> +       if (!IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE))
> >> +               return false;
> >> +
> >>          if (adev->has_pr3 ||
> >>              ((adev->flags & AMD_IS_PX) && amdgpu_is_atpx_hybrid()))
> >>                  return true;
> >> --
> >> 2.43.0
> >>
>




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux