On 21/09/2022 18:39, Rodrigo Vivi wrote:
The force_probe protection actively avoids the probe of i915 to
manage a device that is currently under development. It is a nice
protection for future users when getting a new platform but using
some older kernel.
However, when we avoid the probe we don't take back the registration
of the device. We cannot give up the registration anyway since we can
have multiple devices present. For instance an integrated and a discrete
one.
When this scenario occurs, the user will not be able to change any
of the runtime pm configuration of the unmanaged device. So, it will
be blocked in D0 state wasting power. This is specially bad in the
case where we have a discrete platform attached, but the user is
able to fully use the integrated one for everything else.
So, let's put the protected and unmanaged device in D3. So we can
save some power.
Reported-by: Daniel J Blueman <daniel@xxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
Cc: Daniel J Blueman <daniel@xxxxxxxxx>
Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Cc: Anshuman Gupta <anshuman.gupta@xxxxxxxxx>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@xxxxxxxxx>
---
drivers/gpu/drm/i915/i915_pci.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 77e7df21f539..fc3e7c69af2a 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -25,6 +25,7 @@
#include <drm/drm_color_mgmt.h>
#include <drm/drm_drv.h>
#include <drm/i915_pciids.h>
+#include <linux/pm_runtime.h>
#include "gt/intel_gt_regs.h"
#include "gt/intel_sa_media.h"
@@ -1304,6 +1305,7 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
{
struct intel_device_info *intel_info =
(struct intel_device_info *) ent->driver_data;
+ struct device *kdev = &pdev->dev;
int err;
if (intel_info->require_force_probe &&
@@ -1314,6 +1316,12 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
"module parameter or CONFIG_DRM_I915_FORCE_PROBE=%04x configuration option,\n"
"or (recommended) check for kernel updates.\n",
pdev->device, pdev->device, pdev->device);
+
+ /* Let's not waste power if we are not managing the device */
+ pm_runtime_use_autosuspend(kdev);
+ pm_runtime_allow(kdev);
+ pm_runtime_put_autosuspend(kdev);
This sequence is black magic to me so can't really comment on the specifics. But in general, what I think I've figured out is, that the PCI core calls our runtime resume callback before probe:
local_pci_probe:
...
/*
* Unbound PCI devices are always put in D0, regardless of
* runtime PM status. During probe, the device is set to
* active and the usage count is incremented. If the driver
* supports runtime PM, it should call pm_runtime_put_noidle(),
* or any other runtime PM helper function decrementing the usage
* count, in its probe routine and pm_runtime_get_noresume() in
* its remove routine.
*/
pm_runtime_get_sync(dev);
pci_dev->driver = pci_drv;
rc = pci_drv->probe(pci_dev, ddi->id);
if (!rc)
return rc;
if (rc < 0) {
pci_dev->driver = NULL;
pm_runtime_put_sync(dev);
return rc;
}
And if probe fails it calls pm_runtime_put_sync which presumably does not provide the symmetry we need?
Anyway since I can't provide meaningful review I'll copy Imre since I think he worked in the area in the past. Just so more eyes is better.
Regards,
Tvrtko
+
return -ENODEV;
}