Re: [PATCH 12/12] PCI/pciehp: Use device managed allocations

Lukas Wunner <lukas@xxxxxxxxx> · Tue, 25 Sep 2018 09:13:11 +0200

On Mon, Sep 24, 2018 at 06:43:07PM -0500, Bjorn Helgaas wrote:
> On Sat, Sep 22, 2018 at 08:10:57PM +0200, Lukas Wunner wrote:
> > This cannot happen with what's in v4.19 or on the pci/hotplug branch
> > because the IRQ is released right after the call to pci_hp_del().
> > If the button_work happens to be scheduled afterwards, it will call
> > irq_wake_thread(ctrl->pcie->irq, ctrl).  Now if you look at the
> > implementation of irq_wake_thread(), it iterates over all the
> > actions of the IRQ (in case it's shared by multiple devices) and
> > finds the action which matches the ctrl pointer.  But because we've
> > already released the IRQ at that point, it won't find the IRQ action
> > and just return.  So irq_wake_thread() essentially becomes a no-op.
> 
> This seems like a very subtle issue that we're likely to trip over
> again.  I assume most drivers can use device-managed IRQs safely, but
> it's a problem here because of the extra complexity around
> hotplug_slot?  Is there some way we can make it less subtle, e.g., by
> moving the slot cleanup into the pci_destroy_dev() path or something?

The extra complexity is caused by the various ways in which events can
be triggered (Slot Status register change, sysfs, button_work) plus the
deregistration of hotplug_slot.  Teardown paths are always hard, I don't
really see a better code solution, but I should add more code comments to
make it less subtle.

> Is there some value we get from having both struct hotplug_slot and
> struct pci_slot?

When Alex Chiang introduced struct pci_slot, he thought of migrating
struct hotplug_slot into it.  See the "migrate over time" code comment
in commit f46753c5e354.  (It got changed to "move here" by 0aa0f5d1084c.)

On the current pci/hotplug branch, I've embedded struct hotplug_slot
in each hotplug driver's struct slot.  But that doesn't preclude
merging struct hotplug_slot into struct pci_slot, the hotplug drivers
would just have to embed a struct pci_slot instead, or struct pci_slot
would be embedded in struct hotplug_slot.

TBH the rationale behind all the different structs isn't quite clear
to me, e.g. struct pci_slot is also used for non-hotplug slots it seems,
see 8344b568f5bd.

(Note, Alex is no longer reachable under his hp.com address, same for
his canonical.com address.)

> > While I had thought of using devm_kzalloc(), I also knew that Bjorn
> > wants to get rid of portdrv and I didn't want to make any changes that
> > might constrain the future design of portdrv.  Right now, portdrv
> > creates a sub-device for each service and service drivers bind to it.
> > An alternative design could either be a single driver that binds to
> > ports and handles all services at once (probably wouldn't be all that
> > different to the current situation), or no driver would be bound to
> > ports and the functionality of the port services would be handled
> > within the core.  In either case we have the pci_dev of the port to
> > attach the device-managed allocations to, so I suppose using
> > devm_kzalloc() here isn't constraining us all that much, hence
> > seems fine.
> 
> I'd *like* to get rid of portdrv, but I don't see a clear path to do
> that yet, so it might be impractical.  One reason I don't like it is
> that only one driver can bind to a device, so portdrv prevents us from
> having drivers for device-specific bridge features like performance
> monitors.  For that reason, I'd like to have the architected services
> handled directly in the core.

Hm, well in that case we can't use devm_*() at all, as Benjamin
Herrenschmidt has just correctly pointed out, devres_release_all()
is called both on driver unbind in __device_release_driver() as well
as on device destruction in device_release().

Thus, if the port services are handled in the core to allow a driver
to bind to the port and devm_kzalloc() is used by the port services,
those allocations will be freed when that driver unbinds, and any
further use of the allocations is a use-after-free.

Thanks,

Lukas