Re: [PATCH RESEND] PCI: s390: Fix use-after-free of PCI bus resources with s390 per-function hotplug

On Tue, Feb 14, 2023 at 10:49:10AM +0100, Niklas Schnelle wrote:
> On s390 PCI functions may be hotplugged individually even when they
> belong to a multi-function device. In particular on an SR-IOV device VFs
> may be removed and later re-added.

Is there something special about the SR-IOV/VF case that relates to
this problem?  If not, it might be unnecessary distraction to mention
it.

> In commit a50297cf8235 ("s390/pci: separate zbus creation from
> scanning") it was missed however that struct pci_bus and struct
> zpci_bus's resource list retained a reference to the PCI function's MMIO
> resources even though those resources are released and freed on
> hot-unplug. These stale resources may subsequently be claimed when the
> PCI function re-appears resulting in use-after-free.

Lifetimes of all these resources definitely aren't obvious to me.

So I guess the critical thing here is the new
pci_bus_remove_resource() in zpci_cleanup_bus_resources(), which
removes (and kfrees when necessary) the resource from
pci_bus->resources.

I'm not clear on where the zpci_bus resource list comes in.  I guess
we kzalloc resources in zpci_setup_bus_resources(), and the current
code adds them to zpci_bus->resources and copies them onto the pci_bus
list.

The new code does not add them to zpci_bus->resources at all, and only
adds them to the pci_bus resource list.  Right?  I guess maybe that's
what the "no need to add the MMIO resources at all" below refers to?
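
To make the lifetime question concrete, here is a minimal userspace
sketch of how I read the intended flow. It is not kernel code: res_model,
bus_entry, bus_model, bus_add(), bus_remove(), bus_holds() and unplug()
are hypothetical stand-ins for struct resource, struct pci_bus_resource,
struct pci_bus, pci_bus_add_resource(), the new
pci_bus_remove_resource() and the per-function cleanup. It assumes the
bus list only borrows a pointer to a separately allocated resource, so
the list entry has to go away before the resource is freed.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct resource. */
struct res_model {
	unsigned long start, end;
};

/* Hypothetical stand-in for struct pci_bus_resource: a list node that
 * only borrows the resource pointer, it does not own the resource. */
struct bus_entry {
	struct res_model *res;
	struct bus_entry *next;
};

/* Hypothetical stand-in for the resource list in struct pci_bus. */
struct bus_model {
	struct bus_entry *head;
};

/* Models pci_bus_add_resource(): allocate an entry pointing at res. */
static int bus_add(struct bus_model *bus, struct res_model *res)
{
	struct bus_entry *e = malloc(sizeof(*e));

	if (!e)
		return -1;
	e->res = res;
	e->next = bus->head;
	bus->head = e;
	return 0;
}

/* Models the new pci_bus_remove_resource(): unlink and free the entry
 * for res, leaving every other entry untouched. */
static void bus_remove(struct bus_model *bus, struct res_model *res)
{
	struct bus_entry **pp;

	for (pp = &bus->head; *pp; pp = &(*pp)->next) {
		if ((*pp)->res == res) {
			struct bus_entry *e = *pp;

			*pp = e->next;
			free(e);
			return;
		}
	}
}

/* True if the bus list still references res (pointer comparison only). */
static bool bus_holds(const struct bus_model *bus, const struct res_model *res)
{
	const struct bus_entry *e;

	for (e = bus->head; e; e = e->next)
		if (e->res == res)
			return true;
	return false;
}

/* Hot-unplug of one function: drop the bus reference first, then free
 * the resource and clear the owner's pointer. */
static void unplug(struct bus_model *bus, struct res_model **slot)
{
	bus_remove(bus, *slot);
	free(*slot);
	*slot = NULL;
}
```

Without the bus_remove() call in unplug(), the entry would keep
pointing at freed memory, which is the stale reference the commit log
describes.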

> One idea of fixing this use-after-free in s390 specific code that was
> investigated was to simply keep resources around from the moment a PCI
> function first appeared until the whole virtual PCI bus created for
> a multi-function device disappears. The problem with this however is
> that due to the requirement of artificial MMIO addresses (address
> cookies) we will then need extra logic and tracking in struct zpci_bus
> to keep these compatible for re-use. At the same time the MMIO resources
> semantically belong to the PCI function so tying their lifecycle to the
> function seems more logical.
> 
> Instead a simpler approach is to remove the resources of an individually
> hot-unplugged PCI function from the PCI bus's resource list while
> keeping the resources of other PCI functions on the PCI bus untouched.

Do we currently never kfree the pci_bus resource list until we free
the whole pci_bus via release_pcibus_dev()?  Does a remove + add just
allocate more resources that are probably duplicates of what the
pci_bus already had?

> This is done by introducing pci_bus_remove_resource() to remove an
> individual resource. Similarly the resource also needs to be removed
> from the struct zpci_bus's resource list. It turns out however, that
> there is really no need to add the MMIO resources at all and instead we
> can simply use the zpci_bar_struct's resource pointer directly.
> 
> Fixes: a50297cf8235 ("s390/pci: separate zbus creation from scanning")
> Signed-off-by: Niklas Schnelle <schnelle@xxxxxxxxxxxxx>

Other random questions unrelated to this patch:

  - zpci_bus_create_pci_bus() calls pci_bus_add_devices().  Isn't that
    pointless?  AFAICT, the bus->devices list is empty then.

  - What about zpci_bus_scan_device()?  Why does it call both
    pci_bus_add_device() and pci_bus_add_devices()?  The latter will
    just call the former, so it looks redundant.  And the latter is
    locked but not the former?

  - Struct zpci_bus has a "resources" list.  I guess this contains the
    &zbus->bus_resource put there in zpci_bus_alloc(), plus an entry
    for every BAR of every device on the bus (I guess you'd never see
    an actual PCI-to-PCI bridge on s390?), kzalloc'ed in
    zpci_setup_bus_resources()?

    What happens when zpci_bus_release() calls
    pci_free_resource_list() on &zbus->resources?  It looks like that
    ultimately calls kfree(), which is OK for the
    zpci_setup_bus_resources() stuff, but what about the
    zbus->bus_resource that was not kzalloc'ed?
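
The answer to that last question depends on what the list-free path
actually frees. A minimal userspace sketch of the harmless case, under
the assumption (which I have not verified against the resource list
helpers) that only the list entries are freed and never the resources
they point to. All names here (res_model, entry_model, zbus_model,
list_add_res(), list_free_entries()) are hypothetical stand-ins, not
kernel APIs.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct resource. */
struct res_model {
	unsigned long start, end;
};

/* Hypothetical stand-in for a resource list entry: it borrows res. */
struct entry_model {
	struct res_model *res;
	struct entry_model *next;
};

/* Hypothetical stand-in for struct zpci_bus: one resource is embedded
 * in the bus itself, the list may also reference heap resources. */
struct zbus_model {
	struct res_model bus_resource;	/* embedded, never heap-allocated */
	struct entry_model *resources;	/* list of borrowed pointers */
};

/* Add a list entry that points at res without taking ownership. */
static int list_add_res(struct zbus_model *zbus, struct res_model *res)
{
	struct entry_model *e = malloc(sizeof(*e));

	if (!e)
		return -1;
	e->res = res;
	e->next = zbus->resources;
	zbus->resources = e;
	return 0;
}

/* ASSUMPTION: the list-free helper frees only the entries. Under that
 * assumption neither embedded nor heap resources are freed here. */
static void list_free_entries(struct zbus_model *zbus)
{
	struct entry_model *e = zbus->resources;

	while (e) {
		struct entry_model *next = e->next;

		free(e);
		e = next;
	}
	zbus->resources = NULL;
}
```

If the helper instead freed e->res as well, the embedded bus_resource
would be handed to free(), which is exactly the case I am worried
about.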

> ---
>  arch/s390/pci/pci.c     | 16 ++++++++++------
>  arch/s390/pci/pci_bus.c | 12 +++++-------
>  arch/s390/pci/pci_bus.h |  3 +--
>  drivers/pci/bus.c       | 23 +++++++++++++++++++++++
>  include/linux/pci.h     |  1 +
>  5 files changed, 40 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
> index ef38b1514c77..e16afacc8fd1 100644
> --- a/arch/s390/pci/pci.c
> +++ b/arch/s390/pci/pci.c
> @@ -544,8 +544,7 @@ static struct resource *__alloc_res(struct zpci_dev *zdev, unsigned long start,
>  	return r;
>  }
>  
> -int zpci_setup_bus_resources(struct zpci_dev *zdev,
> -			     struct list_head *resources)
> +int zpci_setup_bus_resources(struct zpci_dev *zdev)
>  {
>  	unsigned long addr, size, flags;
>  	struct resource *res;
> @@ -581,7 +580,6 @@ int zpci_setup_bus_resources(struct zpci_dev *zdev,
>  			return -ENOMEM;
>  		}
>  		zdev->bars[i].res = res;
> -		pci_add_resource(resources, res);
>  	}
>  	zdev->has_resources = 1;
>  
> @@ -590,17 +588,23 @@ int zpci_setup_bus_resources(struct zpci_dev *zdev,
>  
>  static void zpci_cleanup_bus_resources(struct zpci_dev *zdev)
>  {
> +	struct resource *res;
>  	int i;
>  
> +	pci_lock_rescan_remove();

What exactly is this protecting?  This doesn't seem like quite the
right place since we're not adding/removing a pci_dev here.  Is this
to protect the bus->resources list in pci_bus_remove_resource()?

>  	for (i = 0; i < PCI_STD_NUM_BARS; i++) {
> -		if (!zdev->bars[i].size || !zdev->bars[i].res)
> +		res = zdev->bars[i].res;
> +		if (!res)
>  			continue;
>  
> +		release_resource(res);
> +		pci_bus_remove_resource(zdev->zbus->bus, res);
>  		zpci_free_iomap(zdev, zdev->bars[i].map_idx);
> -		release_resource(zdev->bars[i].res);
> -		kfree(zdev->bars[i].res);
> +		zdev->bars[i].res = NULL;
> +		kfree(res);
>  	}
>  	zdev->has_resources = 0;
> +	pci_unlock_rescan_remove();
>  }
>  
>  int pcibios_device_add(struct pci_dev *pdev)
> diff --git a/arch/s390/pci/pci_bus.c b/arch/s390/pci/pci_bus.c
> index 6a8da1b742ae..a99926af2b69 100644
> --- a/arch/s390/pci/pci_bus.c
> +++ b/arch/s390/pci/pci_bus.c
> @@ -41,9 +41,7 @@ static int zpci_nb_devices;
>   */
>  static int zpci_bus_prepare_device(struct zpci_dev *zdev)
>  {
> -	struct resource_entry *window, *n;
> -	struct resource *res;
> -	int rc;
> +	int rc, i;
>  
>  	if (!zdev_enabled(zdev)) {
>  		rc = zpci_enable_device(zdev);
> @@ -57,10 +55,10 @@ static int zpci_bus_prepare_device(struct zpci_dev *zdev)
>  	}
>  
>  	if (!zdev->has_resources) {
> -		zpci_setup_bus_resources(zdev, &zdev->zbus->resources);
> -		resource_list_for_each_entry_safe(window, n, &zdev->zbus->resources) {
> -			res = window->res;
> -			pci_bus_add_resource(zdev->zbus->bus, res, 0);
> +		zpci_setup_bus_resources(zdev);
> +		for (i = 0; i < PCI_STD_NUM_BARS; i++) {
> +			if (zdev->bars[i].res)
> +				pci_bus_add_resource(zdev->zbus->bus, zdev->bars[i].res, 0);
>  		}
>  	}
>  
> diff --git a/arch/s390/pci/pci_bus.h b/arch/s390/pci/pci_bus.h
> index e96c9860e064..af9f0ac79a1b 100644
> --- a/arch/s390/pci/pci_bus.h
> +++ b/arch/s390/pci/pci_bus.h
> @@ -30,8 +30,7 @@ static inline void zpci_zdev_get(struct zpci_dev *zdev)
>  
>  int zpci_alloc_domain(int domain);
>  void zpci_free_domain(int domain);
> -int zpci_setup_bus_resources(struct zpci_dev *zdev,
> -			     struct list_head *resources);
> +int zpci_setup_bus_resources(struct zpci_dev *zdev);
>  
>  static inline struct zpci_dev *zdev_from_bus(struct pci_bus *bus,
>  					     unsigned int devfn)
> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> index 83ae838ceb5f..f021f1d4af9f 100644
> --- a/drivers/pci/bus.c
> +++ b/drivers/pci/bus.c
> @@ -76,6 +76,29 @@ struct resource *pci_bus_resource_n(const struct pci_bus *bus, int n)
>  }
>  EXPORT_SYMBOL_GPL(pci_bus_resource_n);
>  
> +void pci_bus_remove_resource(struct pci_bus *bus, struct resource *res)
> +{
> +	struct pci_bus_resource *bus_res, *tmp;
> +	int i;
> +
> +	for (i = 0; i < PCI_BRIDGE_RESOURCE_NUM; i++) {
> +		if (bus->resource[i] == res) {
> +			bus->resource[i] = NULL;
> +			return;
> +		}
> +	}
> +
> +	list_for_each_entry_safe(bus_res, tmp, &bus->resources, list) {
> +		if (bus_res->res == res) {
> +			list_del(&bus_res->list);
> +			kfree(bus_res);
> +			return;
> +		}
> +	}
> +	return;
> +

Superfluous "return" and blank line.

> +}
> +
>  void pci_bus_remove_resources(struct pci_bus *bus)
>  {
>  	int i;
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index adffd65e84b4..3b1974e2ec73 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1436,6 +1436,7 @@ void pci_bus_add_resource(struct pci_bus *bus, struct resource *res,
>  			  unsigned int flags);
>  struct resource *pci_bus_resource_n(const struct pci_bus *bus, int n);
>  void pci_bus_remove_resources(struct pci_bus *bus);
> +void pci_bus_remove_resource(struct pci_bus *bus, struct resource *res);
>  int devm_request_pci_bus_resources(struct device *dev,
>  				   struct list_head *resources);
>  
> -- 
> 2.37.2
> 


