RE: [PATCH] PCI: pci-hyperv: Retry PCI bus D0 entry when the first attempt failed with invalid device state 0xC0000184.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Wei Hu <weh@xxxxxxxxxxxxx>  Sent: Monday, April 27, 2020 11:42 PM
> > From: Michael Kelley <mikelley@xxxxxxxxxxxxx>
> > > > The above is good.  But there's another error case that isn't
> > > > handled correctly.  If create_root_hv_pci_bus() fails,
> > > > hv_send_resources_released() should be called.
> > > >
> > > > Fixing these two error cases should probably go in a separate patch:
> > > > One patch for the retry problem in kdump, and a separate patch for
> > > > these error cases.
> > >
> > > [Wei Hu]
> > > hv_send_resources_released() is called in the added hv_pci_bus_exit().
> > > If hv_send_resources_allocated() fails, is it correct to call
> > hv_send_resources_released()?
> > > Allocation can fail in the middle. So I am not sure if calling
> > > hv_send_resources_released() won't cause any side effect.
> > >
> >
> > Ah yes, you are right.  But that brings up a separate problem.
> > If hv_pci_allocate_bridge_windows() or hv_send_resources_allocated() fails,
> > then the error path will call hv_pci_bus_exit(), which will call
> > hv_send_resources_released(), even if hv_send_resources_allocated() was
> > never called or didn't fully succeed.  As you noted,
> > hv_send_resources_allocated() does multiple steps, some of which might have
> > succeeded, and some of which didn't.  The mismatch might cause problems.
> > That means fixing this error handling is going to be a bit more complex.  Each
> > operation needs
> > to be individually undone, and only if it previously succeeded.   Could you
> > follow up
> > with the Hyper-V people to see if there's a problem with doing the
> > RESOURCES_RELEASED message on a slot where RESOURCES_ASSIGNED was
> > not done or wasn't successful?
> > If doing a spurious RESOURCES_RELEASED is harmless, that will make the error
> > cleanup easier.
> >
> 
> I think we can clean this up by doing like following, without the need of consulting
> Hyper-V team. The kdump retry also works with this by setting the wslot_res_allocated
> to 255 before calling hv_pci_bus_exit() to retry. Let me know what you think.

I like the overall approach.  I've reviewed your code below, and I think it works.
For the kdump situation, the assumption is that if we get the failure in enter_d0(),
then the PCI device must have been successfully set up in the main kernel.  The
occupied slots found by the kdump kernel must the same as the occupied slots
that were found (and setup) by the main kernel.  Therefore it is OK to iterate
through all 256 slots in hv_send_resources_released() on the retry path.  Make
sure to add a comment with that reasoning. :-)

Michael

> 
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index 0a42c228b231..06f31f5777f9 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -480,6 +480,9 @@ struct hv_pcibus_device {
> 
>         struct workqueue_struct *wq;
> 
> +       /* Highest slot of child device allocated resources allocated*/
> +       int wslot_res_allocated;
> +
>         /* hypercall arg, must not cross page boundary */
>         struct hv_retarget_device_interrupt retarget_msi_interrupt_params;
> 
> @@ -2873,7 +2876,7 @@ static int hv_send_resources_allocated(struct hv_device *hdev)
>         struct hv_pci_dev *hpdev;
>         struct pci_packet *pkt;
>         size_t size_res;
> -       u32 wslot;
> +       int wslot;
>         int ret;
> 
>         size_res = (hbus->protocol_version < PCI_PROTOCOL_VERSION_1_2)
> @@ -2926,6 +2929,8 @@ static int hv_send_resources_allocated(struct hv_device *hdev)
>                                 comp_pkt.completion_status);
>                         break;
>                 }
> +
> +               hbus->wslot_res_allocated = wslot;
>         }
> 
>         kfree(pkt);
> @@ -2944,10 +2949,10 @@ static int hv_send_resources_released(struct hv_device *hdev)
>         struct hv_pcibus_device *hbus = hv_get_drvdata(hdev);
>         struct pci_child_message pkt;
>         struct hv_pci_dev *hpdev;
> -       u32 wslot;
> +       int wslot;
>         int ret;
> 
> -       for (wslot = 0; wslot < 256; wslot++) {
> +       for (wslot = hbus->wslot_res_allocated; wslot >= 0; wslot--) {
>                 hpdev = get_pcichild_wslot(hbus, wslot);
>                 if (!hpdev)
>                         continue;
> @@ -2962,8 +2967,12 @@ static int hv_send_resources_released(struct hv_device *hdev)
>                                        VM_PKT_DATA_INBAND, 0);
>                 if (ret)
>                         return ret;
> +
> +               hbus->wslot_res_allocated = wslot - 1;
>         }
> 
> +       hbus->wslot_res_allocated = -1;
> +
>         return 0;
>  }
> 
> @@ -3063,6 +3072,7 @@ static int hv_pci_probe(struct hv_device *hdev,
>         if (!hbus)
>                 return -ENOMEM;
>         hbus->state = hv_pcibus_init;
> +       hbus->wslot_res_allocated = -1;
> 
>         /*
>          * The PCI bus "domain" is what is called "segment" in ACPI and other
> @@ -3162,7 +3172,7 @@ static int hv_pci_probe(struct hv_device *hdev,
> 
>         ret = hv_pci_allocate_bridge_windows(hbus);
>         if (ret)
> -               goto free_irq_domain;
> +               goto exit_d0;
> 
>         ret = hv_send_resources_allocated(hdev);
>         if (ret)
> @@ -3180,6 +3190,8 @@ static int hv_pci_probe(struct hv_device *hdev,
> 
>  free_windows:
>         hv_pci_free_bridge_windows(hbus);
> +exit_d0:
> +       (void) hv_pci_bus_exit(hdev, true);
>  free_irq_domain:
>         irq_domain_remove(hbus->irq_domain);
>  free_fwnode:




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux