Re: [PATCH v1 2/2] s390x/pci: Fix hotplugging of PCI bridges

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 23.01.19 14:42, Thomas Huth wrote:
> On 2019-01-23 12:42, David Hildenbrand wrote:
>> On 23.01.19 12:16, Thomas Huth wrote:
>>> On 2019-01-22 13:51, David Hildenbrand wrote:
>>>> When hotplugging a PCI bridge right now to the root port, we resolve
>>>> pci_get_bus(pdev)->parent_dev, which results in a SEGFAULT. Hotplugging
>>>> really only works right now when hotplugging to another bridge.
>>>>
>>>> Instead, we have to properly check if we are already at the root.
>>>>
>>>> Let's cleanup the code while at it a bit and factor out updating the
>>>> subordiante bus number into a separate function. The check for
>>>> "old_nr < nr" is right now not strictly necessary, but makes it more
>>>> obvious what is actually going on.
>>>>
>>>> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
>>>> ---
>>>>  hw/s390x/s390-pci-bus.c | 28 +++++++++++++++++++---------
>>>>  1 file changed, 19 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
>>>> index b7c4613fde..9b5c5fff60 100644
>>>> --- a/hw/s390x/s390-pci-bus.c
>>>> +++ b/hw/s390x/s390-pci-bus.c
>>>> @@ -843,6 +843,21 @@ static void s390_pcihost_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>>>>      }
>>>>  }
>>>>  
>>>> +static void s390_pci_update_subordinate(PCIDevice *dev, uint32_t nr)
>>>> +{
>>>> +    uint32_t old_nr;
>>>> +
>>>> +    pci_default_write_config(dev, PCI_SUBORDINATE_BUS, nr, 1);
>>>> +    while (!pci_bus_is_root(pci_get_bus(dev))) {
>>>> +        dev = pci_get_bus(dev)->parent_dev;
>>>> +
>>>> +        old_nr = pci_default_read_config(dev, PCI_SUBORDINATE_BUS, 1);
>>>> +        if (old_nr < nr) {
>>>> +            pci_default_write_config(dev, PCI_SUBORDINATE_BUS, nr, 1);
>>>> +        }
>>>> +    }
>>>> +}
>>>> +
>>>>  static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>>>>                                Error **errp)
>>>>  {
>>>> @@ -851,26 +866,21 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>>>>      S390PCIBusDevice *pbdev = NULL;
>>>>  
>>>>      if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_BRIDGE)) {
>>>> -        BusState *bus;
>>>>          PCIBridge *pb = PCI_BRIDGE(dev);
>>>> -        PCIDevice *pdev = PCI_DEVICE(dev);
>>>>  
>>>> +        pdev = PCI_DEVICE(dev);
>>>
>>> Why not keep the direct assignment?
>>
>> We have the same local variable in that function already. I'm just
>> reusing it.
> 
> D'oh, right. Please disregard my comment.
> 
>>>
>>>>          pci_bridge_map_irq(pb, dev->id, s390_pci_map_irq);
>>>>          pci_setup_iommu(&pb->sec_bus, s390_pci_dma_iommu, s);
>>>>  
>>>> -        bus = BUS(&pb->sec_bus);
>>>> -        qbus_set_hotplug_handler(bus, DEVICE(s), errp);
>>>> +        qbus_set_hotplug_handler(BUS(&pb->sec_bus), DEVICE(s), errp);
>>>>  
>>>>          if (dev->hotplugged) {
>>>>              pci_default_write_config(pdev, PCI_PRIMARY_BUS,
>>>>                                       pci_dev_bus_num(pdev), 1);
>>>>              s->bus_no += 1;
>>>>              pci_default_write_config(pdev, PCI_SECONDARY_BUS, s->bus_no, 1);
>>>> -            do {
>>>> -                pdev = pci_get_bus(pdev)->parent_dev;
>>>> -                pci_default_write_config(pdev, PCI_SUBORDINATE_BUS,
>>>> -                                         s->bus_no, 1);
>>>> -            } while (pci_get_bus(pdev) && pci_dev_bus_num(pdev));
>>>> +
>>>> +            s390_pci_update_subordinate(pdev, s->bus_no);
>>>
>>>
>>> While your changes look OK at a first glance, and certainly fix the
>>> crash, I wonder whether this is the right thing to do here at all. Why
>>> does the hypervisor set up the all the bridge registers here? I'd rather
>>> expect that this is the job of the guest after it detects a hot-plugged
>>> device?
>>
>> I honestly have no idea who is responsible for that. I can see that
>> spapr does not seem to handle hotplugging the way s390x does. I was
>> hoping that Colin maybe know why we have to do that on s390x this way.
>> At least this patch does not change the behavior but only fixes it.
> 
> Right, you're patch certainly fixes a crash, and does not make it any
> worse than it was before, so:
> 
> Reviewed-by: Thomas Huth <thuth@xxxxxxxxxx>
> 
>>>
>>> Also I'm not sure whether the numbering is right in every case. Let's
>>> use the example from the URL that you've mentioned in the cover letter
>>> (http://www.science.unitn.it/~fiorella/guidelinux/tlk/node76.html step 4):
>>>
>>> Assume that you hot-plug another bridge "5" to "Bridge 2". If I get the
>>> code right, the setup will now look like this:
>>
>> Yes, this is what I would expect! This implies that any bus in the
>> hierarchy lies between Sec and Sub. But that there might be multiple
>> candidates, which feels wrong as you correctly say.
>>
>>>
>>>                CPU
>>>                 |
>>>   --------------+-------------- Bus 0
>>>                 |
>>>                 |  Pri=0
>>>                BR1 Sec=1
>>>                 |  Sub=5
>>>                 |
>>>   ------+-------+----------+--- Bus 1
>>>         |                  |
>>>         |  Pri=1           |  Pri=1
>>>        BR3 Sec=3          BR2 Sec=2
>>>         |  Sub=4           |  Sub=5
>>>         |                  |
>>>   --+---+--- Bus 3       --+-+- Bus 2
>>>     |                        |
>>>     |  Pri=3                 |  Pri=2
>>>    BR4 Sec=4                BR5 Sec=5
>>>     |  Sub=4                 |  Sub=5
>>>     |                        |
>>>   --+---- Bus 4            --+-- Bus 5
>>>
>>> Requests to bus 3 and 4 are now also routed through to bus 2 and 5. That
>>> sounds wrong to me. Or is this how hot-plugging of PCI bridges is
>>> supposed to work??
>>>
>>
>> Learning more about PCI
>>
>> Quoting from [2]
>>
>> "PCIe enumeration is generally done twice. First, your BIOS (UEFI or
>> otherwise) will do it, to figure out who's present and how much memory
>> they need. This data can then be passed on to the host OS who can take
>> it as-is, but Linux and Windows often perform their own enumeration
>> procedure ..."
>>
>> "The custom software part comes in during this enumeration process and
>> that is you must reserve ahead of time PCI Bus numbers, and memory
>> segments for potential future devices -- this is sometimes called 'bus
>> padding'. This avoids the need to re-enumerate the bus in the future
>> which can often not be done without disruption to the system ..."
>>
>> So if I understand correctly, the BIOS/Firmware/QEMU only provides the
>> initial state. After that, the guest is responsible to manage it,
>> especially to manage hotplug.
>>
>> However, I wonder if it would also be up to BIOS/Firmware/QEMU to do
>> this bus padding or if it is completely on the OS side. And if
>> hotplugged devices simply have all 0's.
> 
> Anyway, the current behavior sounds pretty wrong to me. I am sure that
> we can not simply traverse the bridge hierarchy up to the top and change
> the subordinate bus setting everywhere without informing the guest about
> that change. What if the guest OS has cached the values from the
> bridges? It then might use the wrong value for future calculations...

Yes, it also sounds wrong to me. Maybe this is what Firmware manages on
s390x - maybe they reserve bus numbers. Or maybe hotplug has to really
be handled by the guests. Only IBM people can clarify that.

Thanks!

-- 

Thanks,

David / dhildenb



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Kernel Development]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite Info]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Linux Media]     [Device Mapper]

  Powered by Linux