Re: [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 16/11/16 01:02, Andrea Bolognani wrote:
> On Tue, 2016-11-01 at 13:46 +1100, David Gibson wrote:
>> On Mon, Oct 31, 2016 at 03:10:23PM +1100, Alexey Kardashevskiy wrote:
>>>  
>>> On 31/10/16 13:53, David Gibson wrote:
>>>>  
>>>> On Fri, Oct 28, 2016 at 12:07:12PM +0200, Greg Kurz wrote:
>>>>>  
>>>>> On Fri, 28 Oct 2016 18:56:40 +1100
>>>>> Alexey Kardashevskiy <aik@xxxxxxxxx> wrote:
>>>>>  
>>>>>>  
>>>>>> At the moment sPAPR PHB creates a root buf of TYPE_PCI_BUS type.
>>>>>> This means that vfio-pci devices attached to it (and this is
>>>>>> a default behaviour) hide PCIe extended capabilities as
>>>>>> the bus does not pass a pci_bus_is_express(pdev->bus) check.
>>>>>>  
>>>>>> This changes adds a default PCI bus type property to sPAPR PHB
>>>>>> and uses TYPE_PCIE_BUS if none passed; older machines get TYPE_PCI_BUS
>>>>>> for backward compatibility as a bus type is used in the bus name
>>>>>> so the root bus name becomes "pcie.0" instead of "pci.0".
>>>>>>  
>>>>>> Signed-off-by: Alexey Kardashevskiy <aik@xxxxxxxxx>
>>>>>> ---
>>>>>>  
>>>>>> What can possibly go wrong with such change of a name?
>>>>>> From devices prospective, I cannot see any.
>>>>>>  
>>>>>> libvirt might get upset as "pci.0" will not be available,
>>>>>> will it make sense to create pcie.0 as a root bus and always
>>>>>> add a PCIe->PCI bridge and name its bus "pci.0"?
>>>>>>  
>>>>>> Or create root bus from TYPE_PCIE_BUS and force name to "pci.0"?
>>>>>> pci_register_bus() can do this.
>>>>>>  
>>>>>>  
>>>>>> ---
>>>>>>   hw/ppc/spapr.c              | 5 +++++
>>>>>>   hw/ppc/spapr_pci.c          | 5 ++++-
>>>>>>   include/hw/pci-host/spapr.h | 1 +
>>>>>>   3 files changed, 10 insertions(+), 1 deletion(-)
>>>>>>  
>>>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>>>>> index 0b3820b..a268511 100644
>>>>>> --- a/hw/ppc/spapr.c
>>>>>> +++ b/hw/ppc/spapr.c
>>>>>> @@ -2541,6 +2541,11 @@ DEFINE_SPAPR_MACHINE(2_8, "2.8", true);
>>>>>>           .driver   = TYPE_SPAPR_PCI_HOST_BRIDGE,     \
>>>>>>           .property = "mem64_win_size",               \
>>>>>>           .value    = "0",                            \
>>>>>> +    },                                              \
>>>>>> +    {                                               \
>>>>>> +        .driver   = TYPE_SPAPR_PCI_HOST_BRIDGE,     \
>>>>>> +        .property = "root_bus_type",                \
>>>>>> +        .value    = TYPE_PCI_BUS,                   \
>>>>>>       },
>>>>>>   
>>>>>>   static void phb_placement_2_7(sPAPRMachineState *spapr, uint32_t index,
>>>>>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
>>>>>> index 7cde30e..2fa1f22 100644
>>>>>> --- a/hw/ppc/spapr_pci.c
>>>>>> +++ b/hw/ppc/spapr_pci.c
>>>>>> @@ -1434,7 +1434,9 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>>>>>>       bus = pci_register_bus(dev, NULL,
>>>>>>                              pci_spapr_set_irq, pci_spapr_map_irq, sphb,
>>>>>>                              &sphb->memspace, &sphb->iospace,
>>>>>> -                           PCI_DEVFN(0, 0), PCI_NUM_PINS, TYPE_PCI_BUS);
>>>>>> +                           PCI_DEVFN(0, 0), PCI_NUM_PINS,
>>>>>> +                           sphb->root_bus_type ? sphb->root_bus_type :
>>>>>> +                           TYPE_PCIE_BUS);
>>>>>  
>>>>> Shouldn't we ensure that sphb->root_bus_type is either TYPE_PCIE_BUS or
>>>>> TYPE_PCI_BUS ?
>>>>  
>>>> Yes, I think so.  In fact, I think it would be better to make the
>>>> property a boolean that just selects PCI-E, rather than this which
>>>> exposes qemu (semi-)internal type names on the comamnd line.
>>>  
>>> Sure, a "pcie-root" boolean property should do.
>>>  
>>> However this is not my main concern, I rather wonder if we have to have
>>> pci.0 when we pick PCIe for the root.
>>  
>> Right.
>>  
>> I've added Andrea Bologna to the CC list to get a libvirt perspective.
> 
> Thanks for doing so: changes such as this one can have quite
> an impact on the upper layers of the stack, so the earliest
> libvirt is involved in the discussion the better.
> 
> I'm going to go a step further and cross-post to libvir-list
> in order to give other libvirt contributors a chance to chime
> in too.
> 
>> Andrea,
>>  
>> To summarise the issue here:
>>     * As I've said before the PAPR spec kinda-sorta abstracts the
>>       difference between vanilla PCI and PCI-E
>>     * However, because within qemu we're declaring the bus as PCI that
>>       means some PCI-E devices aren't working right
>>     * In particular it means that PCI-E extended config space isn't
>>       available
>>  
>> The proposal is to change (on newer machine types) the spapr PHB code
>> to declare a PCI-E bus instead.  AIUI this still won't make the root
>> complex guest visible (which it's not supposed to be under PAPR), and
>> the guest shouldn't see a difference in most cases - it will still see
>> the PAPR abstracted PCIish bus, but will now be able to get extended
>> config space.
>>  
>> The possible problem from a libvirt perspective is that doing this in
>> the simplest way in qemu would change the name of the default bus from
>> pci.0 to pcie.0.  We have two suggested ways to mitigate this:
>>     1) Automatically create a PCI-E to PCI bridge, so that new machine
>>        types will have both a pcie.0 and pci.0 bus
>>     2) Force the name of the bus to be pci.0, even though it's treated
>>        as PCI-E in other ways.
>>  
>> We're trying to work out exactly what will and won't cause trouble for
>> libvirt.
> 
> Option 2) is definitely a no-no, as we don't want to be piling
> up even more hacks and architecture-specific code: the PCI
> Express Root Bus should be called pcie.0, just as it is on q35
> and mach-virt machine types.
> 
> Option 1) doesn't look too bad, but devices that are added
> automatically by QEMU are an issue since we need to hardcode
> knowledge of them into libvirt if we want the rest of the PCI
> address allocation logic to handle them correctly.
> 
> Moreover libvirt now has the ability of building a legacy PCI
> topology without user intervention, if needed to plug in
> legacy devices, on machines that have a PCI Express Root Bus,
> which makes the additional bridge fully redundant...
> 
> ... or at least it would, if we actually had a proper
> PCIe-to-PCI bridge; AFAIK, though, the closest we have is the
> i82801b11-bridge that is Intel-specific despite having so far
> been abused as a generic PCIe-to-PCI bridge. I'm not even
> sure whether it would work at all on ppc64.
> 
> Moving from legacy PCI to PCI Express would definitely be an
> improvement, in my opinion. As mentioned, that's already the
> case for at least two other architectures, so the more we can
> standardize on that, the better.
> 
> That said, considering that a big part of the PCI address
> allocation logic is based off whether the specific machine
> type exposes a legay PCI Root Bus or a PCI Express Root Bus,
> libvirt will need a way to be able to tell which one is which.
> 
> Version checks are pretty much out of the question, as they
> fail as soon as downstream releases enter the picture. A
> few ways we could deal with the situation:
> 
>   1) switch to PCI Express on newer machine types, and
>      expose some sort of capability through QMP so that
>      libvirt can know about the switch
> 
>   2) switch between legacy PCI and PCI Express based on a
>      machine type option. libvirt would be able to find out
>      whether the option is available or not, and default to
>      either
> 
>        <controller type='pci' model='pci-root'/>
> 
>      or
> 
>        <controller type='pci' model='pcie-root'/>
> 
>      based on that. In order to support multiple PHBs
>      properly, those would have to be switchable with an
>      option as well
> 
>   3) create an entirely new machine type, eg. pseries-pcie
>      or whatever someone with the ability to come up with
>      decent names can suggest :) That would make ppc64
>      similar to x86, where i440fx and q35 have different
>      root buses. libvirt would learn about the new machine
>      type, know that it has a PCI Express Root Bus, and
>      behave accordingly
> 
> Option 1) would break horribly with existing libvirt
> versions, and so would Option 2) if we default to using


How exactly 1) will break libvirt? Migrating from pseries-2.7 to
pseries-2.8 does not work anyway, and machines are allowed to behave
different from version to version, what distinct difference will using
"pseries-pcie-X.Y" make? I believe after we introduced the very first
pseries-pcie-X.Y, we will just stop adding new pseries-X.Y.



> PCI Express. Option 2) with default to legacy PCI and
> option 3) would work just fine with existing libvirt
> versions AFAICT, but wouldn't of course expose the new
> capabilities.
> 
> Option 3) is probably the one that will be less confusing
> to users; we might even decide to take the chance and fix
> other small annoyances with the current pseries machine
> type, if there's any. On the other hand, it might very well
> be considered to be too big a hammer for such a small nail.



-- 
Alexey

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list



[Index of Archives]     [Virt Tools]     [Libvirt Users]     [Lib OS Info]     [Fedora Users]     [Fedora Desktop]     [Fedora SELinux]     [Big List of Linux Books]     [Yosemite News]     [KDE Users]     [Fedora Tools]
  Powered by Linux