Re: Predictable and consistent net interface naming in guests

Laine Stump <laine@xxxxxxxxxx> · Thu, 8 Dec 2022 11:44:14 -0500

On 12/8/22 11:15 AM, Julia Suvorova wrote:
On Thu, Nov 3, 2022 at 9:26 AM Amnon Ilan <ailan@xxxxxxxxxx> wrote:

On Thu, Nov 3, 2022 at 12:13 AM Amnon Ilan <ailan@xxxxxxxxxx> wrote:

On Wed, Nov 2, 2022 at 6:47 PM Laine Stump <laine@xxxxxxxxxx> wrote:

On 11/2/22 11:58 AM, Igor Mammedov wrote:
On Wed, 2 Nov 2022 15:20:39 +0000
Daniel P. Berrangé <berrange@xxxxxxxxxx> wrote:

On Wed, Nov 02, 2022 at 04:08:43PM +0100, Igor Mammedov wrote:
On Wed, 2 Nov 2022 10:43:10 -0400
Laine Stump <laine@xxxxxxxxxx> wrote:

On 11/1/22 7:46 AM, Igor Mammedov wrote:
On Mon, 31 Oct 2022 14:48:54 +0000
Daniel P. Berrangé <berrange@xxxxxxxxxx> wrote:

On Mon, Oct 31, 2022 at 04:32:27PM +0200, Edward Haas wrote:
Hi Igor and Laine,

I would like to revive a two-year-old discussion [1] about consistent network
interfaces in the guest.

That discussion mentioned that a guest PCI address may change in two cases:
- The PCI topology changes.
- The machine type changes.

Usually, the machine type is not expected to change, especially if one
wants to allow migrations between nodes.
I would hope to argue this should not be problematic in practice, because
guest images would be made per a specific machine type.

Regarding the PCI topology, I am not sure I understand what changes
need to occur to the domxml for a defined guest PCI address to change.
The only thing I can think of is a scenario where hotplug/unplug is used,
but even then I would expect existing devices to preserve their PCI address
and the plugged/unplugged device to have a reserved address managed by the one
acting on it (the management system).

Could you please help clarify in which scenarios the PCI topology can cause
a mess to the naming of interfaces in the guest?

Are there any plans to add the acpi_index support?

This was implemented a year & a half ago

     https://libvirt.org/formatdomain.html#network-interfaces

though due to QEMU limitations this only works for the old
i440fx chipset, not Q35 yet.

Q35 should work partially too. In its case acpi-index support
is limited to hotplug enabled root-ports and PCIe-PCI bridges.
One also has to enable ACPI PCI hotplug (it's enabled by default
on recent machine types) for it to work (i.e. it's not supported
in native PCIe hotplug mode).

So if mgmt can put nics on root-ports/bridges, then acpi-index
should just work on Q35 as well.

With only a few exceptions (e.g. the first ich9 audio device, which is
placed directly on the root bus at 00:1B.0 because that is where the
ich9 audio device is located on actual Q35 hardware), libvirt will
automatically put all PCI devices (including network interfaces) on a
pcie-root-port.

After seeing reports that "acpi index doesn't work with Q35
machinetypes" I just assumed that was correct and didn't try it. But
after seeing the "should work partially" statement above, I tried it
just now and an <interface> of a Q35 guest that had its PCI address
auto-assigned by libvirt (and so was placed on a pcie-root-port), and
had <acpi index='4'/> was given the name "eno4". So what exactly is it
that *doesn't* work?
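
For anyone who wants to repeat that test, here is a minimal sketch that builds the kind of <interface> fragment described above. Only the <acpi index='4'/> element and the resulting "eno4" name come from this thread; the network name "default", the virtio model, and the helper names are assumptions made up for illustration, and the eno<N> mapping assumes the guest uses the usual onboard-index naming policy.

```python
import xml.etree.ElementTree as ET


def interface_with_acpi_index(network: str, index: int) -> str:
    """Return a libvirt <interface> fragment carrying <acpi index='N'/>.

    Hypothetical helper: the network name and the virtio model are
    illustrative choices, not something stated in the thread.
    """
    if index < 1:
        raise ValueError("this sketch assumes a positive acpi index")
    iface = ET.Element("interface", {"type": "network"})
    ET.SubElement(iface, "source", {"network": network})
    ET.SubElement(iface, "model", {"type": "virtio"})
    # No <address> element: libvirt is left to auto-assign the PCI address,
    # which on Q35 places the device on a pcie-root-port.
    ET.SubElement(iface, "acpi", {"index": str(index)})
    return ET.tostring(iface, encoding="unicode")


def expected_guest_name(index: int) -> str:
    """Name observed in the test above (index 4 -> eno4).

    Assumes the guest's naming policy maps the ACPI index to eno<N>.
    """
    return f"eno{index}"


xml_fragment = interface_with_acpi_index("default", 4)
print(xml_fragment)
print(expected_guest_name(4))
```

The fragment would go inside <devices> of the domain XML; whether the guest really ends up with the expected name depends on the conditions listed below.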

  From QEMU side:
acpi-index requires:
   1. acpi pci hotplug enabled (which is default on relatively new q35 machine types)
   2. hotpluggable pci bus (root-port, various pci bridges)
   3. NIC can be cold or hotplugged, guest should pick up acpi-index of the device
      currently plugged into slot
what doesn't work:
   1. device attached to host-bridge directly  (work in progress)
         (q35)
   2. devices attached to any PXB port and any hierarchy hanging off it (there are no plans to make it work)
         (q35, pc)
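
The two lists above can be written down as a small check. This is only a sketch of the rules as stated in this mail, not QEMU or libvirt code: the function, its parameters, and the bus labels are hypothetical, and treating a device directly on the host bridge of a "pc" machine as working is an assumption inferred from the "(q35)" annotation on item 1.

```python
# Illustrative labels for hotpluggable buses (root-port, various pci bridges).
HOTPLUGGABLE_BUSES = {"pcie-root-port", "pcie-to-pci-bridge", "pci-bridge"}


def acpi_index_expected(machine: str, acpi_pci_hotplug: bool,
                        bus: str, behind_pxb: bool = False) -> bool:
    """Return True if acpi-index is expected to be honoured.

    machine: "q35" or "pc"
    acpi_pci_hotplug: whether ACPI PCI hotplug is enabled
    bus: "host-bridge" or one of HOTPLUGGABLE_BUSES
    behind_pxb: device sits on a PXB port or anywhere below one
    """
    if behind_pxb:
        # No ACPI hierarchy is generated for PXBs (q35 and pc alike).
        return False
    if not acpi_pci_hotplug:
        # Requirement 1: ACPI PCI hotplug must be enabled.
        return False
    if bus == "host-bridge":
        # Listed as not working on q35 (work in progress).
        # Assumption: fine on pc, where acpi-index was first supported.
        return machine == "pc"
    # Requirement 2: the device must sit on a hotpluggable bus.
    return bus in HOTPLUGGABLE_BUSES


print(acpi_index_expected("q35", True, "pcie-root-port"))
print(acpi_index_expected("q35", True, "host-bridge"))
print(acpi_index_expected("pc", True, "pci-bridge", behind_pxb=True))
```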

I'd say this is still relatively important, as the PXBs are needed
to create a NUMA placement aware topology for guests, and I'd say it
is undesirable to lose acpi-index if a guest is updated to be NUMA
aware, or if a guest image can be deployed in either normal or NUMA
aware setups.

it's not only Q35 but also PC.
We basically do not generate ACPI hierarchy for PXBs at all,
so neither ACPI hotplug nor the dependent acpi-index would work.
It's been so for many years and no one has asked to enable
ACPI hotplug on them so far.

I'm guessing (based on absolutely 0 information :-)) that there would be
more demand for acpi-index (and the resulting predictable interface
names) than for acpi hotplug for NUMA-aware setup.

My guess is similar, but it is still desirable to have both (i.e. support ACPI-indexing/hotplug in NUMA-aware setups).
Adding @Peter Xu to check if our setups for SAP require NUMA-aware topology

How big of a project would it be to enable ACPI-indexing/hotplug with PXB?

Why would you need to add acpi hotplug on pxb?

Adding +Julia Suvorova and +Tsirkin, Michael to help answer this question

Thanks,
Amnon

Since native PCIe hotplug was improved, we can still compromise by switching to native PCIe hotplug when a PXB is required (and give up fixed indexing).

Native hotplug works on pxb as is, without disabling acpi hotplug.

Are you saying you can add an acpi-index to a device plugged into a pxb, 
that index will be recognized (and used to name the device), but it will 
still do native hotplug?

That sounds okay to me, since it ticks all the functional marks 
(hotplug, consistent device names, NUMA-aware). It's possible there are 
some things I'm misunderstanding or haven't thought of though...

Thanks,
Amnon

Anyway, it sounds like (*within the confines of how libvirt constructs
the PCI topology*) we actually have functional parity of acpi-index
between 440fx and Q35.
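
One way to check that parity from inside a guest is to look at what the kernel exposes for each NIC. A minimal sketch, assuming a Linux guest whose kernel provides the acpi_index attribute on PCI devices in sysfs; the helper name and the overridable sysfs root are hypothetical and only there to keep the example testable.

```python
from pathlib import Path
from typing import Optional


def read_acpi_index(ifname: str,
                    sysfs_net: Path = Path("/sys/class/net")) -> Optional[int]:
    """Return the ACPI index the guest kernel reports for a NIC, or None.

    Assumes /sys/class/net/<ifname>/device/acpi_index exists when the
    firmware handed an index to the device; older kernels lack the attribute.
    """
    attr = sysfs_net / ifname / "device" / "acpi_index"
    try:
        return int(attr.read_text().strip())
    except (OSError, ValueError):
        # Missing interface, missing attribute, or unparsable content.
        return None


# On a guest configured with <acpi index='4'/> this would be expected
# to print 4; anywhere else it prints None.
print(read_acpi_index("eno4"))
```

Running the same check on a 440fx guest and a Q35 guest built from the same image should show whether both received the index.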