On Tue, 31 Dec 2019 16:12:36 +0000 Haiyang Zhang <haiyangz@xxxxxxxxxxxxx> wrote: > > -----Original Message----- > > From: Roman Kagan <rkagan@xxxxxxxxxxxxx> > > Sent: Tuesday, December 31, 2019 6:35 AM > > To: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx> > > Cc: sashal@xxxxxxxxxx; linux-hyperv@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; > > KY Srinivasan <kys@xxxxxxxxxxxxx>; Stephen Hemminger > > <sthemmin@xxxxxxxxxxxxx>; olaf@xxxxxxxxx; vkuznets > > <vkuznets@xxxxxxxxxx>; davem@xxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx > > Subject: Re: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus > > offer sequence and use async probe > > > > On Mon, Dec 30, 2019 at 12:13:34PM -0800, Haiyang Zhang wrote: > > > The dev_num field in vmbus channel structure is assigned to the first > > > available number when the channel is offered. So netvsc driver uses it > > > for NIC naming based on channel offer sequence. Now re-enable the > > > async probing mode for faster probing. > > > > > > Signed-off-by: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx> > > > --- > > > drivers/net/hyperv/netvsc_drv.c | 18 +++++++++++++++--- > > > 1 file changed, 15 insertions(+), 3 deletions(-) > > > > > > diff --git a/drivers/net/hyperv/netvsc_drv.c > > > b/drivers/net/hyperv/netvsc_drv.c index f3f9eb8..39c412f 100644 > > > --- a/drivers/net/hyperv/netvsc_drv.c > > > +++ b/drivers/net/hyperv/netvsc_drv.c > > > @@ -2267,10 +2267,14 @@ static int netvsc_probe(struct hv_device *dev, > > > struct net_device_context *net_device_ctx; > > > struct netvsc_device_info *device_info = NULL; > > > struct netvsc_device *nvdev; > > > + char name[IFNAMSIZ]; > > > int ret = -ENOMEM; > > > > > > - net = alloc_etherdev_mq(sizeof(struct net_device_context), > > > - VRSS_CHANNEL_MAX); > > > + snprintf(name, IFNAMSIZ, "eth%d", dev->channel->dev_num); > > > > How is this supposed to work when there are other ethernet device types on the > > system, which may claim the same device names? > > > > > + net = alloc_netdev_mqs(sizeof(struct net_device_context), name, > > > + NET_NAME_ENUM, ether_setup, > > > + VRSS_CHANNEL_MAX, VRSS_CHANNEL_MAX); > > > + > > > if (!net) > > > goto no_net; > > > > > > @@ -2355,6 +2359,14 @@ static int netvsc_probe(struct hv_device *dev, > > > net->max_mtu = ETH_DATA_LEN; > > > > > > ret = register_netdevice(net); > > > + > > > + if (ret == -EEXIST) { > > > + pr_info("NIC name %s exists, request another name.\n", > > > + net->name); > > > + strlcpy(net->name, "eth%d", IFNAMSIZ); > > > + ret = register_netdevice(net); > > > + } > > > > IOW you want the device naming to be predictable, but don't guarantee this? > > > > I think the problem this patchset is trying to solve is much better solved with a > > udev rule, similar to how it's done for PCI net devices. > > And IMO the primary channel number, being a device's "hardware" > > property, is more suited to be used in the device name, than this completely > > ephemeral device number. > > The vmbus number can be affected by other types of devices and/or subchannel > offerings. They are not stable either. That's why this patch set keeps track of the > offering sequence within the same device type in a new variable "dev_num". > > As in my earlier email, to avoid impact by other types of NICs, we should put them > into different naming formats, like "vf*", "enP*", etc. And yes, these can be done in > udev. > > But for netvsc (synthetic) NICs, we still want the default naming format "eth*". And > the variable "dev_num" gives them the basis for stable naming with Async probing. > > Thanks, > - Haiyang > The primary requirements for network naming are: 1. Network names must be repeatable on each boot. This was the original problem that PCI devices discovered back years ago when parallel probing was enabled. 2. Network names must be predictable. If new VM is created, the names should match a similar VM config. 3. Names must be persistent. If a NIC is added or deleted, the existing names must not change. The other things which are important (and this proposal breaks): 1. Don't break it principle: an existing VM should not suddenly get interfaces renamed if kernel is upgraded. A corrallary is that a lot of current userspace code expects eth0. It doesn't look like first interface would be guaranteed to be eth0. 2. No snowflakes principle: a device driver should follow the current practice of other devices. For netvsc, this means VMBus should act like PCI as much as possible. Is there another driver doing this already? 3. Userspace policy principle: Every distribution has its own policy by now. The solution must make netvsc work reliably on Redhat (udev), Ubuntu (netplan), SuSE (Yast) doing something in the kernel violates #2. My recommendation would be to take a multi-phase approach: 1. Expose persistent value in sysfs now. 2. Work with udev/netplan/... to use that value. 3. Make parallel VMBus probing an option. So that when distributions have picked up the udev changes they can enable parallel probe. Some will be quick to adopt and the enterprise laggards can get to it when they feel the heat. Long term wish list (requires host side changes): 1. The interface index could be a host side property; the host networking already has the virtual device table and it is persistent. 2. The Azure NIC name should be visible as a property in guest. Then userspace could do rename based on that property. Having multiple disconnected names is leads to confusion.