> -----Original Message----- > From: linux-hyperv-owner@xxxxxxxxxxxxxxx <linux-hyperv- > owner@xxxxxxxxxxxxxxx> On Behalf Of Stephen Hemminger > Sent: Tuesday, December 31, 2019 12:36 PM > To: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx> > Cc: Roman Kagan <rkagan@xxxxxxxxxxxxx>; sashal@xxxxxxxxxx; linux- > hyperv@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; KY Srinivasan > <kys@xxxxxxxxxxxxx>; Stephen Hemminger <sthemmin@xxxxxxxxxxxxx>; > olaf@xxxxxxxxx; vkuznets <vkuznets@xxxxxxxxxx>; davem@xxxxxxxxxxxxx > Subject: Re: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus > offer sequence and use async probe > > On Tue, 31 Dec 2019 16:12:36 +0000 > Haiyang Zhang <haiyangz@xxxxxxxxxxxxx> wrote: > > > > -----Original Message----- > > > From: Roman Kagan <rkagan@xxxxxxxxxxxxx> > > > Sent: Tuesday, December 31, 2019 6:35 AM > > > To: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx> > > > Cc: sashal@xxxxxxxxxx; linux-hyperv@xxxxxxxxxxxxxxx; > netdev@xxxxxxxxxxxxxxx; > > > KY Srinivasan <kys@xxxxxxxxxxxxx>; Stephen Hemminger > > > <sthemmin@xxxxxxxxxxxxx>; olaf@xxxxxxxxx; vkuznets > > > <vkuznets@xxxxxxxxxx>; davem@xxxxxxxxxxxxx; linux- > kernel@xxxxxxxxxxxxxxx > > > Subject: Re: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on > vmbus > > > offer sequence and use async probe > > > > > > On Mon, Dec 30, 2019 at 12:13:34PM -0800, Haiyang Zhang wrote: > > > > The dev_num field in vmbus channel structure is assigned to the first > > > > available number when the channel is offered. So netvsc driver uses it > > > > for NIC naming based on channel offer sequence. Now re-enable the > > > > async probing mode for faster probing. > > > > > > > > Signed-off-by: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx> > > > > --- > > > > drivers/net/hyperv/netvsc_drv.c | 18 +++++++++++++++--- > > > > 1 file changed, 15 insertions(+), 3 deletions(-) > > > > > > > > diff --git a/drivers/net/hyperv/netvsc_drv.c > > > > b/drivers/net/hyperv/netvsc_drv.c index f3f9eb8..39c412f 100644 > > > > --- a/drivers/net/hyperv/netvsc_drv.c > > > > +++ b/drivers/net/hyperv/netvsc_drv.c > > > > @@ -2267,10 +2267,14 @@ static int netvsc_probe(struct hv_device *dev, > > > > struct net_device_context *net_device_ctx; > > > > struct netvsc_device_info *device_info = NULL; > > > > struct netvsc_device *nvdev; > > > > + char name[IFNAMSIZ]; > > > > int ret = -ENOMEM; > > > > > > > > - net = alloc_etherdev_mq(sizeof(struct net_device_context), > > > > - VRSS_CHANNEL_MAX); > > > > + snprintf(name, IFNAMSIZ, "eth%d", dev->channel->dev_num); > > > > > > How is this supposed to work when there are other ethernet device types on > the > > > system, which may claim the same device names? > > > > > > > + net = alloc_netdev_mqs(sizeof(struct net_device_context), name, > > > > + NET_NAME_ENUM, ether_setup, > > > > + VRSS_CHANNEL_MAX, VRSS_CHANNEL_MAX); > > > > + > > > > if (!net) > > > > goto no_net; > > > > > > > > @@ -2355,6 +2359,14 @@ static int netvsc_probe(struct hv_device *dev, > > > > net->max_mtu = ETH_DATA_LEN; > > > > > > > > ret = register_netdevice(net); > > > > + > > > > + if (ret == -EEXIST) { > > > > + pr_info("NIC name %s exists, request another name.\n", > > > > + net->name); > > > > + strlcpy(net->name, "eth%d", IFNAMSIZ); > > > > + ret = register_netdevice(net); > > > > + } > > > > > > IOW you want the device naming to be predictable, but don't guarantee this? > > > > > > I think the problem this patchset is trying to solve is much better solved with > a > > > udev rule, similar to how it's done for PCI net devices. > > > And IMO the primary channel number, being a device's "hardware" > > > property, is more suited to be used in the device name, than this completely > > > ephemeral device number. > > > > The vmbus number can be affected by other types of devices and/or > subchannel > > offerings. They are not stable either. That's why this patch set keeps track of > the > > offering sequence within the same device type in a new variable "dev_num". > > > > As in my earlier email, to avoid impact by other types of NICs, we should put > them > > into different naming formats, like "vf*", "enP*", etc. And yes, these can be > done in > > udev. > > > > But for netvsc (synthetic) NICs, we still want the default naming format "eth*". > And > > the variable "dev_num" gives them the basis for stable naming with Async > probing. > > > > Thanks, > > - Haiyang > > > > The primary requirements for network naming are: > 1. Network names must be repeatable on each boot. This was the original > problem > that PCI devices discovered back years ago when parallel probing was > enabled. > 2. Network names must be predictable. If new VM is created, the names > should > match a similar VM config. > 3. Names must be persistent. If a NIC is added or deleted, the existing names > must not change. > > The other things which are important (and this proposal breaks): > 1. Don't break it principle: an existing VM should not suddenly get interfaces > renamed if kernel is upgraded. A corrallary is that a lot of current userspace > code expects eth0. It doesn't look like first interface would be guaranteed > to be eth0. > > 2. No snowflakes principle: a device driver should follow the current practice > of other devices. For netvsc, this means VMBus should act like PCI as much > as possible. Is there another driver doing this already? > > 3. Userspace policy principle: Every distribution has its own policy by now. > The solution must make netvsc work reliably on Redhat (udev), Ubuntu > (netplan), SuSE (Yast) > doing something in the kernel violates #2. > > My recommendation would be to take a multi-phase approach: > 1. Expose persistent value in sysfs now. > 2. Work with udev/netplan/... to use that value. > 3. Make parallel VMBus probing an option. So that when distributions have > picked up > the udev changes they can enable parallel probe. Some will be quick to > adopt > and the enterprise laggards can get to it when they feel the heat. > > Long term wish list (requires host side changes): > 1. The interface index could be a host side property; the host networking > already has the virtual device table and it is persistent. > 2. The Azure NIC name should be visible as a property in guest. > Then userspace could do rename based on that property. > Having multiple disconnected names is leads to confusion. Thank you for the detailed recommendations! I will do the step 1. Expose persistent value in sysfs now. And work on other changes in the future. Thanks, - Haiyang