RE: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus offer sequence and use async probe

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> -----Original Message-----
> From: linux-hyperv-owner@xxxxxxxxxxxxxxx <linux-hyperv-
> owner@xxxxxxxxxxxxxxx> On Behalf Of Stephen Hemminger
> Sent: Tuesday, December 31, 2019 12:36 PM
> To: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>
> Cc: Roman Kagan <rkagan@xxxxxxxxxxxxx>; sashal@xxxxxxxxxx; linux-
> hyperv@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; KY Srinivasan
> <kys@xxxxxxxxxxxxx>; Stephen Hemminger <sthemmin@xxxxxxxxxxxxx>;
> olaf@xxxxxxxxx; vkuznets <vkuznets@xxxxxxxxxx>; davem@xxxxxxxxxxxxx
> Subject: Re: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus
> offer sequence and use async probe
> 
> On Tue, 31 Dec 2019 16:12:36 +0000
> Haiyang Zhang <haiyangz@xxxxxxxxxxxxx> wrote:
> 
> > > -----Original Message-----
> > > From: Roman Kagan <rkagan@xxxxxxxxxxxxx>
> > > Sent: Tuesday, December 31, 2019 6:35 AM
> > > To: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>
> > > Cc: sashal@xxxxxxxxxx; linux-hyperv@xxxxxxxxxxxxxxx;
> netdev@xxxxxxxxxxxxxxx;
> > > KY Srinivasan <kys@xxxxxxxxxxxxx>; Stephen Hemminger
> > > <sthemmin@xxxxxxxxxxxxx>; olaf@xxxxxxxxx; vkuznets
> > > <vkuznets@xxxxxxxxxx>; davem@xxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx
> > > Subject: Re: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on
> vmbus
> > > offer sequence and use async probe
> > >
> > > On Mon, Dec 30, 2019 at 12:13:34PM -0800, Haiyang Zhang wrote:
> > > > The dev_num field in vmbus channel structure is assigned to the first
> > > > available number when the channel is offered. So netvsc driver uses it
> > > > for NIC naming based on channel offer sequence. Now re-enable the
> > > > async probing mode for faster probing.
> > > >
> > > > Signed-off-by: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>
> > > > ---
> > > >  drivers/net/hyperv/netvsc_drv.c | 18 +++++++++++++++---
> > > >  1 file changed, 15 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/net/hyperv/netvsc_drv.c
> > > > b/drivers/net/hyperv/netvsc_drv.c index f3f9eb8..39c412f 100644
> > > > --- a/drivers/net/hyperv/netvsc_drv.c
> > > > +++ b/drivers/net/hyperv/netvsc_drv.c
> > > > @@ -2267,10 +2267,14 @@ static int netvsc_probe(struct hv_device *dev,
> > > >  	struct net_device_context *net_device_ctx;
> > > >  	struct netvsc_device_info *device_info = NULL;
> > > >  	struct netvsc_device *nvdev;
> > > > +	char name[IFNAMSIZ];
> > > >  	int ret = -ENOMEM;
> > > >
> > > > -	net = alloc_etherdev_mq(sizeof(struct net_device_context),
> > > > -				VRSS_CHANNEL_MAX);
> > > > +	snprintf(name, IFNAMSIZ, "eth%d", dev->channel->dev_num);
> > >
> > > How is this supposed to work when there are other ethernet device types on
> the
> > > system, which may claim the same device names?
> > >
> > > > +	net = alloc_netdev_mqs(sizeof(struct net_device_context), name,
> > > > +			       NET_NAME_ENUM, ether_setup,
> > > > +			       VRSS_CHANNEL_MAX, VRSS_CHANNEL_MAX);
> > > > +
> > > >  	if (!net)
> > > >  		goto no_net;
> > > >
> > > > @@ -2355,6 +2359,14 @@ static int netvsc_probe(struct hv_device *dev,
> > > >  		net->max_mtu = ETH_DATA_LEN;
> > > >
> > > >  	ret = register_netdevice(net);
> > > > +
> > > > +	if (ret == -EEXIST) {
> > > > +		pr_info("NIC name %s exists, request another name.\n",
> > > > +			net->name);
> > > > +		strlcpy(net->name, "eth%d", IFNAMSIZ);
> > > > +		ret = register_netdevice(net);
> > > > +	}
> > >
> > > IOW you want the device naming to be predictable, but don't guarantee this?
> > >
> > > I think the problem this patchset is trying to solve is much better solved with
> a
> > > udev rule, similar to how it's done for PCI net devices.
> > > And IMO the primary channel number, being a device's "hardware"
> > > property, is more suited to be used in the device name, than this completely
> > > ephemeral device number.
> >
> > The vmbus number can be affected by other types of devices and/or
> subchannel
> > offerings. They are not stable either. That's why this patch set keeps track of
> the
> > offering sequence within the same device type in a new variable "dev_num".
> >
> > As in my earlier email, to avoid impact by other types of NICs, we should put
> them
> > into different naming formats, like "vf*", "enP*", etc. And yes, these can be
> done in
> > udev.
> >
> > But for netvsc (synthetic) NICs, we still want the default naming format "eth*".
> And
> > the variable "dev_num" gives them the basis for stable naming with Async
> probing.
> >
> > Thanks,
> > - Haiyang
> >
> 
> The primary requirements for network naming are:
>   1. Network names must be repeatable on each boot. This was the original
> problem
>      that PCI devices discovered back years ago when parallel probing was
> enabled.
>   2. Network names must be predictable. If new VM is created, the names
> should
>      match a similar VM config.
>   3. Names must be persistent. If a NIC is added or deleted, the existing names
>      must not change.
> 
> The other things which are important (and this proposal breaks):
>   1. Don't break it principle: an existing VM should not suddenly get interfaces
>      renamed if kernel is upgraded. A corrallary is that a lot of current userspace
>      code expects eth0. It doesn't look like first interface would be guaranteed
>      to be eth0.
> 
>   2. No snowflakes principle: a device driver should follow the current practice
>      of other devices. For netvsc, this means VMBus should act like PCI as much
>      as possible. Is there another driver doing this already?
> 
>   3. Userspace policy principle: Every distribution has its own policy by now.
>      The solution must make netvsc work reliably on Redhat (udev), Ubuntu
> (netplan), SuSE (Yast)
>      doing something in the kernel violates #2.
> 
> My recommendation would be to take a multi-phase approach:
>   1. Expose persistent value in sysfs now.
>   2. Work with udev/netplan/... to use that value.
>   3. Make parallel VMBus probing an option. So that when distributions have
> picked up
>      the udev changes they can enable parallel probe. Some will be quick to
> adopt
>      and the enterprise laggards can get to it when they feel the heat.
> 
> Long term wish list (requires host side changes):
>    1. The interface index could be a host side property; the host networking
>        already has the virtual device table and it is persistent.
>    2. The Azure NIC name should be visible as a property in guest.
>       Then userspace could do rename based on that property.
>       Having multiple disconnected names is leads to confusion.

Thank you for the detailed recommendations!
I will do the step 1. Expose persistent value in sysfs now.
And work on other changes in the future.

Thanks,
- Haiyang




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux