On Thu, Mar 21, 2019 at 03:04:37PM +0200, Liran Alon wrote:
> 
> 
> > On 21 Mar 2019, at 14:57, Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
> > 
> > On Thu, Mar 21, 2019 at 02:47:50PM +0200, Liran Alon wrote:
> >> 
> >> 
> >>> On 21 Mar 2019, at 14:37, Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
> >>> 
> >>> On Thu, Mar 21, 2019 at 12:07:57PM +0200, Liran Alon wrote:
> >>>>>>>> 2) It brings non-intuitive customer experience. For example, a customer may attempt to analyse connectivity issue by checking the connectivity
> >>>>>>>> on a net-failover slave (e.g. the VF) but will see no connectivity when in-fact checking the connectivity on the net-failover master netdev shows correct connectivity.
> >>>>>>>> 
> >>>>>>>> The set of changes I vision to fix our issues are:
> >>>>>>>> 1) Hide net-failover slaves in a different netns created and managed by the kernel. But that user can enter to it and manage the netdevs there if wishes to do so explicitly.
> >>>>>>>> (E.g. Configure the net-failover VF slave in some special way).
> >>>>>>>> 2) Match the virtio-net and the VF based on a PV attribute instead of MAC. (Similar to as done in NetVSC). E.g. Provide a virtio-net interface to get PCI slot where the matching VF will be hot-plugged by hypervisor.
> >>>>>>>> 3) Have an explicit virtio-net control message to command hypervisor to switch data-path from virtio-net to VF and vice-versa. Instead of relying on intercepting the PCI master enable-bit
> >>>>>>>> as an indicator on when VF is about to be set up. (Similar to as done in NetVSC).
> >>>>>>>> 
> >>>>>>>> Is there any clear issue we see regarding the above suggestion?
> >>>>>>>> 
> >>>>>>>> -Liran
> >>>>>>> 
> >>>>>>> The issue would be this: how do we avoid conflicting with namespaces
> >>>>>>> created by users?
> >>>>>> 
> >>>>>> This is kinda controversial, but maybe separate netns names into 2 groups: hidden and normal.
> >>>>>> To reference a hidden netns, you need to do it explicitly.
> >>>>>> Hidden and normal netns names can collide as they will be maintained in different namespaces (Yes I’m overloading the term namespace here…).
> >>>>> 
> >>>>> Maybe it's an unnamed namespace. Hidden until userspace gives it a name?
> >>>> 
> >>>> This is also a good idea that will solve the issue. Yes.
> >>>> 
> >>>>> 
> >>>>>> Does this seems reasonable?
> >>>>>> 
> >>>>>> -Liran
> >>>>> 
> >>>>> Reasonable I'd say yes, easy to implement probably no. But maybe I
> >>>>> missed a trick or two.
> >>>> 
> >>>> BTW, from a practical point of view, I think that even until we figure out a solution on how to implement this,
> >>>> it was better to create an kernel auto-generated name (e.g. “kernel_net_failover_slaves”)
> >>>> that will break only userspace workloads that by a very rare-chance have a netns that collides with this then
> >>>> the breakage we have today for the various userspace components.
> >>>> 
> >>>> -Liran
> >>> 
> >>> It seems quite easy to supply that as a module parameter. Do we need two
> >>> namespaces though? Won't some userspace still be confused by the two
> >>> slaves sharing the MAC address?
> >> 
> >> That’s one reasonable option.
> >> Another one is that we will indeed change the mechanism by which we determine a VF should be bonded with a virtio-net device.
> >> i.e. Expose a new virtio-net property that specify the PCI slot of the VF to be bonded with.
> >> 
> >> The second seems cleaner but I don’t have a strong opinion on this. Both seem reasonable to me and your suggestion is faster to implement from current state of things.
> >> 
> >> -Liran
> > 
> > OK. Now what happens if master is moved to another namespace? Do we need
> > to move the slaves too?
> 
> No. Why would we move the slaves?

The reason we have 3 device model at all is so users can fine tune the
slaves. I don't see why this applies to the root namespace but not
a container. If it has access to failover it should have access to slaves.

> The whole point is to make most customer ignore the net-failover slaves and remain them “hidden” in their dedicated netns.

So that makes the common case easy. That is good. My worry is it might make
some uncommon cases impossible.

> We won’t prevent customer from explicitly moving the net-failover slaves out of this netns, but we will not move them out of there automatically.
> 
> > 
> > Also siwei's patch is then kind of extraneous right?
> > Attempts to rename a slave will now fail as it's in a namespace…
> 
> I’m not sure actually. Isn't udev/systemd netns-aware?
> I would expect it to be able to provide names also to netdevs in netns different than default netns.

I think most people move devices after they are renamed.

> If that’s the case, Si-Wei patch to be able to rename a net-failover slave when it is already open is still required. As the race-condition still exists.
> 
> -Liran
> 
> > 
> >>> 
> >>> -- 
> >>> MST
_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization