RE: [net-next-2.6 PATCH 0/6 v4] macvlan: MAC Address filtering support for passthru mode

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> -----Original Message-----
> From: Chris Wright [mailto:chrisw@xxxxxxxxxx]
> Sent: Wednesday, November 30, 2011 3:01 PM
> To: Ben Hutchings
> Cc: Chris Wright; Rose, Gregory V; Roopa Prabhu; netdev@xxxxxxxxxxxxxxx;
> davem@xxxxxxxxxxxxx; sri@xxxxxxxxxx; dragos.tatulea@xxxxxxxxx;
> kvm@xxxxxxxxxxxxxxx; arnd@xxxxxxxx; mst@xxxxxxxxxx; mchan@xxxxxxxxxxxx;
> dwang2@xxxxxxxxx; shemminger@xxxxxxxxxx; eric.dumazet@xxxxxxxxx;
> kaber@xxxxxxxxx; benve@xxxxxxxxx
> Subject: Re: [net-next-2.6 PATCH 0/6 v4] macvlan: MAC Address filtering
> support for passthru mode
> 
> * Ben Hutchings (bhutchings@xxxxxxxxxxxxxx) wrote:
> > On Wed, 2011-11-30 at 13:04 -0800, Chris Wright wrote:
> > > I agree that it's confusing.  Couldn't you simplify your ascii art
> > > (hopefully removing hw assumptions about receive processing, and
> > > completely ignoring vlans for the moment) to something like:
> > >
> > >              |RX
> > >              v
> > > +------------+-------------+
> > > |     +------+--------+    |
> > > |     | RX MAC filter |    |
> > > |     |and port select|    |
> > > |     +---------------+    |
> > > |            /|\           |
> > > |           / | \   match 2|
> > > |          /  v  \         |
> > > |         /match  \        |
> > > |        /  1 |    \       |
> > > |       /     |     \      |
> > > |match /      |      \     |
> > > |  0  /       |       \    |
> > > |    v        |        v   |
> > > |    |        |        |   |
> > > +----+--------+--------+---+
> > >      |        |        |
> > >     PF       VF 1     VF 2
> > >
> > > And there's an unclear number of ways to update "RX MAC filter and
> port
> > > select" table.
> > >
> > > 1) PF ndo_set_mac_addr
> > > I expect that to be implicit to match 0.
> > >
> > > 2) PF ndo_set_rx_mode
> > > Less clear, but I'd still expect these to implicitly match 0
> > >
> > > 3) PF ndo_set_vf_mac
> > > I expect these to be an explicit match to VF N (given the interface
> > > specifices which VF's MAC is being programmed).
> >
> > I'm not sure whether this is supposed to implicitly add to the MAC
> > filter or whether that has to be changed too.  That's the main
> > difference between my models (a) and (b).
> 
> I see now.  I wasn't entirely clear on the difference before.  It's also
> going to be hw specific.  I think (Intel folks can verify) that the
> Intel SR-IOV devices have a single global unicast exact match table,
> for example.
> 
> > There's also PF ndo_set_vf_vlan.
> 
> Right, although I had mentioned I was trying to limit just to MAC
> filtering to simplify.
> 
> > > 4) VF ndo_set_mac_addr
> > > This one may or may not be allowed (setting MAC+port if the VF is
> owned
> > > by a guest is likely not allowed), but would expect an implicit VF N.
> > >
> > > 5) VF ndo_set_rx_mode
> > > Same as 4) above.
> >
> > So this is where we are today.
> 
> Cool, good that we agree there.
> 
> > > 6) PF or VF? ndo_set_rx_filter_addr
> > > The new proposal, which has an explicit VF, although when it's VF_SELF
> > > I'm not clear if this is just the same as 5) above?
> > >
> > > Have I missed anything?
> >
> > Any physical port can be bridged to a mixture of guests with and without
> > their own VFs.  Packets sent from a guest with a VF to the address of a
> > guest without a VF need to be forwarded to the PF rather than the
> > physical port, but none of the drivers currently get to know about those
> > addresses.
> 
> To clarify, do you mean something like this?
> 
>        physical port
>              |
> +------------+------------+
> |         +-----+         |
> |         | VEB |         |
> |         +-----+         |
> |        /   |   \        |
> |       /    |    \       |
> |      /     |     \      |
> +-----+------+------+-----+
>       |      |       |
>      PF    VF 1    VF 2
>      /       |       |
>  +---+---+  VM4  +---+---+
>  |  sw   |       |macvtap|
>  | switch|       +---+---+
>  +-+-+-+-+           |
>    / | \            VM5
>   /  |  \
> VM1 VM2 VM3
> 
> This has VMs 1-3 hanging of the PF via a linux bridge (traditional hv
> switching), VM4 directly owning VF1 (pci device assignement), and VM5
> indirectly owning VF2 (macvtap passthrough, that started this whole
> thing).
> 
> So, I'm understanding you saying that VM4 or VM4 sending a packet to VM1
> goes in to VEB, out PF, and into linux bridging code, rigth?  At which
> point the PF is in promiscuous mode (btw, same does not work if bridge is
> attached to VF, at least for some VFs, due to lack of promiscuous mode).
> 
> > Packets sent from a guest with a VF to the address of another guest with
> > a VF need to be forwarded similarly, but the driver should be able to
> > infer that from (3).
> 
> Right, and that works currently for the case where both guests are like
> VM4, they directly own the VF via PCI device assignement.  But for VM4
> to talk to VM5, VF3 is not in promiscuous mode and has a different MAC
> address than VM5's vNIC.  If the embedded bridge does not learn, and
> nobody programmed it to fwd frames for VM5 via VF3...
> 
> I believe this is what Roopa's patch will allow.  The question now is
> whether there's a better way to handle this?
> 
> In my mind, we'd model the NIC's embedded bridge as, well, a bridge.
> And set anti-spoofing, port mirroring, port mac/vlan filtering, etc via
> that bridge.

If there was some way to push the bridge forwarding database down to the
underlying HW so that the filters could be programmed into the HW for
non-learning VEBs that would work too.

This hole has existed for a very long time, years now.  It'd be nice to get
it fixed.  If the community direction is to extend the current bridging
interface then that's fine, we'll go that way.

- Greg

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux