> -----Original Message-----
> From: Chris Wright [mailto:chrisw@xxxxxxxxxx]
> Sent: Wednesday, November 30, 2011 3:01 PM
> To: Ben Hutchings
> Cc: Chris Wright; Rose, Gregory V; Roopa Prabhu; netdev@xxxxxxxxxxxxxxx;
> davem@xxxxxxxxxxxxx; sri@xxxxxxxxxx; dragos.tatulea@xxxxxxxxx;
> kvm@xxxxxxxxxxxxxxx; arnd@xxxxxxxx; mst@xxxxxxxxxx; mchan@xxxxxxxxxxxx;
> dwang2@xxxxxxxxx; shemminger@xxxxxxxxxx; eric.dumazet@xxxxxxxxx;
> kaber@xxxxxxxxx; benve@xxxxxxxxx
> Subject: Re: [net-next-2.6 PATCH 0/6 v4] macvlan: MAC Address filtering
> support for passthru mode
>
> * Ben Hutchings (bhutchings@xxxxxxxxxxxxxx) wrote:
> > On Wed, 2011-11-30 at 13:04 -0800, Chris Wright wrote:
> > > I agree that it's confusing.  Couldn't you simplify your ascii art
> > > (hopefully removing hw assumptions about receive processing, and
> > > completely ignoring vlans for the moment) to something like:
> > >
> > >              |RX
> > >              v
> > > +------------+-------------+
> > > |     +------+--------+    |
> > > |     | RX MAC filter |    |
> > > |     |and port select|    |
> > > |     +---------------+    |
> > > |            /|\           |
> > > |           / | \   match 2|
> > > |          /  v  \         |
> > > |         /match  \        |
> > > |        /  1 |    \       |
> > > |       /     |     \      |
> > > |match /      |      \     |
> > > |  0  /       |       \    |
> > > |    v        |        v   |
> > > |    |        |        |   |
> > > +----+--------+--------+---+
> > >      |        |        |
> > >     PF       VF 1     VF 2
> > >
> > > And there's an unclear number of ways to update "RX MAC filter and port
> > > select" table.
> > >
> > > 1) PF ndo_set_mac_addr
> > > I expect that to be implicit to match 0.
> > >
> > > 2) PF ndo_set_rx_mode
> > > Less clear, but I'd still expect these to implicitly match 0
> > >
> > > 3) PF ndo_set_vf_mac
> > > I expect these to be an explicit match to VF N (given the interface
> > > specifies which VF's MAC is being programmed).
> >
> > I'm not sure whether this is supposed to implicitly add to the MAC
> > filter or whether that has to be changed too.  That's the main
> > difference between my models (a) and (b).
>
> I see now.
> I wasn't entirely clear on the difference before.  It's also
> going to be hw specific.  I think (Intel folks can verify) that the
> Intel SR-IOV devices have a single global unicast exact match table,
> for example.
>
> > There's also PF ndo_set_vf_vlan.
>
> Right, although I had mentioned I was trying to limit just to MAC
> filtering to simplify.
>
> > > 4) VF ndo_set_mac_addr
> > > This one may or may not be allowed (setting MAC+port if the VF is owned
> > > by a guest is likely not allowed), but would expect an implicit VF N.
> > >
> > > 5) VF ndo_set_rx_mode
> > > Same as 4) above.
> >
> > So this is where we are today.
>
> Cool, good that we agree there.
>
> > > 6) PF or VF? ndo_set_rx_filter_addr
> > > The new proposal, which has an explicit VF, although when it's VF_SELF
> > > I'm not clear if this is just the same as 5) above?
> > >
> > > Have I missed anything?
> >
> > Any physical port can be bridged to a mixture of guests with and without
> > their own VFs.  Packets sent from a guest with a VF to the address of a
> > guest without a VF need to be forwarded to the PF rather than the
> > physical port, but none of the drivers currently get to know about those
> > addresses.
>
> To clarify, do you mean something like this?
>
>        physical port
>              |
> +------------+------------+
> |         +-----+         |
> |         | VEB |         |
> |         +-----+         |
> |        /   |   \        |
> |       /    |    \       |
> |      /     |     \      |
> +-----+------+------+-----+
>       |      |      |
>      PF     VF 1   VF 2
>     /        |      |
> +---+---+   VM4 +---+---+
> |  sw   |       |macvtap|
> | switch|       +---+---+
> +-+-+-+-+           |
>  /  |  \           VM5
> /   |   \
> VM1 VM2 VM3
>
> This has VMs 1-3 hanging off the PF via a linux bridge (traditional hv
> switching), VM4 directly owning VF1 (pci device assignment), and VM5
> indirectly owning VF2 (macvtap passthrough, that started this whole
> thing).
>
> So, I'm understanding you saying that VM4 or VM5 sending a packet to VM1
> goes in to VEB, out PF, and into linux bridging code, right?
> At which point the PF is in promiscuous mode (btw, same does not work if
> bridge is attached to VF, at least for some VFs, due to lack of
> promiscuous mode).
>
> > Packets sent from a guest with a VF to the address of another guest with
> > a VF need to be forwarded similarly, but the driver should be able to
> > infer that from (3).
>
> Right, and that works currently for the case where both guests are like
> VM4, they directly own the VF via PCI device assignment.  But for VM4
> to talk to VM5, VF2 is not in promiscuous mode and has a different MAC
> address than VM5's vNIC.  If the embedded bridge does not learn, and
> nobody programmed it to fwd frames for VM5 via VF2...
>
> I believe this is what Roopa's patch will allow.  The question now is
> whether there's a better way to handle this?
>
> In my mind, we'd model the NIC's embedded bridge as, well, a bridge.
> And set anti-spoofing, port mirroring, port mac/vlan filtering, etc via
> that bridge.

If there was some way to push the bridge forwarding database down to the
underlying HW so that the filters could be programmed into the HW for
non-learning VEBs, that would work too.

This hole has existed for a very long time, years now.  It'd be nice to get
it fixed.  If the community direction is to extend the current bridging
interface then that's fine, we'll go that way.

- Greg
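
P.S. To make the non-learning VEB case above concrete, here is a rough toy
model in Python.  It is only a sketch under stated assumptions: the Veb
class, its add_filter/forward methods, the port names and the MAC addresses
are all made up for illustration and do not correspond to any driver or
kernel interface.  It also assumes that unicast frames with no matching
entry leave via the physical port, which is how I read the discussion
above.

```python
# Toy model of a non-learning embedded bridge (VEB).  Every name and MAC
# address here is hypothetical; this is not any driver's or kernel's API.

UPLINK = "physical port"


class Veb:
    """Exact-match unicast forwarding table that never learns addresses."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.fdb = {}  # destination MAC -> port name

    def add_filter(self, mac, port):
        """Explicitly program a MAC -> port entry (no learning takes place)."""
        if port not in self.ports:
            raise ValueError(f"unknown port: {port}")
        self.fdb[mac.lower()] = port

    def forward(self, dst_mac):
        """Return the egress port; unmatched unicast goes to the uplink (assumed)."""
        return self.fdb.get(dst_mac.lower(), UPLINK)


veb = Veb(["PF", "VF1", "VF2"])
veb.add_filter("52:54:00:00:00:04", "VF1")  # VM4 owns VF1 directly
veb.add_filter("02:00:00:00:00:02", "VF2")  # VF2's own MAC, not VM5's vNIC MAC

VM5_MAC = "52:54:00:00:00:05"
before = veb.forward(VM5_MAC)   # no entry yet: frame leaves via the uplink
veb.add_filter(VM5_MAC, "VF2")  # the entry a pushed-down filter would add
after = veb.forward(VM5_MAC)    # now delivered to VF2
print(before, "->", after)
```

The point of the sketch is only that, without learning, the VM4 to VM5
path works once somebody programs VM5's address against VF2, whichever
interface ends up doing the programming.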