* Ben Hutchings (bhutchings@xxxxxxxxxxxxxx) wrote:
> On Wed, 2011-11-30 at 13:04 -0800, Chris Wright wrote:
> > I agree that it's confusing.  Couldn't you simplify your ascii art
> > (hopefully removing hw assumptions about receive processing, and
> > completely ignoring vlans for the moment) to something like:
> >
> >              |RX
> >              v
> > +------------+-------------+
> > |     +------+--------+    |
> > |     | RX MAC filter |    |
> > |     |and port select|    |
> > |     +---------------+    |
> > |            /|\           |
> > |           / | \   match 2|
> > |          /  v  \         |
> > |         /match  \        |
> > |        /  1 |    \       |
> > |       /     |     \      |
> > |match /      |      \     |
> > |  0  /       |       \    |
> > |    v        |        v   |
> > |    |        |        |   |
> > +----+--------+--------+---+
> >      |        |        |
> >     PF      VF 1     VF 2
> >
> > And there's an unclear number of ways to update "RX MAC filter and port
> > select" table.
> >
> > 1) PF ndo_set_mac_addr
> > I expect that to be implicit to match 0.
> >
> > 2) PF ndo_set_rx_mode
> > Less clear, but I'd still expect these to implicitly match 0
> >
> > 3) PF ndo_set_vf_mac
> > I expect these to be an explicit match to VF N (given the interface
> > specifies which VF's MAC is being programmed).
>
> I'm not sure whether this is supposed to implicitly add to the MAC
> filter or whether that has to be changed too.  That's the main
> difference between my models (a) and (b).

I see now.  I wasn't entirely clear on the difference before.  It's also
going to be hw specific.  I think (Intel folks can verify) that the
Intel SR-IOV devices have a single global unicast exact match table,
for example.

> There's also PF ndo_set_vf_vlan.

Right, although I had mentioned I was trying to limit just to MAC
filtering to simplify.

> > 4) VF ndo_set_mac_addr
> > This one may or may not be allowed (setting MAC+port if the VF is owned
> > by a guest is likely not allowed), but would expect an implicit VF N.
> >
> > 5) VF ndo_set_rx_mode
> > Same as 4) above.
>
> So this is where we are today.

Cool, good that we agree there.

> > 6) PF or VF? ndo_set_rx_filter_addr
> > The new proposal, which has an explicit VF, although when it's VF_SELF
> > I'm not clear if this is just the same as 5) above?
> >
> > Have I missed anything?
>
> Any physical port can be bridged to a mixture of guests with and without
> their own VFs.  Packets sent from a guest with a VF to the address of a
> guest without a VF need to be forwarded to the PF rather than the
> physical port, but none of the drivers currently get to know about those
> addresses.

To clarify, do you mean something like this?

       physical port
             |
+------------+------------+
|         +-----+         |
|         | VEB |         |
|         +-----+         |
|          / | \          |
|        /   |   \        |
|      /     |     \      |
+-----+------+------+-----+
      |      |      |
     PF     VF 1   VF 2
    /        |      |
+---+---+   VM4 +---+---+
|  sw   |       |macvtap|
| switch|       +---+---+
+-+-+-+-+           |
  / | \            VM5
 /  |  \
VM1 VM2 VM3

This has VMs 1-3 hanging off the PF via a linux bridge (traditional hv
switching), VM4 directly owning VF1 (pci device assignment), and VM5
indirectly owning VF2 (macvtap passthrough, that started this whole
thing).

So, I understand you to be saying that VM4 or VM5 sending a packet to
VM1 goes into the VEB, out the PF, and into the linux bridging code,
right?  At which point the PF is in promiscuous mode (btw, the same does
not work if the bridge is attached to a VF, at least for some VFs, due
to lack of promiscuous mode).

> Packets sent from a guest with a VF to the address of another guest with
> a VF need to be forwarded similarly, but the driver should be able to
> infer that from (3).

Right, and that works currently for the case where both guests are like
VM4, they directly own the VF via PCI device assignment.  But for VM4
to talk to VM5, VF2 is not in promiscuous mode and has a different MAC
address than VM5's vNIC.  If the embedded bridge does not learn, and
nobody programmed it to fwd frames for VM5 via VF2...  I believe this is
what Roopa's patch will allow.  The question now is whether there's a
better way to handle this?

In my mind, we'd model the NIC's embedded bridge as, well, a bridge.
And set anti-spoofing, port mirroring, port mac/vlan filtering, etc via
that bridge.

thanks,
-chris
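
For illustration, the "embedded bridge as a bridge" idea can be sketched as a toy model in Python: a non-learning bridge with a static unicast table. This is not driver code. The class `EmbeddedBridge`, its methods, the port names, and the MAC addresses are all hypothetical, and the unknown-unicast rule (deliver to the physical port plus any promiscuous port) is an assumption that real hardware may not follow.

```python
from dataclasses import dataclass, field

UPLINK = "uplink"  # hypothetical name for the physical port


@dataclass
class EmbeddedBridge:
    """Toy model of a NIC's embedded bridge (VEB) that does not learn."""
    ports: set = field(default_factory=lambda: {UPLINK, "PF"})
    fdb: dict = field(default_factory=dict)      # unicast MAC -> port
    promisc: set = field(default_factory=set)    # ports in promiscuous mode

    def add_port(self, port):
        self.ports.add(port)

    def fdb_add(self, mac, port):
        """Explicitly program 'frames for mac go to port'."""
        if port not in self.ports:
            raise ValueError(f"unknown port: {port}")
        self.fdb[mac.lower()] = port

    def set_promisc(self, port, on=True):
        if port not in self.ports:
            raise ValueError(f"unknown port: {port}")
        if on:
            self.promisc.add(port)
        else:
            self.promisc.discard(port)

    def forward(self, src_port, dst_mac):
        """Return the set of ports that receive a unicast frame."""
        port = self.fdb.get(dst_mac.lower())
        if port is not None:
            # Exact match: deliver to that port only, never back to the source.
            return set() if port == src_port else {port}
        # Assumption: unknown unicast goes to the uplink and promiscuous ports.
        out = {UPLINK} | self.promisc
        out.discard(src_port)
        return out


# Hypothetical addresses for the guests in the diagram.
VM1_MAC = "52:54:00:00:00:01"   # behind the linux bridge on the PF
VM4_MAC = "52:54:00:00:00:04"   # owns VF 1 directly
VM5_MAC = "52:54:00:00:00:05"   # vNIC behind macvtap on VF 2
VF2_MAC = "52:54:00:00:00:f2"   # VF 2's own address, differs from VM5's

br = EmbeddedBridge()
br.add_port("VF1")
br.add_port("VF2")
br.fdb_add(VM4_MAC, "VF1")      # comparable to PF ndo_set_vf_mac
br.fdb_add(VF2_MAC, "VF2")
br.set_promisc("PF")            # PF is enslaved to the linux bridge

to_vm1 = br.forward("VF1", VM1_MAC)       # reaches the PF via promiscuous mode
before = br.forward("VF1", VM5_MAC)       # VF 2 is not among the receivers
br.fdb_add(VM5_MAC, "VF2")                # explicit entry for VM5 on VF 2
after = br.forward("VF1", VM5_MAC)        # now delivered to VF 2 only
```

In this model VM4 cannot reach VM5 until VM5's address is explicitly programmed on VF 2, which is the gap described above. Filtering policy such as anti-spoofing would hang off the same object rather than off individual PF or VF operations.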