Re: [PATCH RFC 0/3] Support standard SRIOV configuration for IB VFs

On Thu, 2015-05-21 at 22:55 +0300, Or Gerlitz wrote:
> On Thu, May 21, 2015 at 7:40 PM, Doug Ledford <dledford@xxxxxxxxxx> wrote:
> 
> > The MAC/GUID mapping isn't the only thing that has to be faked here
> 
> Exactly nothing is faked here. Virtualization systems such as
> Open-Stack provision unique 48-bit MAC values to VMs, and it's
> perfectly legitimate and viable to derive a 64-bit GUID value from that
> MAC.

OK, "faked" wasn't the best choice of words.  How about "converted
behind the software's back"?  And if the management software set the
MAC, then tried to check it via ARP after the guest is up and running,
it would never find the guest.  I don't know if Open-Stack or any other
controller would both A) set the MAC of the device in libvirt and start
the guest and B) enter the MAC into a dhcp.conf file for static IP
assignment, but they could, and this sort of manipulation would directly
break that.

> > Why are we suggesting to make this work with unmodified software?  Why
> > aren't we doing this right and adding a new ndo entry point for the GUID?
> 
> Because Rome wasn't built in a day, and neither will the support for IB
> in today's/tomorrow's virtualization systems be, e.g. if you follow
> this layering
> 
> [1] Open-Stack / ODL controller
> [2] Open-Stack neutron / ODL agent
> [3] libvirt
> [4] user/kernel netlink API
> [5] kernel ndo API
> [6] ipoib
> [7] kernel verbs API
> [8] PF IB driver
> 
> with the approach presented here, we simply need (yeah, simplicity
> can turn out to be a critical criterion in engineering) a few
> kernel-only patches that deal with layers 6-8, and we are ready for all
> sorts of bring-ups, testing and even production!

You're ready to pretend that your IB device is a regular ethernet
device.  Not even a RoCE or iWARP device.  You have totally obscured
that the device is RDMA capable and made the entire stack above layer 6
unaware of what you are doing.  If your guest actually intends to use
any RDMA capabilities, then this is, at best, a quick and dirty
workaround to get you up and running while you work through the process
of doing things right, which includes making libvirt aware of the
difference between RDMA capable devices and those that are not, how to
select those devices, and how to mark certain guests as needing an RDMA
device.

> For reasons whose practical / real-life use case I don't really see
> (but I will be happy to hear about), one can go & change the world,
> namely patch layers 5 ---> 1 too and deal with all sorts of
> dependencies for setting up a system. But guess what, this can
> perfectly well be done in parallel with this small change.
> 
> > you would also have to fake the vlan/pkey mapping.  This just
> > seems the wrong thing to do.
> 
> Repeating the above argument --- virt systems provision a 12-bit
> vlan-id to be set for VM traffic, which can be nicely mapped to a
> 16-bit IB pkey doing the same job.
> 
> I understand that you have a sort of desire to see IB a la the full
> spec going into libvirt and from there up to the whole virtualization
> management space, but this needn't be an argument against enabling
> things that go in the right direction. The upstream kernel has
> supported SRIOV for IB over mlx4 for 3 years now, but this can't work
> with libvirt as is. These patches can make it work.

It's a workaround.  It comes with limitations.  If we get around to
adding an ndo later for really setting the guid, then it would be
possible to call the set_guid ndo with a complete guid that didn't use
fffe in the middle 2 bytes.  Then when we call get vf_info, we get a
MAC back that drops those 2 bytes, which creates an inconsistency
between what we *think* our constructed guid should be and what the set
guid actually is.
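
To make that inconsistency concrete, here is a rough sketch in Python.
The helper names are made up for illustration, and I'm assuming the
conversion simply inserts ff:fe between the upper and lower 3 bytes of
the MAC with no universal/local bit flip.  The actual patches may differ
in detail.

```python
def mac_to_guid(mac: bytes) -> bytes:
    """Build a 64-bit guid from a 48-bit MAC by inserting ff:fe in the middle.

    Assumption: plain insertion, no universal/local bit manipulation.
    """
    if len(mac) != 6:
        raise ValueError("MAC must be 6 bytes")
    return mac[:3] + b"\xff\xfe" + mac[3:]


def guid_to_mac(guid: bytes) -> bytes:
    """Report a 48-bit MAC for a guid by dropping the middle 2 bytes."""
    if len(guid) != 8:
        raise ValueError("guid must be 8 bytes")
    return guid[:3] + guid[5:]


def guid_survives_mac_view(guid: bytes) -> bool:
    """True if rebuilding the guid from the reported MAC gives the same guid."""
    return mac_to_guid(guid_to_mac(guid)) == guid


# A guid that was derived from a MAC round-trips cleanly.
derived = mac_to_guid(bytes.fromhex("0002c9112233"))

# A guid set directly (hypothetical value) without ff:fe in the middle
# does not: the MAC view silently loses bytes 3 and 4.
direct = bytes.fromhex("0002c90300a1b2c3")

print(derived.hex(), guid_survives_mac_view(derived))
print(direct.hex(), guid_survives_mac_view(direct))
```

Any guid whose middle 2 bytes are not fffe reads back as a MAC that
reconstructs to a different guid, which is exactly the mismatch I'm
worried about.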

> A couple of months ago, we both attended a call with the libvirt
> developers / maintainers from Red Hat and they really liked this
> staged approach.

My recollection of that call is that they said "Oh, you guys don't have
an API for us to set the GUIDs yet.  Ok, we'll close all the bugs and
wait until you do."  And they promptly closed the bugs and moved on.
But that didn't specify the API to use.  That's what we are doing here,
and I'm not finding this an entirely convincing solution.

-- 
Doug Ledford <dledford@xxxxxxxxxx>
              GPG KeyID: 0E572FDD


