On 4/26/2016 10:58 AM, Jason Gunthorpe wrote: > On Tue, Apr 26, 2016 at 10:19:37AM -0400, Doug Ledford wrote: >> For certain operations that have lots of optional items (work requests >> for one, work completions for another) > > FWIW, I think we had a general consensus to take a different approach. > > Basically, the 'common' uAPI does not care about micro-performance. > > Drivers have to implement hardware-specific driver calls to micro-optimize > their own high speed paths, and that would be done specifically with a > single hardware in mind. > > This is already done by the majority of drivers for wc/wr processing > (IIRC, only qib calls to the kernel for this) > > If we do provide a common wr/wc API then it can just be designed > inefficiently around the netlink attribute architecture, uncaring > about performance because nothing should use it. I'd prefer not to > implement it at all... We're talking about two different things. I had the actual user space API on my mind when I wrote what I wrote (aka, libibverbs). If we are going to talk about the verbs 2.0 kernel interface, then it makes sense to me to keep the user space API firmly in mind too. Although it would be great if the user space verbs never changed a bit, that isn't entirely possible. The timestamp changes that are still waiting are an example. Currently, I'm not real happy with how the extension mechanism in libibverbs has played out. The intent was good, the reality is clunky IMO. > This same basic idea flows over to other parts, eg if a driver has > special support for a specific work load (say fast creation of IB UD > AHs) then it can have a high speed driver-specific call to do that > work completely micro-optimized using data formed *exactly* the way > the hardware needs. > >> base struct plus the length of all optional structs, and the order of >> the optional structs matches their bit order from lowest to highest in >> the magic element. It's not quite as free form as the patches for >> timestamp support were, but still allows the structs some flexibility in >> what is included and what isn't. > > Mellanox has a patch series that tries to do exactly this for the wc > in libibverbs - it is quite ugly, and the the benchmarks showed worse > performance compared to the current technique. Right, but it completely rewrote the struct from scratch for each WC. That's different than what I posted, which was more along the lines of functional groupings. It reduces the number of conditionals while still reducing the overall struct size for anything other than "we have every option turned on" case. There are only a few options for how to expose these things to user space (using the example I gave as a further talking point): 1) Grow struct ibv_wc for every new option. This totally prevents any ability to either remove items or reorder items in this struct. It also bloats the size of the user visible struct for the common case. 2) Do away with direct variable access and go to indirect variable access via accessor functions. This will work, and has the advantage that accessor functions can work directly with the hardware specific structs, thereby eliminating a copy from the hardware struct to the user visible API struct. It has the disadvantage that every item you wish to access will require an indirect function call from a table. 3) Do like the Mellanox patches did and completely rewrite the completion struct for each completion. Changing ordering and everything else based on what's there. As you pointed out, this had performance issues. 4) What I wrote, which was intended to be a compromise between #1 and #3 to hopefully help with performance issues. > For the reasons above I would prefer to stick entirely with the > netlink attribute format or very similar as the main mechanism. I don't intend to expose anything netlink to libibverbs users ;-) Like I said, we're talking about different things.
Attachment:
signature.asc
Description: OpenPGP digital signature