Re: [PATCH V1 libibverbs 1/8] Add ibv_poll_cq_ex verb

On 01/03/2016 19:24, Jason Gunthorpe wrote:
> On Tue, Mar 01, 2016 at 10:52:33AM +0200, Matan Barak (External) wrote:
>>> By my count it scores about a 2 on the Rusty API scale, which is
>>> pretty bad.
>>
>> This is like writing something that means nothing....
>
> If you don't agree with the points on the Rusty scale as what
> constitutes good API design then we really have no common ground on
> this point at all.


We can discuss actual points. That's how things can really progress.

>>> This series trades cache line efficiency for computation efficiency,
>>> and it isn't at all clear to me that is going to be a win. Better
>>> include some good benchmarks.
>>
>> WRONG. Which computational overhead are you talking about?
>
> I don't see the driver side posted, but obviously there must be
> indexing maths and possibly branching, depending on the design, that
> is overhead.


The libmlx4 patches were posted on the mailing list a while ago [1].
One of the patches there optimizes ibv_poll_cq_ex for the common cases with almost zero overhead. Anyway, as I wrote before, this could be delegated to future hardware, so I'm not sure it's relevant here.

>> The user-space driver writes one completion at a time, which (depending on
>> the user configuration) is probably less than a cache line. Why do you think
>> it's worse than how ibv_poll_cq behaves? The way the user-space driver
>> optimizes/dirties the cache line is unrelated to the API.
>
> I didn't say it was worse, I questioned that this should be the
> overarching goal when no data to support this has been posted and it
> doesn't even make intuitive sense that worrying about the size of
> *dirty cache lines* is even very important (well, within limits).


We ran ib_send_bw (from the perftest package) while developing this, once with the regular ibv_poll_cq and once with ibv_poll_cq_ex, and got practically identical results (the difference was tiny and not conclusive), but I don't have the actual numbers right now. This was one of the common cases the user-space driver optimizes. We could re-run the tests and post the results if you want.

> Christoph may have some data, but I do wonder if his results are
> polluted by the current ibv_poll_cq driver implementations, which do
> trigger RMW overheads.
>
>> A function for every permutation of completion fields is uglier for sure
>> (there are currently 9 optional completion fields in this proposal, so
>> that's 2^9 = 512 possible combinations).
>
> Why would you do every permutation?


Because a user-space application might want, say, the completion timestamp and size but not the rest of the fields. Applications can pretty much decide which data matters to them and get only that.
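
Roughly, the application states up front which optional fields it cares about via a bitmask, and each returned completion carries only those fields. A minimal sketch of the idea (the names and layout below are made up for illustration, not the exact structs from this series):

/* Illustration only: optional field selection via a bitmask. */
#include <stdint.h>

enum {
        WC_EX_WITH_TIMESTAMP = 1 << 0,
        WC_EX_WITH_BYTE_LEN  = 1 << 1,
        /* ... the other optional fields ... */
};

struct wc_ex {
        uint64_t wc_flags;      /* which optional fields are present */
        uint64_t wr_id;         /* mandatory part */
        uint32_t status;
        /* optional fields follow in a fixed order, present only when
         * the matching bit in wc_flags is set */
        uint8_t  optional_data[];
};

The point is that an application asking only for timestamps never pays for, or dirties, the space of fields it did not request.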

> Try a non-memcpy API design. Provide an opaque 'cursor' and functions
> to return certain data. This has more unconditional branches, but may
> avoid branching within the driver and certainly avoid memcpy's and
> pretty much all cache line dirtying. Sean had some results that
> were positive on this sort of direction.
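
If I understand that direction correctly, it would look roughly like the sketch below: the driver hands back an opaque handle positioned on the current CQE and the application calls per-field accessors, so only the fields it actually reads are ever touched. (These names are made up for the sake of discussion, not proposed API.)

/* Sketch of the "opaque cursor + accessors" style; the driver would
 * supply the definitions, the application only sees the prototypes. */
#include <stdint.h>

struct cq_cursor;                               /* opaque, owned by the driver */

int      cq_cursor_next(struct cq_cursor *cur);         /* advance, 0 = CQ empty */
uint64_t cq_cursor_wr_id(struct cq_cursor *cur);
uint32_t cq_cursor_byte_len(struct cq_cursor *cur);
uint64_t cq_cursor_timestamp(struct cq_cursor *cur);

static void drain_cq(struct cq_cursor *cur)
{
        while (cq_cursor_next(cur)) {
                uint64_t wr_id = cq_cursor_wr_id(cur);
                uint64_t ts    = cq_cursor_timestamp(cur);

                /* consume the completion; fields never read (byte_len
                 * here) are never copied out of the vendor CQE */
                (void)wr_id;
                (void)ts;
        }
}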


Does the opaque pointer guarantee aligned access? Who allocates the space for the vendor's CQE? Do you have a concrete example?
One of the problems here is that CQE parsing can be of the form
"if the QP type is y, then take field z from offset o". Doing it that way may mean evaluating the same "if" multiple times, once per accessor. The current approach can still avoid a memcpy and write straight into the user's buffer.

> Justify why this horrible API is *necessary* with something concrete, not
> just random hand waving and mumbling about cache lines. That means
> benchmarking several options.
>
> Jason


Matan
[1] https://www.spinics.net/lists/linux-rdma/msg29837.html