On Tue, Mar 15, 2016 at 09:03:49AM +0200, Or Gerlitz wrote:
> Doug, As Dennis noted, you already picked ~300 patches for 4.6 / hfi1
> driver from him. I don't see why you would pick this patch series there
> too rather than letting us discuss it further, thoughts?
I forgot this in my last reply, but I do want to mention it again: we have
not submitted 300 patches for hfi1. We have submitted 300+ patches across 3
different drivers. Doug's GitHub branch says hfi1, but it contains changes
for hfi1, rdmavt, and qib.
>> I believe in taking an iterative approach to kernel development. If the
>> code doesn't cause problems, I don't see any harm in adding the series.
>> If Doug agrees with you and doesn't like this approach but is willing to
>> let it stand for now, I think the changes can be done in a subsequent
>> release.
> An iterative approach doesn't mean we should pick code into the kernel
> and throw it away later. There are rare times when that happens: when
> people have not been able to agree on something for years and the
> maintainer says enough is enough, we'll take it and see later.
I never said anything about throwing it all away. My point is, we are
talking about a single driver and exactly one user of it. It's not like we
are adding core kernel functionality here that impacts multiple drivers,
or that we have to worry about changing ABIs across a bunch of different
users. We have the freedom to fix and change things as needed.
Regardless, setting that aside, I wanted to talk to some folks who are more
familiar with the inner workings of psm than I am. So I appreciate the
patience while I gathered feedback from the interested parties.
So basically what happens in this series is that psm calls into the kernel
to do an SDMA transfer, and the caching is a side effect that happens under
the covers. We can see this from the code in the series. What may not be so
apparent is that the user library does not have to do anything to take
advantage of the cache; it doesn't even have to know the cache exists. That
is one of the appeals of this patch series. The other, and what trumps all,
is performance.
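
For anyone who hasn't dug through the patches, the kernel side boils down
to roughly the following. This is only a sketch of the idea, not the
actual hfi1 code: every name here is invented, locking is omitted, and
the real series uses a smarter lookup structure than a linear list.

/*
 * Illustrative sketch only -- not the hfi1 code from the series.
 */
#include <linux/list.h>
#include <linux/mm.h>
#include <linux/slab.h>

struct pin_entry {
	struct list_head node;
	unsigned long addr;	/* user virtual address */
	unsigned long npages;
	struct page **pages;	/* pinned by get_user_pages_fast() */
};

static LIST_HEAD(pin_cache);	/* per-context in a real driver */

/*
 * Called in the SDMA request path.  On a hit the buffer is already
 * pinned and we return right away; on a miss we pin it and remember
 * it for next time.  The user library never sees any of this.
 */
static struct pin_entry *pin_cache_get(unsigned long addr,
				       unsigned long npages)
{
	struct pin_entry *e;
	int got;

	list_for_each_entry(e, &pin_cache, node)
		if (e->addr == addr && e->npages == npages)
			return e;		/* cache hit */

	e = kzalloc(sizeof(*e), GFP_KERNEL);
	if (!e)
		return NULL;
	e->pages = kcalloc(npages, sizeof(*e->pages), GFP_KERNEL);
	if (!e->pages)
		goto bail;

	got = get_user_pages_fast(addr, npages, 1, e->pages);
	if (got != npages) {
		while (got > 0)
			put_page(e->pages[--got]);
		goto bail;
	}

	e->addr = addr;
	e->npages = npages;
	list_add(&e->node, &pin_cache);
	return e;

bail:
	kfree(e->pages);
	kfree(e);
	return NULL;
}

/*
 * mmu_notifier callback (invalidate_range_start in the real driver):
 * the VA range is going away underneath us, so drop any cached
 * pinnings that overlap it.
 */
static void pin_cache_invalidate(unsigned long start, unsigned long end)
{
	struct pin_entry *e, *tmp;
	unsigned long i;

	list_for_each_entry_safe(e, tmp, &pin_cache, node) {
		if (e->addr >= end ||
		    e->addr + (e->npages << PAGE_SHIFT) <= start)
			continue;
		list_del(&e->node);
		for (i = 0; i < e->npages; i++)
			put_page(e->pages[i]);
		kfree(e->pages);
		kfree(e);
	}
}

Note that both the pinning and the invalidation live entirely on the
kernel side, which is why psm needs no changes to benefit.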
From a performance standpoint we do not want this in user space. If the
cache were at the user level, there would end up being a system call to pin
the buffer, another to do the SDMA transfer from said buffer, and a third
to unpin the buffer once the kernel notifier has invalidated it. This gets
even more complex when you consider cache evictions due to reaching the
limit on pinned pages.
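
To make the round trips concrete, the user-level scheme would look
something like this from the library side. None of these ioctls exist;
the struct, the HFI1_IOCTL_* request codes, and send_buffer() are made up
purely to show the extra kernel crossings per buffer.

#include <stdint.h>
#include <linux/ioctl.h>
#include <sys/ioctl.h>

/* hypothetical ABI -- nothing like this exists in the driver */
struct pin_req {
	uint64_t addr;
	uint64_t len;
};
#define HFI1_IOCTL_PIN   _IOW('H', 1, struct pin_req)
#define HFI1_IOCTL_SDMA  _IOW('H', 2, struct pin_req)
#define HFI1_IOCTL_UNPIN _IOW('H', 3, struct pin_req)

/* three kernel crossings per buffer instead of one */
static int send_buffer(int fd, void *buf, uint64_t len)
{
	struct pin_req req = { .addr = (uintptr_t)buf, .len = len };

	if (ioctl(fd, HFI1_IOCTL_PIN, &req) < 0)	/* syscall 1: pin */
		return -1;
	if (ioctl(fd, HFI1_IOCTL_SDMA, &req) < 0)	/* syscall 2: transfer */
		return -1;
	/* ... and later, when the buffer is invalidated or evicted: */
	return ioctl(fd, HFI1_IOCTL_UNPIN, &req);	/* syscall 3: unpin */
}

With the series as posted, the library issues exactly one request per
buffer and the pin/unpin lifetime is handled inside the kernel.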
These extra system calls add overhead that may be acceptable in a normal
server or desktop environment, but not when dealing with a high performance
interconnect where we want the absolute best performance. I should also
point out that these extra calls would be made on a per-buffer basis, not
per-context or per-application.
-Denny