On Mon, Oct 19, 2015 at 12:43:39PM -0400, ira.weiny@xxxxxxxxx wrote:
> From: Mitko Haralanov <mitko.haralanov@xxxxxxxxx>
>
> Expected receives work by user-space libraries (PSM) calling into the driver
> with information about the user's receive buffer and having the driver DMA-map
> that buffer and program the HFI to receive data directly into it.
>
> This is an expensive operation as it requires the driver to pin the pages which
> the user's buffer maps to, DMA-map them, and then program the HFI.
>
> When the receive is complete, user-space libraries have to call into the driver
> again so the buffer is removed from the HFI, un-mapped, and the pages unpinned.
>
> All of these operations are expensive, considering that a lot of applications
> (especially micro-benchmarks) use the same buffer over and over.
>
> In order to get better performance for user-space applications, it is highly
> beneficial that they don't continuously call into the driver to register and
> unregister the same buffer. Rather, they can register the buffer and cache it
> for future work. The buffer can be unregistered when it is freed by the user.
>
> This change implements such buffer caching by making use of the kernel's MMU
> notifier API. User-space libraries call into the driver only when they need to
> register a new buffer.
>
> Once a buffer is registered, it stays programmed into the HFI until the kernel
> notifies the driver that the buffer has been freed by the user. At that time,
> the user-space library is notified and it can do the necessary work to remove
> the buffer from its cache.
>
> Buffers which have been invalidated by the kernel are not automatically removed
> from the HFI and do not have their pages unpinned. Buffers are only completely
> removed when the user-space libraries call into the driver to free them. This
> is done to ensure that any ongoing transfers into that buffer are complete.
> This is important when a buffer is not completely freed but rather it is
> shrunk. The user-space library could still have uncompleted transfers into the
> remaining buffer.
>
> With this feature, it is important that systems are set up with reasonable
> limits for the amount of lockable memory. Keeping the limit at "unlimited" (as
> we've done up to this point) may result in jobs being killed by the kernel's
> OOM due to them taking up excessive amounts of memory.
>
> This commit also includes some code clean-up and rearrangement to make the
> driver code base easier to maintain and develop.

Please do cleanup and rearrangement in a separate patch (ideally before
this one) as this is impossible to review as-is.

thanks,

greg k-h