On Tue, Feb 09, 2016 at 10:16:23PM +0000, Al Viro wrote:

> * sort out the cancel semantics

OK, it's definitely bogus right now.  Look:

* read() grabs a slot and issues read request.
* Daemon picks it and starts processing, copying the data into the
  corresponding part of shared memory.
* read() gets interrupted and tries to issue a cancel, which fails to
  even be seen by the daemon, since wait_for_cancel_downcall() sees the
  pending signal and rips the cancel request out of the list.
* read() marks the slot free and buggers off.
* write() is called by another process, picks the same slot and starts
  copying the userland data into the same part of shared memory.  Where
  it's overwritten by daemon still processing the read request - it
  hadn't seen any indications of things going wrong.

This is obviously wrong - killed read() should *NOT* end up with the data
it would've returned being silently mixed into the data being written by
write() in unrelated process on unrelated file.

And the version in orangefs-2.9.3.tar.gz (your Frankenstein module?) is
vulnerable to the same race.  2.8.1 isn't - it ignores signals on the
cancel, but that means waiting for cancel to be processed (or timed out)
on any interrupted read() before we return to userland.  We can return to
that behaviour, of course, but I suspect that offloading it to something
async (along with freeing the slot used by original operation) would be
better from QoI point of view.
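To make the "offload it to something async" idea concrete, here is a minimal
userland sketch of the slot lifecycle it implies.  It is not code from the
orangefs module: every name in it (struct slot, slot_get, slot_abandon,
slot_reap, daemon_fill, replay_interrupted_read) is hypothetical, and the
daemon and the async reaper are reduced to plain function calls.  The only
point is that an interrupted operation leaves the slot in a third state, so
nobody else can pick it until the cancel has been processed or timed out.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SLOT_SIZE 16

/* Hypothetical slot states; illustrative only, not the module's own. */
enum slot_state {
    SLOT_FREE,      /* nobody owns it */
    SLOT_IN_USE,    /* a syscall owns it; daemon may be writing into it */
    SLOT_ORPHANED   /* syscall gave up; daemon may STILL be writing into it */
};

struct slot {
    enum slot_state state;
    char buf[SLOT_SIZE];    /* stands in for the slot's part of shared memory */
};

/* Grab the slot for a new operation; fails unless it is really free. */
bool slot_get(struct slot *s)
{
    if (s->state != SLOT_FREE)
        return false;
    s->state = SLOT_IN_USE;
    return true;
}

/* Normal completion: the daemon is already done with the buffer. */
void slot_put(struct slot *s)
{
    assert(s->state == SLOT_IN_USE);
    s->state = SLOT_FREE;
}

/* Interrupted syscall: return to userland at once, but do NOT free the slot. */
void slot_abandon(struct slot *s)
{
    assert(s->state == SLOT_IN_USE);
    s->state = SLOT_ORPHANED;
}

/* Daemon side: it never saw the cancel, so it copies its data in regardless. */
void daemon_fill(struct slot *s, const char *data)
{
    strncpy(s->buf, data, SLOT_SIZE - 1);
    s->buf[SLOT_SIZE - 1] = '\0';
}

/* Run (asynchronously, in the real thing) once the cancel has been
 * processed or has timed out; only then does the slot become free. */
void slot_reap(struct slot *s)
{
    if (s->state == SLOT_ORPHANED)
        s->state = SLOT_FREE;
}

/* Replays the sequence from the message.  Returns true if write() was kept
 * off the slot until the daemon was done with it. */
bool replay_interrupted_read(void)
{
    struct slot s = { .state = SLOT_FREE };

    if (!slot_get(&s))              /* read() grabs the slot */
        return false;
    slot_abandon(&s);               /* read() gets interrupted */
    if (slot_get(&s))               /* write() tries the same slot */
        return false;               /* early reuse: the race described above */
    daemon_fill(&s, "read data");   /* daemon finishes the read, unaware */
    slot_reap(&s);                  /* cancel processed or timed out */
    return slot_get(&s);            /* only now may write() have the slot */
}
```

Replacing slot_abandon() with slot_put() in the replay gives the current
behaviour: the second slot_get() succeeds while the daemon is still about to
scribble over the buffer.  The 2.8.1 behaviour corresponds to calling
slot_reap() synchronously before returning to userland.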