[This patch series is an improvement on a smaller series I sent earlier
to fix the user limit handling for pipes. I've made many changes after
feedback from Vegard Nossum, including the addition of a fix for
point (3) below.]

When changing a pipe's capacity with fcntl(F_SETPIPE_SZ), various
limits defined by /proc/sys/fs/pipe-* files are checked to see if
unprivileged users are exceeding limits on memory consumption.

While documenting and testing the operation of these limits I noticed
that, as currently implemented, these checks have a number of problems:

(1) When increasing the pipe capacity, the checks against the limits
    in /proc/sys/fs/pipe-user-pages-{soft,hard} are made against
    existing consumption, and exclude the memory required for the
    increased pipe capacity. The new increase in pipe capacity can then
    push the total memory used by the user for pipes (possibly far)
    over a limit. This can also trigger the problem described next.

(2) The limit checks are performed even when the new pipe capacity
    is less than the existing pipe capacity. This can lead to problems
    if a user sets a large pipe capacity, and then the limits are
    lowered, with the result that the user will no longer be able to
    decrease the pipe capacity.

(3) As currently implemented, accounting and checking against the
    limits is done as follows:

    (a) Test whether the user has exceeded the limit.
    (b) Make new pipe buffer allocation.
    (c) Account new allocation against the limits.

    This is racy. Multiple processes may pass point (a) simultaneously,
    and then allocate pipe buffers that are accounted for only in
    step (c). The race means that the user's pipe buffer allocation
    could be pushed over the limit (by an arbitrary amount, depending
    on how unlucky we were in the race). [Thanks to Vegard Nossum for
    spotting this point, which I had missed.]

This patch series addresses these three problems.

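
To make the ordering problem in point (3) concrete, here is a minimal
user-space sketch of one way to avoid it: charge the new allocation
first, check the resulting total, and roll back if the limit is
exceeded. It is an illustration only, not the code in fs/pipe.c. The
names struct user_acct, account_bufs(), over_limit() and try_resize()
are hypothetical, C11 atomics stand in for the kernel's atomic types,
and the "0 means no limit" and "strictly greater than" conventions are
assumptions of the sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-user pipe accounting state. */
struct user_acct {
    atomic_long pipe_bufs;      /* pages currently charged to the user */
};

/*
 * Charge the difference (new_pages - old_pages) to the user and return
 * the total after the charge. Because the add and the read of the new
 * total are one atomic step, concurrent callers each see a total that
 * includes their own charge.
 */
static long account_bufs(struct user_acct *u, long old_pages, long new_pages)
{
    long delta = new_pages - old_pages;

    assert(old_pages >= 0 && new_pages >= 0);
    return atomic_fetch_add(&u->pipe_bufs, delta) + delta;
}

/* Assumption of this sketch: a limit of 0 means "no limit". */
static bool over_limit(long total, long limit)
{
    return limit != 0 && total > limit;
}

/*
 * Try to change a pipe from old_pages to new_pages.
 * Returns true if the new size is accepted, false if it was refused
 * (in which case the charge has been rolled back).
 */
static bool try_resize(struct user_acct *u, long old_pages, long new_pages,
                       long limit)
{
    long total;

    /* Point (2): shrinking is never checked against the limit. */
    if (new_pages <= old_pages) {
        account_bufs(u, old_pages, new_pages);
        return true;
    }

    /* Points (1) and (3): charge first, so the check covers the increase. */
    total = account_bufs(u, old_pages, new_pages);
    if (over_limit(total, limit)) {
        account_bufs(u, new_pages, old_pages);  /* undo the charge */
        return false;
    }
    return true;
}
```

With a limit of 32 pages, growing from 0 to 16 pages succeeds, growing
from 16 to 64 is refused and leaves the charge at 16, and shrinking
from 16 to 4 succeeds even if the limit has since been lowered to 2.
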
Cc: Willy Tarreau <w@xxxxxx>
Cc: Vegard Nossum <vegard.nossum@xxxxxxxxxx>
Cc: socketpair@xxxxxxxxx
Cc: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Cc: Jens Axboe <axboe@xxxxxx>
Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Cc: linux-api@xxxxxxxxxxxxxxx
Cc: linux-kernel@xxxxxxxxxxxxxxx
Signed-off-by: Michael Kerrisk <mtk.manpages@xxxxxxxxx>

Michael Kerrisk (8):
  pipe: relocate round_pipe_size() above pipe_set_size()
  pipe: move limit checking logic into pipe_set_size()
  pipe: refactor argument for account_pipe_buffers()
  pipe: fix limit checking in pipe_set_size()
  pipe: simplify logic in alloc_pipe_info()
  pipe: fix limit checking in alloc_pipe_info()
  pipe: make account_pipe_buffers() return a value, and use it
  pipe: cap initial pipe capacity according to pipe-max-size limit

 fs/pipe.c | 164 +++++++++++++++++++++++++++++++++++--------------------------
 1 file changed, 94 insertions(+), 70 deletions(-)

--
2.5.5

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/