Re: [PATCH 1/2] pipe: check limits only when increasing pipe capacity

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Andrew,

thanks for picking up this patch series in -mm. Please drop it.
After discussions with Vegard, I have something better now.

Cheers,

Michael

On 08/16/2016 11:10 PM, Michael Kerrisk (man-pages) wrote:
> When changing a pipe's capacity with fcntl(F_SETPIPE_SZ), various
> limits defined by /proc/sys/fs/pipe-* files are checked to see
> if unprivileged users are exceeding limits on memory consumption.
> 
> While documenting and testing the operation of these limits I noticed
> that, as currently implemented, these checks can lead to cases where
> a user can increase a pipe's capacity and is then unable to decrease
> the capacity. The origin of the problem is two-fold:
> 
> (1) When increasing the pipe capacity, the checks against the limits
>     in /proc/sys/fs/pipe-user-pages-{soft,hard} are made against
>     existing consumption, and exclude the memory required for the
>     increased pipe capacity. The new increase in pipe capacity
>     can then push the total memory used by the user for pipes
>     (possibly far) over a limit.
> 
> (2) The limit checks are performed even when the new pipe capacity
>     is less than the existing pipe capacity. This can lead to
>     problems if a user sets a large pipe capacity, and then the
>     limits are lowered, with the result that the user will no
>     longer be able to decrease the pipe capacity.
> 
> The simple solution given by this patch is to perform the checks
> only when the pipe capacity is being increased. The patch does not
> address the broken check in (1), which allows a user to (one-time)
> set a pipe capacity that pushes the user's consumption over the user
> pipe limits. A change to fix that check is proposed in a subsequent
> patch. I've separated the two fixes because the second fix is a
> little more complex, and could possibly (though unlikely) break
> existing user-space. The current patch implements the simple fix
> that carries little risk and seems obviously correct: allowing an
> unprivileged user always to decrease a pipe's capacity.
> 
> The program below can be used to demonstrate the problem, and the
> effect of the fix. The program takes one or more command-line
> arguments. The first argument specifies the number of pipes
> that the program should create. The remaining arguments are,
> alternately, pipe capacities that should be set using
> fcntl(F_SETPIPE_SZ), and sleep intervals (in seconds) between
> the fcntl() operations. (The sleep intervals allow the possibility
> to change the limits between fcntl() operations.)
> 
> Running this program on an unpatched kernel, we first set some limits:
> 
>     # getconf PAGESIZE
>     4096
>     # echo 0 > /proc/sys/fs/pipe-user-pages-soft
>     # echo 1000000000 > /proc/sys/fs/pipe-max-size
>     # echo 10000 > /proc/sys/fs/pipe-user-pages-hard    # 40.96 MB
> 
> Now perform two fcntl(F_SETPIPE_SZ) operations on a single pipe,
> first setting a pipe capacity (10MB), sleeping for a few seconds,
> during which time the hard limit is lowered, and then set pipe
> capacity to a smaller amount (5MB):
> 
>     # sudo -u mtk ./test_F_SETPIPE_SZ 1 10000000 15 5000000 &
>     [1] 748
>     #     Loop 1: set pipe capacity to 10000000 bytes
>             F_SETPIPE_SZ returned 16777216
>             Sleeping 15 seconds
> 
>     # echo 1000 > /proc/sys/fs/pipe-user-pages-hard      # 4.096 MB
> 
>     #     Loop 2: set pipe capacity to 5000000 bytes
>             Loop 2, pipe 0: F_SETPIPE_SZ failed: fcntl: Operation not permitted
> 
> In this case, the user should be able to lower the limit.
> 
> With a kernel that has the patch below, the second fcntl() 
> succeeds:
> 
>     # echo 0 > /proc/sys/fs/pipe-user-pages-soft
>     # echo 1000000000 > /proc/sys/fs/pipe-max-size
>     # echo 10000 > /proc/sys/fs/pipe-user-pages-hard
>     # sudo -u mtk ./test_F_SETPIPE_SZ 1 10000000 15 5000000 &
>     [1] 3215
>     #     Loop 1: set pipe capacity to 10000000 bytes
>             F_SETPIPE_SZ returned 16777216
>             Sleeping 15 seconds
> 
>     # echo 1000 > /proc/sys/fs/pipe-user-pages-hard
> 
>     #     Loop 2: set pipe capacity to 5000000 bytes
>             F_SETPIPE_SZ returned 8388608
> 
> 8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---
> 
> /* test_F_SETPIPE_SZ.c
> 
>    (C) 2016, Michael Kerrisk; licensed under GNU GPL version 2 or later
> 
>    Test operation of fcntl(F_SETPIPE_SZ) for setting pipe capacity
>    and interactions with limits defined by /proc/sys/fs/pipe-* files.
> */
> 
> int
> main(int argc, char *argv[])
> {
>     int (*pfd)[2];
>     int npipes;
>     int pcap, rcap;
>     int j, p, s, stime, loop;
> 
>     if (argc < 2) {
>         fprintf(stderr, "Usage: %s num-pipes "
>                 "[pipe-capacity sleep-time]...\n", argv[0]);
>         exit(EXIT_FAILURE);
>     }
> 
>     npipes = atoi(argv[1]);
> 
>     pfd = calloc(npipes, sizeof (int [2]));
>     if (pfd == NULL) {
>         perror("calloc");
>         exit(EXIT_FAILURE);
>     }
> 
>     for (j = 0; j < npipes; j++) {
>         if (pipe(pfd[j]) == -1) {
>             fprintf(stderr, "Loop %d: pipe() failed: ", j);
>             perror("pipe");
>             exit(EXIT_FAILURE);
>         }
>     }
> 
>     for (j = 2; j < argc; j += 2 ) {
>         loop = j / 2;
>         pcap = atoi(argv[j]);
>         printf("    Loop %d: set pipe capacity to %d bytes\n", loop, pcap);
> 
>         for (p = 0; p < npipes; p++) {
>             s = fcntl(pfd[p][0], F_SETPIPE_SZ, pcap);
>             if (s == -1) {
>                 fprintf(stderr, "        Loop %d, pipe %d: F_SETPIPE_SZ "
>                         "failed: ", loop, p);
>                 perror("fcntl");
>                 exit(EXIT_FAILURE);
>             }
> 
>             if (p == 0) {
>                 printf("        F_SETPIPE_SZ returned %d\n", s);
>                 rcap = s;
>             } else {
>                 if (s != rcap) {
>                     fprintf(stderr, "        Loop %d, pipe %d: F_SETPIPE_SZ "
>                             "unexpected return: %d\n", loop, p, s);
>                     exit(EXIT_FAILURE);
>                 }
>             }
> 
>             stime = (j + 1 < argc) ? atoi(argv[j + 1]) : 0;
>             if (stime > 0) {
>                 printf("        Sleeping %d seconds\n", stime);
>                 sleep(stime);
>             }
>         }
>     }
> 
>     exit(EXIT_SUCCESS);
> }
> 
> 8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---
> 
> Cc: Willy Tarreau <w@xxxxxx>
> Cc: Vegard Nossum <vegard.nossum@xxxxxxxxxx>
> Cc: socketpair@xxxxxxxxx
> Cc: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> Cc: Jens Axboe <axboe@xxxxxx>
> Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> Cc: linux-api@xxxxxxxxxxxxxxx
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> Signed-off-by: Michael Kerrisk <mtk.manpages@xxxxxxxxx>
> ---
>  fs/pipe.c | 25 +++++++++++++++++--------
>  1 file changed, 17 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/pipe.c b/fs/pipe.c
> index 4ebe6b2..a98ebca 100644
> --- a/fs/pipe.c
> +++ b/fs/pipe.c
> @@ -1122,14 +1122,23 @@ long pipe_fcntl(struct file *file, unsigned int cmd, unsigned long arg)
>  		if (!nr_pages)
>  			goto out;
>  
> -		if (!capable(CAP_SYS_RESOURCE) && size > pipe_max_size) {
> -			ret = -EPERM;
> -			goto out;
> -		} else if ((too_many_pipe_buffers_hard(pipe->user) ||
> -			    too_many_pipe_buffers_soft(pipe->user)) &&
> -		           !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN)) {
> -			ret = -EPERM;
> -			goto out;
> +		/*
> +		 * If trying to increase the pipe capacity, check that an
> +		 * unprivileged user is not trying to exceed various limits.
> +		 * (Decreasing the pipe capacity is always permitted, even
> +		 * if the user is currently over a limit.)
> +		 */
> +		if (nr_pages > pipe->buffers) {
> +			if (!capable(CAP_SYS_RESOURCE) && size > pipe_max_size) {
> +				ret = -EPERM;
> +				goto out;
> +			} else if ((too_many_pipe_buffers_hard(pipe->user) ||
> +				too_many_pipe_buffers_soft(pipe->user)) &&
> +				!capable(CAP_SYS_RESOURCE) &&
> +				!capable(CAP_SYS_ADMIN)) {
> +				ret = -EPERM;
> +				goto out;
> +			}
>  		}
>  		ret = pipe_set_size(pipe, nr_pages);
>  		break;
> 


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]