Re: [PATCH] Portability of dash to legacy systems, such as AT&T Unix PC : 11-simplify-wait-loop

On Tue, Nov 12, 2024 at 11:44:12PM +0100, Alain Knaff wrote:
> 
> Currently, the "wait" shell builtin is implemented by calling wait3
> non-blockingly, and relying on a subsequent sigsuspend to do the
> actual waiting.
> 
> This is needlessly complex, wait3 can do the job just fine on its
> own. Wait_cmd can still be interrupted by a ctrl-c signal. Indeed,
> that's what EINTR is for... just test for !pending_sig, like done
> elsewhere in dash.
> 
> Return of 0 for DOWAIT_WAITCMD is now done using an explicit test
> before return of the function.

This code was added because a blocking wait3 is racy with respect to
signals.  Here is the commit that introduced it:

commit 3800d4934391b144fd261a7957aea72ced7d47ea
Author: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
Date:   Sun Feb 22 18:10:01 2009 +0800

    [JOBS] Fix dowait signal race

    This test program by Alexey Gladkov can cause dash to enter an
    infinite loop in waitcmd.

    #!/bin/dash
    trap "echo TRAP" USR1
    stub() {
    echo ">>> STUB $1" >&2
    sleep $1
    echo "<<< STUB $1" >&2
    kill -USR1 $$
    }
    stub 3 &
    stub 2 &
    until { echo "###"; wait; } do
    echo "*** $?"
    done

    The problem is that if we get a signal after the wait3 system
    call has returned but before we get to INTON in dowait, then
    we can jump back up to the top and lose the exit status.  So
    if we then wait for the job that has just exited, then it'll
    stay there forever.

    I made the original change that caused this bug to fix pretty
    much the same bug but in the opposite direction.  That is, if
    we get a signal after we enter wait3 but before we hit the kernel
    then it too can cause the wait to go on forever (assuming the
    child doesn't exit).

    In fact this is pretty much exactly the scenario that you'll
    find in glibc's documentation on pause().  The solution is given
    there too, in the form of sigsuspend, which is the only way to
    do the check and wait atomically.

    So this patch fixes Alexey's race without reintroducing the old
    bug by converting the blocking wait3 to a sigsuspend.

    In order to do this we need to set a signal handler for SIGCHLD,
    so the code has been modified to always do that.

    Signed-off-by: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
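
For readers who have not seen the pattern, the check-and-wait idiom that
the commit message describes can be sketched roughly as below.  This is
not dash's code: the names got_sigchld, on_sigchld and run_and_wait are
made up for illustration, and the sketch assumes a POSIX system with
sigaction, sigprocmask, sigsuspend and waitpid.  SIGCHLD stays blocked
except while the process sleeps in sigsuspend, so there is no window in
which the signal can arrive between the non-blocking status check and
the sleep.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical flag, set only by the SIGCHLD handler below. */
static volatile sig_atomic_t got_sigchld;

static void on_sigchld(int signo)
{
    (void)signo;
    got_sigchld = 1;
}

/*
 * Hypothetical helper: fork a child that exits with `code`, then wait
 * for it without ever calling a blocking wait.  Returns the child's
 * exit status, or -1 on error.
 */
int run_and_wait(int code)
{
    struct sigaction sa;
    sigset_t block, old, wait_mask;
    pid_t pid, r;
    int status = 0;

    /* A handler must be installed, otherwise SIGCHLD is ignored by
     * default and sigsuspend would never be woken by it. */
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGCHLD, &sa, NULL) < 0)
        return -1;

    /* Block SIGCHLD so it can only be delivered inside sigsuspend. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &block, &old) < 0)
        return -1;

    /* Mask used while sleeping: the caller's mask with SIGCHLD open. */
    wait_mask = old;
    sigdelset(&wait_mask, SIGCHLD);

    got_sigchld = 0;
    pid = fork();
    if (pid < 0) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }
    if (pid == 0)
        _exit(code);

    for (;;) {
        /* Non-blocking check: 0 means the child is still running. */
        r = waitpid(pid, &status, WNOHANG);
        if (r != 0)
            break;

        /* Atomically unblock SIGCHLD and sleep.  A SIGCHLD that became
         * pending after the waitpid above is delivered right here, so
         * the wakeup cannot be lost. */
        while (!got_sigchld)
            sigsuspend(&wait_mask);
        got_sigchld = 0;
    }

    sigprocmask(SIG_SETMASK, &old, NULL);

    if (r == pid && WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;
}
```

Calling run_and_wait(7) should return 7 once the child has been reaped.
If the blocking were dropped and the loop slept in pause() instead, a
SIGCHLD arriving between the waitpid check and the pause() call could
be missed and the sleep could last forever, which is the kind of race
the commit message refers to.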

Cheers,
-- 
Email: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt



