Re: Couldn't wait for ssh process: No child processes

On Wed, Mar 03, 2010 at 12:28:18PM -0600, Dave Jones wrote:
> I am using swatch to monitor the vsftpd log and the secure log for FTP
> and SFTP inbound transfers then trigger a script to send the file
> outbound.  When I use the sftp command to send a file outbound, I am
> getting "Couldn't wait for ssh process: No child processes" with a
> return code of 255.  Here's the strange thing:

RC 255 == RC -1 (exit statuses are 8 bits wide, so -1 is reported as
255), which I believe very often means that a child program could not
be fork()'d or exec()'d.  My first guess -- and it's just a guess -- is
that your script does not specify the full path to your sftp binary,
and its directory is not in the $PATH that swatch passes to its
children.  Adding the second shell script may somehow reinstate a $PATH
that contains the binary's directory.  To be honest, based on what you
described, this scenario seems a little bit unlikely; but I can't
think of one that's more likely.  That this succeeds when piped
through tee (though, *what* is piped through tee?), seems like a very
odd wild card.  I might be inclined to think that this had something
to do with stdin/stdout being or not being a tty, but it looks to me
like it should not be a tty in all 3 cases...  Hard to say for sure
though.  Depends on what swatch does, and on what the scripts do.
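
One thing worth keeping in mind about the tee case: in a POSIX shell,
$? after a pipeline is the exit status of the last command in the
pipeline, so a command piped through tee normally reports tee's
status rather than its own.  A minimal sketch, with `false` standing in
for a failing sftp (an assumption on my part, since I haven't seen
your actual command line):

```sh
#!/bin/sh
# 'false' is a stand-in for a failing sftp command (hypothetical; the
# real command line was not posted).

false
plain_rc=$?      # status of the command itself: nonzero

false | tee /dev/null
piped_rc=$?      # status of tee, the last command in the pipeline

echo "plain_rc=$plain_rc piped_rc=$piped_rc"
```

If the script is run by bash, ${PIPESTATUS[0]} holds the status of the
first command in the most recent pipeline, which is the one you would
actually want to record.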

> swatch -> script2.sh -> sftp command -> fails with RC 255
> swatch -> script1.sh -> script2.sh -> sftp command -> successful with RC 0
> swatch -> script2.sh -> sftp command piped through tee -> successful with RC 255

This little diagram tells us surprisingly little of importance.  We
don't positively know:

 - what the value of $PATH is at any given point
 - what script2.sh looks like, and most importantly how it's invoked
 - what script1.sh looks like, and most importantly how it's invoked
 - whether or not stdin/stdout is a TTY at any given point
 - where (i.e. what program) the error message you quoted came from
 - what platform(s) this is (though it looks like the server is cygwin)
 - etc.
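
Several of these unknowns can be recorded from inside the scripts
themselves.  Here is a rough sketch of a helper that logs $PATH,
whether sftp can be found, and whether stdin/stdout are terminals.
The function name, the log format and the log file location are all
my own inventions, not anything swatch provides:

```sh
#!/bin/sh
# report_env is a hypothetical helper; name and output format are made up.
# Usage: report_env /path/to/logfile
report_env() {
    log=$1

    # Check the file descriptors BEFORE redirecting anything, otherwise
    # we would be testing the log file rather than the caller's stdout.
    if [ -t 0 ]; then stdin_kind="tty"; else stdin_kind="not a tty"; fi
    if [ -t 1 ]; then stdout_kind="tty"; else stdout_kind="not a tty"; fi

    # command -v prints the path the shell would execute, and fails if
    # sftp cannot be found via the current $PATH.
    sftp_path=$(command -v sftp) || sftp_path="NOT FOUND"

    {
        echo "PATH=$PATH"
        echo "sftp=$sftp_path"
        echo "stdin=$stdin_kind"
        echo "stdout=$stdout_kind"
    } >> "$log"
}
```

Calling it near the top of both script1.sh and script2.sh (for
example `report_env /tmp/swatch-diag.log`, where the path is just a
placeholder) and comparing the output of the failing and working runs
would answer the first few questions directly.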

If your shell is bash, then how it is invoked by each script's
interpreter line actually could matter quite a lot.  If script2.sh
starts with:

  #!/bin/sh

but script1.sh starts with:

  #!/bin/bash

...then this might actually lend support to my theory above.  In such a
case, the script run by /bin/bash could possibly source startup files
of the user the script runs as (for example, the file named by
$BASH_ENV, if that is set in the environment -- unlikely but possible),
which bash invoked as sh would not do.  That could change the
environment of its children quite drastically.
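
To illustrate the $BASH_ENV effect, here is a small sketch.  It
assumes bash is installed and on the $PATH; the temporary file and
the DEMO_VAR variable are made up for the demonstration:

```sh
#!/bin/sh
# Hypothetical demo: DEMO_VAR and the temporary startup file are made up.
envfile=$(mktemp)
echo 'DEMO_VAR=set_by_startup_file' > "$envfile"

# A non-interactive bash reads the file named by BASH_ENV before it
# runs the command.
with_env=$(BASH_ENV="$envfile" bash -c 'echo "${DEMO_VAR:-unset}"')

# With BASH_ENV absent, the same invocation reads no such file.
without_env=$(unset BASH_ENV DEMO_VAR; bash -c 'echo "${DEMO_VAR:-unset}"')

rm -f "$envfile"
echo "with BASH_ENV: $with_env"
echo "without BASH_ENV: $without_env"
```

A startup file like this could just as easily reset $PATH, which is
the case that would matter for finding the sftp binary.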

And of course, if either of these scripts sets any environment
variables or otherwise manipulates the environment the script runs in,
all bets are off.

Another possibility is that the arguments are not quoted properly as
passed from swatch to your script, *AND* they're not quoted properly
in script1.sh, and the two wrongs in this case actually do make a
right; e.g. swatch passes the args all as one string, and the
intermediate script expands the string without quoting such that each
token is passed as a distinct argument, which by chance is actually
what was intended in the first place, so it works "by accident"
essentially.
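
A sketch of how that accident could happen.  Everything here is
hypothetical: count_args stands in for script2.sh, the relay
functions stand in for script1.sh, and the argument string is made
up, since I don't know what swatch actually passes:

```sh
#!/bin/sh
# Stand-in for script2.sh: report how many arguments were received.
count_args() { echo "$#"; }

# Assumption: swatch hands over the whole command line as ONE string.
args="-b batchfile user@host"

# Stand-ins for script1.sh: one expands its argument unquoted, so the
# shell splits it into words; the other quotes it and passes it intact.
relay_unquoted() { count_args $1; }
relay_quoted()   { count_args "$1"; }

direct=$(count_args "$args")            # called directly: 1 argument
via_unquoted=$(relay_unquoted "$args")  # split on whitespace: 3 arguments
via_quoted=$(relay_quoted "$args")      # passed through intact: 1 argument

echo "direct=$direct via_unquoted=$via_unquoted via_quoted=$via_quoted"
```

Only the unquoted relay delivers three separate arguments, which is
the "works by accident" case.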

From the log of the failed transfer:

> debug1: Transferred: stdin 0, stdout 0, stderr 0 bytes in 1.3 seconds
> debug1: Bytes per second: stdin 0.0, stdout 0.0, stderr 0.0
> debug1: Exit status 0
> Couldn't wait for ssh process: No child processes^M
> + RC=255

This seems to suggest that whatever ssh command was run by swatch
actually succeeded (debug1: Exit status 0), but just didn't do what
you expected it to.  It could be caused by a problem with command-line
quoting either in your script itself, or in how swatch calls it; and
somehow interposing the extra script fixes it.  Or, it could be that
the error is not related to SSH at all.  Or, all that could be
complete hogwash.

From your "successful" log:

> debug1: Transferred: stdin 0, stdout 0, stderr 0 bytes in 1.3 seconds
> debug1: Bytes per second: stdin 0.0, stdout 0.0, stderr 0.0
> debug1: Exit status 0
> Couldn't wait for ssh process: No child processes^M
> + RC=0

Note that the same error message appears.  Either this output was from
a session that was not really successful, or maybe the error message
is irrelevant to the problem?  Without doing a side-by-side comparison
of the two logs, it looks to me like the only significant difference
was the RC itself.  That suggests to me that the problem is not with
the SSH session, but with some other command that's in the script, or
with the way swatch is invoking the child.  Or, maybe it's a cygwin
sshd bug...  But, unless one of my guesses was right, I don't think
there's enough information here to figure out what the problem is.  In
particular, were I to spend more time on this problem, I'd want to
know what platform(s) this is running on, and see the actual scripts
involved.

Hope that helps...

-- 
Derek D. Martin
http://www.pizzashack.org/
GPG Key ID: 0x81CFE75D
