Re: [PATCH] xfs_io: fix exitcode handling (was Re: generic/399 and xfs_io pwrite command)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Dec 07, 2017 at 09:05:20AM -0500, Brian Foster wrote:
> On Wed, Dec 06, 2017 at 11:26:38AM +1100, Dave Chinner wrote:
> > On Wed, Dec 06, 2017 at 08:18:50AM +1100, Dave Chinner wrote:
> > > On Tue, Dec 05, 2017 at 04:23:43PM +0200, Ari Sundholm wrote:
> > > > On 12/05/2017 12:28 AM, Dave Chinner wrote:
> > > > >On Thu, Nov 30, 2017 at 04:21:47PM +0200, Ari Sundholm wrote:
> ...
> > 
> > xfs_io: set exitcode on failure appropriately
> > 
> > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > 
> > Many operations don't set the exitcode when they fail, resulting
> > in xfs_io exiting with a zero (no failure) exit code despite the
> > command failing and returning an error. Fix this.
> > 
> > Signed-Off-By: Dave Chinner <dchinner@xxxxxxxxxx>
> > ---
> ...
> > diff --git a/io/copy_file_range.c b/io/copy_file_range.c
> > index d1dfc5a5e33a..9b737eff4c02 100644
> > --- a/io/copy_file_range.c
> > +++ b/io/copy_file_range.c
> > @@ -92,6 +92,7 @@ copy_range_f(int argc, char **argv)
> >  	int ret;
> >  	int fd;
> >  
> > +	exitcode = 1;
> >  	while ((opt = getopt(argc, argv, "s:d:l:")) != -1) {
> >  		switch (opt) {
> >  		case 's':
> > @@ -132,6 +133,8 @@ copy_range_f(int argc, char **argv)
> >  
> >  	ret = copy_file_range(fd, &src, &dst, len);
> >  	close(fd);
> > +	if (ret >= 0)
> > +		exitcode = 0;
> 
> I don't think it's appropriate to blindly overwrite the global exitcode
> value like this. What about the case where multiple commands are chained
> together? Should we expect an error code if any of the commands failed
> or only the last?
> 
> For example:
> 
>   xfs_io -c "pwrite ..." <file>
> 
> ... returns an error if the write fails, while:
> 
>   xfs_io -c "pwrite ..." -c "fadvise ..." <file>
> 
> ... would never so long as the fadvise succeeds.

Yup, I mentioned that this would be a problem on IRC. The patch fixes
the interactive and the single command cases, but won't work for
chained commands as you rightly point out.

To fix it properly, it's going to take a lot more than 15 minutes of
time. We're going to have to change how errors are returned to and
propagated through the libxcmd infrastruture, how we handle "fatal
error, don't continue" conditions through the libxcmd command loop,
etc. If we want to stop at the first error, then we've got to go
change all the return values from xfs_io commands to return non-zero
on error. We still have to set the exitcode as per this patch,
because the non-zero return value simply says "stop parsing input"
and doesn't affect the exit code of the program.

Given that xfs_quota, xfs_io, xfs_db, and xfs_spaceman all use the
same libxcmd infrastructure, we've got to make the changes
consistently across all of those utilities. This will require an
audit of all the commands first to determine if there's anything
special in any of those utilities, then changing all the code, then
testing all the CLI parsing still works correctly.  xfs_quota, as
usual, will be the problem child that causes us lots of pain here.

I'm not planning on doing all this in the near future, so I did the
next best thing that would only affect xfs_io behaviour. i.e.
directly manipulate the exitcode as many of the existing xfs_io
commands already do.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux