Re: generic/399 and xfs_io pwrite command

[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]



On Tue, Dec 05, 2017 at 04:23:43PM +0200, Ari Sundholm wrote:
> On 12/05/2017 12:28 AM, Dave Chinner wrote:
> >On Thu, Nov 30, 2017 at 04:21:47PM +0200, Ari Sundholm wrote:
> >>Hi!
> >>
> >>While debugging an issue, we found out that generic/399 seems to
> >>rely on a behavior that is specific to ext4 in the following section
> >>of code:
> >>----------------8<--------snip--------8<------------------------------
> >>#
> >># Write files of 1 MB of all the same byte until we hit ENOSPC.
> >>Note that we
> >># must not create sparse files, since the contents of sparse files are not
> >># stored on-disk.  Also, we create multiple files rather than one big file
> >># because we want to test for reuse of per-file keys.
> >>#
> >>total_file_size=0
> >>i=1
> >>while true; do
> >>	file=$SCRATCH_MNT/encrypted_dir/file$i
> >>	if ! $XFS_IO_PROG -f $file -c 'pwrite 0 1M' &> $tmp.out; then
> >>		if ! grep -q 'No space left on device' $tmp.out; then
> >>			echo "FAIL: unexpected pwrite failure"
> >>			cat $tmp.out
> >>		elif [ -e $file ]; then
> >>			total_file_size=$((total_file_size + $(stat -c %s $file)))
> >>		fi
> >>		break
> >>	fi
> >>	total_file_size=$((total_file_size + $(stat -c %s $file)))
> >>	i=$((i + 1))
> >>	if [ $i -gt $fs_size_in_mb ]; then
> >>		echo "FAIL: filesystem never filled up!"
> >>		break
> >>	fi
> >>done
> >>----------------8<--------snip--------8<------------------------------
> >>
> >>What happens with ext4 is that the xfs_io command gives a nonzero
> >>exit value not when the pwrite command fails with ENOSPC but during
> >>the *next* iteration when opening the file fails with ENOSPC. Turns
> >>out the pwrite command failing does not cause xfs_io to give a
> >>nonzero exit value.
> >
> >That implies ext4 is returning zero bytes written to the pwrite()
> >call rather than ENOSPC. i.e.:
> >
> >                 bytes = do_pwrite(file->fd, off, cnt, buffersize,
> >                                 pwritev2_flags);
> >                 if (bytes == 0)
> >                         break;
> >                 if (bytes < 0) {
> >                         perror("pwrite");
> >                         return -1;
> >                 }
> >
> >So if it's exiting with no error, then we can't have got an error
> >from ext4 at ENOSPC. If that's the case, it probably should be
> >considered an ext4 bug, not an issue with xfs_io...
> >
> 
> No, according to what we've observed, that is not what happens. The
> pwrite() call does fail and errno is ENOSPC after the call. The
> immediate problem is that xfs_io does not reflect this failure in
> its exit value and thus the check in generic/399 does not work in
> this case. Only when open() fails during the next iteration does
> xfs_io give a nonzero exit value and cause the check in the test
> case to allow the test case to end successfully.

Ah, io/pwrite.c fails to set exitcode on failure iappropriately
before returning. Looks like there is a bunch of xfs_io commands
that fail to do this properly.

> What is specific to ext4 here is, as stated in my original message,
> that open() fails.

Because there are no more inodes to allocate (i.e. inode table is
full) and so attempts to create more fail with ENOSPC. The xfs_io
open command set exitcode appropriately, and that's why it finally
triggers a test abort.

Looks like xfs_io does need fixing to set exitcode appropriately on
all failures...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
--
To unsubscribe from this list: send the line "unsubscribe fstests" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Filesystems Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux