On Tue, Dec 05, 2017 at 04:23:43PM +0200, Ari Sundholm wrote: > On 12/05/2017 12:28 AM, Dave Chinner wrote: > >On Thu, Nov 30, 2017 at 04:21:47PM +0200, Ari Sundholm wrote: > >>Hi! > >> > >>While debugging an issue, we found out that generic/399 seems to > >>rely on a behavior that is specific to ext4 in the following section > >>of code: > >>----------------8<--------snip--------8<------------------------------ > >># > >># Write files of 1 MB of all the same byte until we hit ENOSPC. > >>Note that we > >># must not create sparse files, since the contents of sparse files are not > >># stored on-disk. Also, we create multiple files rather than one big file > >># because we want to test for reuse of per-file keys. > >># > >>total_file_size=0 > >>i=1 > >>while true; do > >> file=$SCRATCH_MNT/encrypted_dir/file$i > >> if ! $XFS_IO_PROG -f $file -c 'pwrite 0 1M' &> $tmp.out; then > >> if ! grep -q 'No space left on device' $tmp.out; then > >> echo "FAIL: unexpected pwrite failure" > >> cat $tmp.out > >> elif [ -e $file ]; then > >> total_file_size=$((total_file_size + $(stat -c %s $file))) > >> fi > >> break > >> fi > >> total_file_size=$((total_file_size + $(stat -c %s $file))) > >> i=$((i + 1)) > >> if [ $i -gt $fs_size_in_mb ]; then > >> echo "FAIL: filesystem never filled up!" > >> break > >> fi > >>done > >>----------------8<--------snip--------8<------------------------------ > >> > >>What happens with ext4 is that the xfs_io command gives a nonzero > >>exit value not when the pwrite command fails with ENOSPC but during > >>the *next* iteration when opening the file fails with ENOSPC. Turns > >>out the pwrite command failing does not cause xfs_io to give a > >>nonzero exit value. > > > >That implies ext4 is returning zero bytes written to the pwrite() > >call rather than ENOSPC. i.e.: > > > > bytes = do_pwrite(file->fd, off, cnt, buffersize, > > pwritev2_flags); > > if (bytes == 0) > > break; > > if (bytes < 0) { > > perror("pwrite"); > > return -1; > > } > > > >So if it's exiting with no error, then we can't have got an error > >from ext4 at ENOSPC. If that's the case, it probably should be > >considered an ext4 bug, not an issue with xfs_io... > > > > No, according to what we've observed, that is not what happens. The > pwrite() call does fail and errno is ENOSPC after the call. The > immediate problem is that xfs_io does not reflect this failure in > its exit value and thus the check in generic/399 does not work in > this case. Only when open() fails during the next iteration does > xfs_io give a nonzero exit value and cause the check in the test > case to allow the test case to end successfully. Ah, io/pwrite.c fails to set exitcode on failure iappropriately before returning. Looks like there is a bunch of xfs_io commands that fail to do this properly. > What is specific to ext4 here is, as stated in my original message, > that open() fails. Because there are no more inodes to allocate (i.e. inode table is full) and so attempts to create more fail with ENOSPC. The xfs_io open command set exitcode appropriately, and that's why it finally triggers a test abort. Looks like xfs_io does need fixing to set exitcode appropriately on all failures... Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx -- To unsubscribe from this list: send the line "unsubscribe fstests" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html