On Sat, Dec 20, 2014 at 03:25:01PM +0800, Xing Gu wrote:
> This case tests truncate/collapse range race. If
> the race occurs, it will trigger BUG_ON.
>
> Signed-off-by: Xing Gu <gux.fnst@xxxxxxxxxxxxxx>
> ---

What changed from the previous version?

...
> +rm -f $seqres.full
> +_scratch_mkfs >>$seqres.full 2>&1
> +_scratch_mount
> +
> +old_bug=`dmesg | grep -c "kernel BUG"`
> +
> +testfile=$SCRATCH_MNT/file.$seq
> +# fcollapse/truncate continuously and simultaneously a same file
> +for ((i=1; i <= 100; i++)); do
> +	for ((i=1; i <= 1000; i++)); do
> +		$XFS_IO_PROG -f -c 'truncate 100k' $testfile 2>> $seqres.full
> +		$XFS_IO_PROG -f -c 'fcollapse 0 16k' $testfile 2>> $seqres.full
> +	done &
> +	for ((i=1; i <= 1000; i++)); do
> +		$XFS_IO_PROG -f -c 'truncate 0' $testfile 2>> $seqres.full
> +	done &
> +done

The previous version of this ran a loop for 3 minutes, which we
talked about being too long. This loop forks 300,000 processes and
generates a 1.5MB $seqres.full file. On my single CPU test VM it
takes:

generic/039	 302s

That's about 5 minutes to run, so it takes longer than the 3 minute
version of the same test we said was too long. FYI, my 16p test VM
still takes 35s to crunch through this test, and it pegs all 16 CPUs
at 100% usage.

We don't need to record the output of the xfs_io commands, so
avoiding a fork and throwing away the output, such as:

	$XFS_IO_PROG -f -c 'truncate 100k' \
			-c 'fcollapse 0 16k' \
			$testfile > /dev/null 2>&1

makes the runtime drop by 40% (to 22s) on the 16p VM and by 33% (to
200s) on the single CPU VM. But that's still too long on the smaller
CPU systems.

I think the loop iterations need to be tuned to the number of CPUs
in the system. This:

NCPUS=`$here/src/feature -o`
OUTER_LOOPS=$((10 * $NCPUS * $LOAD_FACTOR))
INNER_LOOPS=$((50 * $NCPUS * $LOAD_FACTOR))

plus the above xfs_io optimisations give a runtime of 3s on my 1p
machine and 30s on my 16p machine. That would be more acceptable to
everyone, I think.

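To make the scaling concrete, here is a minimal standalone sketch of
how the loop counts grow with CPU count and load factor. It is an
illustration, not the test itself: scale_loops is a hypothetical
helper, and getconf is assumed as a stand-in for `$here/src/feature -o`
so the snippet can run outside the fstests harness.

```shell
#!/bin/bash
# Sketch only: scale_loops is a hypothetical helper, not part of fstests.
# Usage: scale_loops BASE NCPUS LOAD_FACTOR
scale_loops() {
	local base=$1 ncpus=$2 load=$3
	echo $((base * ncpus * load))
}

# Assumption: getconf replaces `$here/src/feature -o` outside fstests;
# fall back to 1 CPU if it is unavailable.
NCPUS=${NCPUS:-$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)}
# Assumption: default to 1 when the harness has not set LOAD_FACTOR.
LOAD_FACTOR=${LOAD_FACTOR:-1}

OUTER_LOOPS=$(scale_loops 10 "$NCPUS" "$LOAD_FACTOR")
INNER_LOOPS=$(scale_loops 50 "$NCPUS" "$LOAD_FACTOR")
echo "ncpus=$NCPUS outer=$OUTER_LOOPS inner=$INNER_LOOPS"
```

With one CPU and LOAD_FACTOR=1 this gives 10 outer and 50 inner
iterations; with 16 CPUs it gives 160 and 800.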
> +wait
> +
> +new_bug=`dmesg | grep -c "kernel BUG"`
> +if [ $new_bug -ne $old_bug ]; then
> +	_fail "kernel bug detected, check dmesg for more infomation."
> +fi

A kernel bug in a process with an open file descriptor will make
the filesystem impossible to unmount. It will hang the test and
require a reboot. Hence there's no point in checking dmesg for a bug
message, as it will be noticed by the test failing to complete.

> +status=0
> +exit
> diff --git a/tests/generic/039.out b/tests/generic/039.out
> new file mode 100644
> index 0000000..0cacac7
> --- /dev/null
> +++ b/tests/generic/039.out
> @@ -0,0 +1 @@
> +QA output created by 039

The test needs to echo something to indicate that an empty golden
output file is expected. "Silence is golden" is the usual phrase
here....

> 036 auto aio rw stress
> 037 metadata auto quick
> 038 auto stress
> +039 auto metadata rw

With the addition of $LOAD_FACTOR, this can be added to the stress
group as well.

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx