Hi Brian, Please find my reply below inline. On Fri, Apr 23, 2021 at 7:07 PM Brian Foster <bfoster@xxxxxxxxxx> wrote: > > On Thu, Apr 22, 2021 at 11:31:47PM +0800, bxue@xxxxxxxxxx wrote: > > From: Boyang Xue <bxue@xxxxxxxxxx> > > > > On ext2/ext3, it's expected that several single block metadata reads can occur > > when writing to file in the same cgroup (the stack is like below[1]). The > > purpose of the "write -> read/write" subtest is to make sure the larger pwrite > > is accounted to the correct cgroup, not necessarily enforce that zero bytes are > > read in service of the write. This patch fixes the sub-test in order to tolerate > > small reads in 1st cgroup. > > > > [1] Callchain of the read: > > > > @ext3_read_bio[ > > submit_bio+1 > > submit_bh_wbc+365 > > ext4_read_bh+72 > > ext4_get_branch+201 > > ext4_ind_map_blocks+382 > > ext4_map_blocks+295 > > _ext4_get_block+170 > > __block_write_begin_int+328 > > ext4_write_begin+541 > > generic_perform_write+213 > > ext4_buffered_write_iter+167 > > new_sync_write+345 > > vfs_write+438 > > __x64_sys_pwrite64+140 > > do_syscall_64+51 > > entry_SYSCALL_64_after_hwframe+68 > > , 5793, 12]: 3 > > > > Signed-off-by: Boyang Xue <bxue@xxxxxxxxxx> > > --- > > Hi, > > > > This patch fix the "write -> read/write" sub-test in order to tolerate > > small reads in service of the write (like read metadata). > > > > Change from v1: > > (1) More details in commit log, including example call stack > > (2) Set the fixed tolerance value to 33792 for accuracy > > (3) Update percentage tolerance value to fixed value 0, where doesn't > > fail the test > > > > Tested pass on ext2/ext3/ext4 x 1k/2k/4k blksize. > > > > Thanks, > > Boyang > > > > tests/generic/563 | 20 +++++++++++++------- > > 1 file changed, 13 insertions(+), 7 deletions(-) > > > > diff --git a/tests/generic/563 b/tests/generic/563 > > index b113eacf..44394b4b 100755 > > --- a/tests/generic/563 > > +++ b/tests/generic/563 > > @@ -60,6 +60,8 @@ check_cg() > > cgname=$(basename $cgroot) > > expectedread=$2 > > expectedwrite=$3 > > + readtol=$4 > > + writetol=$5 > > rbytes=0 > > wbytes=0 > > > > @@ -71,8 +73,8 @@ check_cg() > > awk -F = '{ print $2 }'` > > fi > > > > - _within_tolerance "read" $rbytes $expectedread 5% -v > > - _within_tolerance "write" $wbytes $expectedwrite 5% -v > > + _within_tolerance "read" $rbytes $expectedread $readtol -v > > + _within_tolerance "write" $wbytes $expectedwrite $writetol -v > > } > > > > # Move current process to another cgroup. > > @@ -113,7 +115,7 @@ $XFS_IO_PROG -c "pread 0 $iosize" -c "pwrite 0 $iosize" -c fsync \ > > $SCRATCH_MNT/file >> $seqres.full 2>&1 > > switch_cg $cgdir > > $XFS_IO_PROG -c fsync $SCRATCH_MNT/file > > -check_cg $cgdir/$seq-cg $iosize $iosize > > +check_cg $cgdir/$seq-cg $iosize $iosize 5% 5% Here the write tolerance has to be 5% rather than 0, since testing with ext2, the write bytes are slightly larger than $iosize (8392704 vs 8388608). > > > > # Write from one cgroup then read and write from a second. Writes are charged to > > # the first group and nothing to the second. > > @@ -126,8 +128,12 @@ $XFS_IO_PROG -c "pread 0 $iosize" -c "pwrite 0 $iosize" $SCRATCH_MNT/file \ > > >> $seqres.full 2>&1 > > switch_cg $cgdir > > $XFS_IO_PROG -c fsync $SCRATCH_MNT/file > > -check_cg $cgdir/$seq-cg 0 $iosize > > -check_cg $cgdir/$seq-cg-2 0 0 > > +# Use a fixed value tolerance for the expected value of zero here > > +# because filesystems might perform a small number of metadata reads to > > +# complete the write. On ext2/3 with 1k block size, the read bytes is > > +# as large as 33792. > > +check_cg $cgdir/$seq-cg 0 $iosize 33792 0 > > Shouldn't that last parameter (write tolerance) remain as 5%? In my test on ext2/3/4 with 1k/2k/4k, the write tolerance here always equals $iosize in this sub-test. So make it 0 rather 5%, I hope it will gate future behavior change more sensitively. > > > +check_cg $cgdir/$seq-cg-2 0 0 0 0 > > > > # Read from one cgroup, read & write from a second. Both reads and writes are > > # charged to the first group and nothing to the second. > > @@ -140,8 +146,8 @@ $XFS_IO_PROG -c "pread 0 $iosize" -c "pwrite 0 $iosize" $SCRATCH_MNT/file \ > > >> $seqres.full 2>&1 > > switch_cg $cgdir > > $XFS_IO_PROG -c fsync $SCRATCH_MNT/file > > -check_cg $cgdir/$seq-cg $iosize $iosize > > -check_cg $cgdir/$seq-cg-2 0 0 > > +check_cg $cgdir/$seq-cg $iosize $iosize 5% 0 > > And here too? Otherwise the patch LGTM. The same reason as above. If the write tolerance mismatch confuses future readers, I can put a comment to explain. Maybe a single comment like this? ``` # Read and write from a single group. echo "read/write" reset switch_cg $cgdir/$seq-cg $XFS_IO_PROG -c "pread 0 $iosize" -c "pwrite 0 $iosize" -c fsync \ $SCRATCH_MNT/file >> $seqres.full 2>&1 switch_cg $cgdir $XFS_IO_PROG -c fsync $SCRATCH_MNT/file # write bytes is slightly larger than $iosize on ext2 <=== new comment check_cg $cgdir/$seq-cg $iosize $iosize 5% 5% ``` Thanks for the review! -Boyang > Brian > > > +check_cg $cgdir/$seq-cg-2 0 0 0 0 > > > > echo "-io" > $cgdir/cgroup.subtree_control || _fail "subtree control" > > > > -- > > 2.27.0 > > >