----- Original Message -----
> From: "Dave Chinner" <david@xxxxxxxxxxxxx>
> To: "Eric Sandeen" <sandeen@xxxxxxxxxxx>
> Cc: "Eric Sandeen" <sandeen@xxxxxxxxxx>, "Zorro Lang" <zlang@xxxxxxxxxx>, fstests@xxxxxxxxxxxxxxx
> Sent: Wednesday, August 19, 2015 10:42:16 AM
> Subject: Re: [PATCH] xfs/194: fix the exception when run on 4k sector drives
>
> On Tue, Aug 18, 2015 at 06:03:45PM -0500, Eric Sandeen wrote:
> > On 8/18/15 5:43 PM, Dave Chinner wrote:
> > > On Tue, Aug 18, 2015 at 05:33:05PM -0500, Eric Sandeen wrote:
> > >> On 8/18/15 5:28 PM, Dave Chinner wrote:
> > >>> On Wed, Aug 19, 2015 at 01:21:51AM +0800, Zorro Lang wrote:
> > >>>> @@ -50,6 +50,16 @@ rm -f $seqres.full
> > >>>>  # For this test we use block size = 1/8 page size
> > >>>>  pgsize=`$here/src/feature -s`
> > >>>>  blksize=`expr $pgsize / 8`
> > >>>> +secsize=`_min_dio_alignment $SCRATCH_DEV`
> > >>>> +
> > >>>> +# The minimal blksize can't less than sector size, So if
> > >>>> +# blksize < secsize, we should adjust blksize and pgsize number.
> > >>>> +# Of course, if we adjust pgsize, pgsize won't equal to the
> > >>>> +# real page size of system.
> > >>>> +if [ $blksize -lt $secsize ];then
> > >>>> +	blksize=$secsize
> > >>>> +	pgsize=`expr $blksize \* 8`
> > >>>> +fi
> > >>>
> > >>> No, this is wrong. the page size stays fixed at the machine page
> > >>> size. We are testing *sub-page block sizes* here and the sector size
> > >>> must be <= page size. Increasing the "page size" to larger than the
> > >>> machine page size does not make the kernel use larger page sizes.
> > >>>
> > >>> IOWs, if you've got sector size = page size (e.g. 4k sector device)
> > >>> then no matter what you say $pgsize is, the kernel will see a block
> > >>> size = page size test.
> > >>>
> > >>> This whole chunk of code can simply be replaced with:
> > >>>
> > >>> blksize=`_min_dio_alignment $SCRATCH_DEV`
> > >>>
> > >>> Because that's what we actually need to test...
> > >>
> > >> That won't work either, because we could easily get 512 from that.
> > >
> > > If 'blockdev --getss $dev' returns 512, then the device supports 512
> > > byte IOs and so it is fine to do 512 byte IOs in the test.
> > >
> > >> and then this test:
> > >>
> > >> # Now try the same thing but write a sector in the middle of that hole
> > >> # If things go badly stale data will be exposed either side.
> > >> # This is most interesting for block size > 512 (page size > 4096)
> > >>
> > >> # We *should* get:
> > >> # |1100|HHHH|33HH|HHHH|2222|----|----|----|
> > >>
> > >> echo "== Test 4 =="
> > >> xfs_io \
> > >> 	-c "pwrite -S 0x11 -b $pgsize 0 $pgsize" \
> > >> 	-c "mmap -r 0 $blksize" -c "mread 0 $blksize" -c "munmap" \
> > >> 	-c "truncate `expr $blksize / 2`" \
> > >> 	-c "truncate `expr $blksize + 1`" \
> > >> 	-c "pwrite -S 0x22 -b $blksize `expr $pgsize / 2` $blksize" \
> > >> 	-c "pwrite -S 0x33 -b 512 `expr $blksize \* 2` 512" \
> > >> 	-t -d -f $SCRATCH_MNT/testfile4 >> $seqres.full
> > >>
> > >> will be impossible.
> > >>
> > >> AFAICT everything works except for that explicit 512-byte IO.
> > >
> > > Right. That hard coded 512 needs to change to $blksize, because
> > > blksize is now equal to the sector size. I thought this would be
> > > obvious to the reader, so I didn't comment on it.
> >
> > if that last IO is $blksize, and blocksize == sector size, then the
> > test won't be testing what it's designed to test here, i.e. a
> > sub-block direct IO write.
>
> That's not what the test is exercising:
>
> # Test mapping around/over holes for sub-page blocks
>
> it's testing *sub-page block behaviour*, not sub-block direct IO.
>
> >
> > # We *should* get:
> > # |1100|HHHH|33HH|HHHH|2222|----|----|----|
> >              ^^
> >             this
>
> That implies a sub-block sized direct IO, on a single page that has
> 8 blocks. On a 4k page size machine, that is impossible and so most
> of the time we are not doing what the comment implies.
>
> With a 4k page, 512 byte block size:
>
> | xfs_io \
> | 	-c "pwrite -S 0x11 -b $pgsize 0 $pgsize" \
>
> Write an entire page (4k)
>
> # |1111|1111|1111|1111|1111|1111|1111|1111|
>
> | 	-c "mmap -r 0 $blksize" -c "mread 0 $blksize" -c "munmap" \
>
> map the first block (0-511 bytes - one sector)
>
> | 	-c "truncate `expr $blksize / 2`" \
> | 	-c "truncate `expr $blksize + 1`" \
>
> sub-block truncate down, sub-block truncate up, make sure page cache
> is correctly zeroed.
>
> # |1100|HHHH|----|----|----|----|----|----|
>
> | 	-c "pwrite -S 0x22 -b $blksize `expr $pgsize / 2` $blksize" \
>
> DIO write of a single block half way through the original page, make
> sure page cache is flushed correctly before DIO.
>
> # |1100|HHHH|HHHH|HHHH|2222|----|----|----|
>
> FWIW, this write will fail on a 4k sector device on a 4k page size
> platform, because the IO is not sector aligned, and is why the
> original patch needed to multiply pgsize out to 8 * sector size....
>
> | 	-c "pwrite -S 0x33 -b 512 `expr $blksize \* 2` 512" \
>
> do a -minimum sized write- to the *3rd* block in the page.
>
> # |1100|HHHH|3333|HHHH|2222|----|----|----|

Yes, that's true. If the sector size and block size are both 512 bytes,
we will get this. On my test machine (64k page size, 4k sector size),
this test runs mkfs with blksize=8k, and we get:

|1100|HHHH|33HH|HHHH|2222|----|----|----|

(each |----| here represents one 8k block :)

>
> And that matches the expected output. We do not get this output with
> a block size that is anything other than pgsize / 8, regardless of
> whether the last write is a sub-block DIO or not.
>
> IOWs, this test assumes that there are at least 8 blocks to page
> because to exercise the appropriate paths it needs a hole between
> each region that is written. 4k sector/4k page means the kernel
> cannot do sub-page block size operations, and the test does not
> exercise the code paths we're expecting it to.
> Hence it may simply be best to do this:
>
> if [ $sector_size > $page_size / 8 ]; then
> 	_not_run "sector size too large for platform page size"
> fi

If so, that will be simple. I will send a V2 patch for review ;)

Thanks,
Zorro

>
> and replace the hard coded 512 with $sector_size.
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
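
For reference, a minimal sketch of how the suggested guard could be
written so that it actually runs. Inside [ ] a bare ">" is parsed as a
redirection, so the sketch uses -gt with shell arithmetic instead. The
helper name sector_fits_page is hypothetical and not part of fstests;
the size combinations are the ones mentioned in this thread. In a real
test the "not run" branch would call the fstests skip helper rather
than echo.

```shell
# Hypothetical helper (not part of fstests): succeeds when the sector
# size still leaves at least 8 blocks per page, which the test layout
# (a hole between each written region) depends on.
sector_fits_page() {
	local sector_size=$1 page_size=$2

	# -gt compares integers; ">" here would redirect output to a file.
	if [ "$sector_size" -gt $((page_size / 8)) ]; then
		return 1
	fi
	return 0
}

# Sector size / page size pairs discussed in this thread.
for combo in "512 4096" "4096 4096" "4096 65536"; do
	set -- $combo
	if sector_fits_page "$1" "$2"; then
		echo "sector=$1 page=$2: run, blksize=$(($2 / 8))"
	else
		echo "sector=$1 page=$2: not run, sector size too large for platform page size"
	fi
done
```

With this check, the 4k sector / 4k page case is skipped, while the
64k page / 4k sector machine still runs with blksize = pgsize / 8 = 8k.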