Re: [PATCH] xfs/194: fix the exception when run on 4k sector drives

----- Original Message -----
> From: "Dave Chinner" <david@xxxxxxxxxxxxx>
> To: "Eric Sandeen" <sandeen@xxxxxxxxxxx>
> Cc: "Eric Sandeen" <sandeen@xxxxxxxxxx>, "Zorro Lang" <zlang@xxxxxxxxxx>, fstests@xxxxxxxxxxxxxxx
> Sent: Wednesday, August 19, 2015 10:42:16 AM
> Subject: Re: [PATCH] xfs/194: fix the exception when run on 4k sector drives
> 
> On Tue, Aug 18, 2015 at 06:03:45PM -0500, Eric Sandeen wrote:
> > On 8/18/15 5:43 PM, Dave Chinner wrote:
> > > On Tue, Aug 18, 2015 at 05:33:05PM -0500, Eric Sandeen wrote:
> > >> On 8/18/15 5:28 PM, Dave Chinner wrote:
> > >>> On Wed, Aug 19, 2015 at 01:21:51AM +0800, Zorro Lang wrote:
> > >>>> @@ -50,6 +50,16 @@ rm -f $seqres.full
> > >>>>  # For this test we use block size = 1/8 page size
> > >>>>  pgsize=`$here/src/feature -s`
> > >>>>  blksize=`expr $pgsize / 8`
> > >>>> +secsize=`_min_dio_alignment $SCRATCH_DEV`
> > >>>> +
> > >>>> +# The minimal blksize can't less than sector size, So if
> > >>>> +# blksize < secsize, we should adjust blksize and pgsize number.
> > >>>> +# Of course, if we adjust pgsize, pgsize won't equal to the
> > >>>> +# real page size of system.
> > >>>> +if [ $blksize -lt $secsize ];then
> > >>>> +        blksize=$secsize
> > >>>> +        pgsize=`expr $blksize \* 8`
> > >>>> +fi
> > >>>
> > >>> No, this is wrong. the page size stays fixed at the machine page
> > >>> size. We are testing *sub-page block sizes* here and the sector size
> > >>> must be <= page size. Increasing the "page size" to larger than the
> > >>> machine page size does not make the kernel use larger page sizes.
> > >>>
> > >>> IOWs, if you've got sector size = page size (e.g. 4k sector device)
> > >>> then no matter what you say $pgsize is, the kernel will see a block
> > >>> size = page size test.
> > >>>
> > >>> This whole chunk of code can simply be replaced with:
> > >>>
> > >>> blksize=`_min_dio_alignment $SCRATCH_DEV`
> > >>>
> > >>> Because that's what we actually need to test...
> > >>
> > >> That won't work either, because we could easily get 512 from that.
> > > 
> > > If 'blockdev --getss $dev' returns 512, then the device supports 512
> > > byte IOs and so it is fine to do 512 byte IOs in the test.
> > > 
> > >> and then this test:
> > >>
> > >> # Now try the same thing but write a sector in the middle of that hole
> > >> # If things go badly stale data will be exposed either side.
> > >> # This is most interesting for block size > 512 (page size > 4096)
> > >>
> > >> # We *should* get:
> > >> # |1100|HHHH|33HH|HHHH|2222|----|----|----|
> > >>
> > >> echo "== Test 4 =="
> > >> xfs_io \
> > >> -c "pwrite -S 0x11 -b $pgsize 0 $pgsize" \
> > >> -c "mmap -r 0 $blksize" -c "mread 0 $blksize" -c "munmap" \
> > >> -c "truncate `expr $blksize / 2`" \
> > >> -c "truncate `expr $blksize + 1`" \
> > >> -c "pwrite -S 0x22 -b $blksize `expr $pgsize / 2` $blksize" \
> > >> -c "pwrite -S 0x33 -b 512 `expr $blksize \* 2` 512" \
> > >> -t -d -f $SCRATCH_MNT/testfile4 >> $seqres.full
> > >>
> > >> will be impossible.
> > >>
> > >> AFAICT everything works except for that explicit 512-byte IO.
> > > 
> > > Right. That hard coded 512 needs to change to $blksize, because
> > > blksize is now equal to the sector size. I thought this would be
> > > obvious to the reader, so I didn't comment on it.
> > 
> > if that last IO is $blksize, and blocksize == sector size, then the
> > test won't be testing what it's designed to test here, i.e. a
> > sub-block direct IO write.
> 
> That's not what the test is exercising:
> 
> # Test mapping around/over holes for sub-page blocks
> 
> it's testing *sub-page block behaviour*, not sub-block direct IO.
> 
> > 
> > # We *should* get:
> > # |1100|HHHH|33HH|HHHH|2222|----|----|----|
> >              ^^
> >              this
> 
> That implies a sub-block sized direct IO, on a single page that has
> 8 blocks. On a 4k page size machine, that is impossible and so most
> of the time we are not doing what the comment implies.
> 
> With a 4k page, 512 byte block size:
> 
> | xfs_io \
> | -c "pwrite -S 0x11 -b $pgsize 0 $pgsize" \
> 
> Write an entire page (4k)
> 
> # |1111|1111|1111|1111|1111|1111|1111|1111|
> 
> | -c "mmap -r 0 $blksize" -c "mread 0 $blksize" -c "munmap" \
> 
> map the first block (0-511 bytes - one sector)
> 
> | -c "truncate `expr $blksize / 2`" \
> | -c "truncate `expr $blksize + 1`" \
> 
> sub-block truncate down, sub-block truncate up, make sure page cache
> is correctly zeroed.
> 
> # |1100|HHHH|----|----|----|----|----|----|
> 
> | -c "pwrite -S 0x22 -b $blksize `expr $pgsize / 2` $blksize" \
> 
> DIO write of a single block half way through the original page, make
> sure page cache is flushed correctly before DIO.
> 
> # |1100|HHHH|HHHH|HHHH|2222|----|----|----|
> 
> FWIW, this write will fail on a 4k sector device on a 4k page size
> platform, because the IO is not sector aligned, and is why the
> original patch needed to multiply pgsize out to 8 * sector size....
> 
> | -c "pwrite -S 0x33 -b 512 `expr $blksize \* 2` 512" \
> 
> do a -minimum sized write- to the *3rd* block in the page.
> 
> # |1100|HHHH|3333|HHHH|2222|----|----|----|

Yes, that's true. If the sector size and block size are both 512 bytes, we will get this.

On my test machine (64k page size, 4k sector size), this test runs
mkfs with blksize=8k, and we get:

|1100|HHHH|33HH|HHHH|2222|----|----|----|
(each |----| cell here represents one 8k block)
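
To make the arithmetic behind that layout explicit, here is a rough
sketch. The values are just the ones from my machine, and it assumes the
final direct IO write is one sector long rather than the hard-coded 512
bytes; the variable names other than pgsize and blksize are made up for
illustration:

```shell
# Illustrative only: pgsize/blksize mirror the test, the values are assumed.
pgsize=65536             # 64k page
secsize=4096             # 4k sector
blksize=$((pgsize / 8))  # block size = 1/8 page size, as in the test

# Offset and length of the final direct IO write, assuming the
# hard-coded 512 is replaced by the sector size.
write_off=$((blksize * 2))
write_len=$secsize

# Which block (0-based) the write lands in, and how much of it is covered.
block_index=$((write_off / blksize))
covered_pct=$((write_len * 100 / blksize))

echo "blksize=$blksize block=$block_index covered=${covered_pct}%"
```

So the write lands in the third block and covers only half of it, which
is why that cell reads 33HH instead of 3333.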

> 
> And that matches the expected output. We do not get this output with
> a block size that is anything other than pgsize / 8, regardless of
> whether the last write is a sub-block DIO or not.
> 
> IOWs, this test assumes that there are at least 8 blocks to page
> because to exercise the appropriate paths it needs a hole between
> each region that is written.  4k sector/4k page means the kernel
> cannot do sub-page block size operations, and the test does not
> exercise the code paths we're expecting it to. Hence it may simply
> be best to do this:
> 
> if [ $sector_size > $page_size / 8 ]; then
> 	_not_run "sector size too large for platform page size"
> fi

If so, that'll be simple. I will send a V2 patch for review ;)
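
A minimal sketch of how that check might look as runnable shell. Note
that inside [ ] a bare > is a redirection, so an integer comparison with
-gt is used here. The function name and its arguments are hypothetical;
in the real test the message would be passed to _not_run instead of
being echoed:

```shell
# Hypothetical helper, not part of fstests.
# $1 = sector size in bytes, $2 = page size in bytes.
sector_fits_subpage_test() {
	local sector_size=$1
	local page_size=$2

	# The test needs 8 blocks per page, so the smallest usable block
	# (one sector) must not be larger than page_size / 8.
	if [ "$sector_size" -gt $((page_size / 8)) ]; then
		echo "sector size too large for platform page size"
		return 1
	fi
	return 0
}
```

For example, "sector_fits_subpage_test 4096 4096" fails (4k sector on a
4k page), while "sector_fits_subpage_test 4096 65536" passes.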

Thanks,
Zorro

> 
> and replace the hard coded 512 with $sector_size.
> 
> Cheers,
> 
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
> 