Re: [PATCH 5/5] aio: add support for file based polled IO

On 11/18/18 6:57 PM, Dave Chinner wrote:
> On Sat, Nov 17, 2018 at 04:53:17PM -0700, Jens Axboe wrote:
>> Needs further work, but this should work fine on normal setups
>> with a file system on a pollable block device.
> 
> What should work fine? I've got no idea what this patch is actually
> providing....

Sorry, I should have made it clearer that this one is just a test
patch to enable polled aio for files - it's not meant to be considered
complete.

>> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
>> ---
>>  fs/aio.c       | 2 ++
>>  fs/direct-io.c | 4 +++-
>>  fs/iomap.c     | 7 +++++--
>>  3 files changed, 10 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/aio.c b/fs/aio.c
>> index 500da3ffc376..e02085fe10d7 100644
>> --- a/fs/aio.c
>> +++ b/fs/aio.c
>> @@ -1310,6 +1310,8 @@ static struct block_device *aio_bdev_host(struct kiocb *req)
>>  
>>  	if (S_ISBLK(inode->i_mode))
>>  		return I_BDEV(inode);
>> +	else if (inode->i_sb && inode->i_sb->s_bdev)
>> +		return inode->i_sb->s_bdev;
> 
> XFS might be doing AIO to files on real-time device, not
> inode->i_sb->s_bdev. So this may well be the wrong block device
> for the IO being submitted.

See the patch description; XFS doing AIO on a real-time device is not
a "normal setup".

It doesn't work for btrfs either, for instance. As far as I can tell,
there are a few routes we can take here:

1) Add a file system op that returns the bdev (rough sketch of this
   below).
2) Set it after submit time, like we do in other spots. This won't work
   if we straddle devices.

I don't feel that strongly about it, and frankly I'm fine with just
dropping non-bdev support for now. The patch is provided to enable
testing for the 99% of use cases, which is a file system on top of a
single device.
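
For 1), I'm thinking of something along these lines - totally untested,
and the ->iopoll_bdev name is made up. Add a new member to struct
file_operations:

	struct block_device *(*iopoll_bdev)(struct kiocb *);

and have aio_bdev_host() fall back to it:

static struct block_device *aio_bdev_host(struct kiocb *req)
{
	struct inode *inode = file_inode(req->ki_filp);

	if (S_ISBLK(inode->i_mode))
		return I_BDEV(inode);
	/* let the fs map this kiocb to the device that services it */
	if (req->ki_filp->f_op->iopoll_bdev)
		return req->ki_filp->f_op->iopoll_bdev(req);
	return NULL;
}

XFS could then return the realtime device for realtime inodes, like it
already does in xfs_find_bdev_for_inode(). That still doesn't solve IOs
that straddle devices, but it gets the single-device mapping right.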

>> diff --git a/fs/iomap.c b/fs/iomap.c
>> index 74c1f37f0fd6..4cf412b6230a 100644
>> --- a/fs/iomap.c
>> +++ b/fs/iomap.c
>> @@ -1555,6 +1555,7 @@ iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos,
>>  	struct page *page = ZERO_PAGE(0);
>>  	int flags = REQ_SYNC | REQ_IDLE;
>>  	struct bio *bio;
>> +	blk_qc_t qc;
>>  
>>  	bio = bio_alloc(GFP_KERNEL, 1);
>>  	bio_set_dev(bio, iomap->bdev);
>> @@ -1570,7 +1571,9 @@ iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos,
>>  	bio_set_op_attrs(bio, REQ_OP_WRITE, flags);
>>  
>>  	atomic_inc(&dio->ref);
>> -	return submit_bio(bio);
>> +	qc = submit_bio(bio);
>> +	WRITE_ONCE(dio->iocb->ki_blk_qc, qc);
>> +	return qc;
>>  }
> 
> Why is this added to sub-block zeroing IO calls? It gets overwritten
> by the data IO submission, so this value is going to change as the
> IO progresses. What does making these partial IOs visible provide,
> especially as they then get overwritten by the next submissions?
> Indeed, how does one wait on all IOs in the DIO to complete if we
> are only tracking one of many?

That does look like a mistake; the zeroing should just be ignored,
since we only care about the actual IO.
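
IOW, iomap_dio_zero() should just go back to ending with:

	atomic_inc(&dio->ref);
	return submit_bio(bio);

with the now-unused 'qc' local dropped again, so that only the data IO
submission path sets ->ki_blk_qc.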

-- 
Jens Axboe



