Re: [PATCH v3 2/2] 9p: use i_lock to protect update of inode->i_blocks under 32-bit

Hou Tao <houtao1@xxxxxxxxxx> · Thu, 24 Jan 2019 22:03:30 +0800



Hi,

On 2019/1/24 15:25, Dominique Martinet wrote:
> Hou Tao wrote on Thu, Jan 24, 2019:
>> When v9fs_stat2inode() is invoked concurrently under 32-bit case
>> and CONFIG_LBDAF is enabled, multiple updates of inode->i_blocks
>> may corrupt the value of i_blocks because the assignment of 64-bit
>> value under 32-bit host is not atomic.
>>
>> Fix it by using i_lock to protect update of inode->i_blocks. Also
>> Skip the update when it's not requested.
>>
>> Suggested-by: Dominique Martinet <asmadeus@xxxxxxxxxxxxx>
>> Signed-off-by: Hou Tao <houtao1@xxxxxxxxxx>
>> ---
>>  fs/9p/v9fs_vfs.h       | 10 ++++++++++
>>  fs/9p/vfs_inode.c      |  7 ++++---
>>  fs/9p/vfs_inode_dotl.c | 16 +++++++++-------
>>  3 files changed, 23 insertions(+), 10 deletions(-)
>>
>> diff --git a/fs/9p/v9fs_vfs.h b/fs/9p/v9fs_vfs.h
>> index 3d371b9e461a..0bd4acfdb6af 100644
>> --- a/fs/9p/v9fs_vfs.h
>> +++ b/fs/9p/v9fs_vfs.h
>> @@ -101,4 +101,14 @@ static inline void v9fs_i_size_write(struct inode *inode, loff_t i_size)
>>  	if (sizeof(i_size) > sizeof(long))
>>  		spin_unlock(&inode->i_lock);
>>  }
>> +
>> +static inline void v9fs_i_blocks_write(struct inode *inode, blkcnt_t i_blocks)
>> +{
>> +	/* Avoid taking i_lock under 64-bits */
>> +	if (sizeof(i_blocks) > sizeof(long))
>> +		spin_lock(&inode->i_lock);
>> +	inode->i_blocks = i_blocks;
>> +	if (sizeof(i_blocks) > sizeof(long))
>> +		spin_unlock(&inode->i_lock);
>> +}
>>  #endif
>> diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
>> index 72b779bc0942..f05ff3cfbfe7 100644
>> --- a/fs/9p/vfs_inode.c
>> +++ b/fs/9p/vfs_inode.c
>> @@ -1218,10 +1218,11 @@ v9fs_stat2inode(struct p9_wstat *stat, struct inode *inode,
>>  	mode |= inode->i_mode & ~S_IALLUGO;
>>  	inode->i_mode = mode;
>>  
>> -	if (!(flags & V9FS_STAT2INODE_KEEP_ISIZE))
>> +	if (!(flags & V9FS_STAT2INODE_KEEP_ISIZE)) {
>>  		v9fs_i_size_write(inode, stat->length);
>> -	/* not real number of blocks, but 512 byte ones ... */
>> -	inode->i_blocks = (stat->length + 512 - 1) >> 9;
>> +		/* not real number of blocks, but 512 byte ones ... */
>> +		v9fs_i_blocks_write(inode, (stat->length + 512 - 1) >> 9);
> 
> hmm, part of me wants to abuse v9fs_i_size_write to also update i_blocks
> as I think we'd usually want to keep both in sync.We can do that for v9fs_stat2inode(), but for v9fs_stat2inode_dotl()
the updates of i_block & i_size are controlled by different flags, so
separated updates are needed (maybe also add an extra flags ?).

> 
> In v9fs_file_write_iter we do not update i_blocks directly but
> inode_add_bytes basically does the same thing... Speaking of which it's
> assuming inode->i_bytes is set correctly initially; I _think_ we're
> supposed to call inode_set_bytes() instead of writing i_blocks directly
> in stat2inode?
> Sorry, I'm not too familiar with this code and there doesn't seem to be
> many users of inode_set_bytes so it's a bit puzzling to me.
> 
In my understanding, now only disk quota uses i_bytes, so for 9p there is not so
much different between updating i_blocks directly and using inode_set_bytes(),
but maybe using inode_set_bytes() will be appropriate.


> I'm not using cache mode much but I have some very weird behaviour
> currently if I mount something with cache=fscache :
>  - if I create a new file from 9p and stat on the same client, a size of
>  0-511 has block = 0 then it's just always off by 1
Is this not the expected result ? v9fs_write_end() will call inode_add_bytes(inode,511)
if we write 511 bytes to the new file and i_blocks will be 0.

>  - if I create a file outside of 9p and stat it, for example starting
>  with 400 bytes, it starts with 8 blocks then the number of blocks
>  increase by 1 everytime 512 bytes are written regardless of initial
>  size.
> 
The initial i_blocks comes from the underlying filesystem, so I think 8 blocks
just mean the block size of the underlying filesystem is 4KB. If another 512 bytes
are written into the file, inode_add_bytes() will increase i_blocks by one instead of
fetching i_blocks from server, though i_blocks in server side will still be 8. Maybe
we should call v9fs_invalidate_inode_attr() in v9fs_write_end() ?

> Let's try to go back to some decent behaviour there first, and then
> we can decide if it still make sense to separate v9fs_i_blocks_write and
> v9fs_i_size_write or not.
> 
I don't follow that. Could you elaborate ?

Regards,
Tao