Hi, On 2019/1/24 15:25, Dominique Martinet wrote: > Hou Tao wrote on Thu, Jan 24, 2019: >> When v9fs_stat2inode() is invoked concurrently under 32-bit case >> and CONFIG_LBDAF is enabled, multiple updates of inode->i_blocks >> may corrupt the value of i_blocks because the assignment of 64-bit >> value under 32-bit host is not atomic. >> >> Fix it by using i_lock to protect update of inode->i_blocks. Also >> Skip the update when it's not requested. >> >> Suggested-by: Dominique Martinet <asmadeus@xxxxxxxxxxxxx> >> Signed-off-by: Hou Tao <houtao1@xxxxxxxxxx> >> --- >> fs/9p/v9fs_vfs.h | 10 ++++++++++ >> fs/9p/vfs_inode.c | 7 ++++--- >> fs/9p/vfs_inode_dotl.c | 16 +++++++++------- >> 3 files changed, 23 insertions(+), 10 deletions(-) >> >> diff --git a/fs/9p/v9fs_vfs.h b/fs/9p/v9fs_vfs.h >> index 3d371b9e461a..0bd4acfdb6af 100644 >> --- a/fs/9p/v9fs_vfs.h >> +++ b/fs/9p/v9fs_vfs.h >> @@ -101,4 +101,14 @@ static inline void v9fs_i_size_write(struct inode *inode, loff_t i_size) >> if (sizeof(i_size) > sizeof(long)) >> spin_unlock(&inode->i_lock); >> } >> + >> +static inline void v9fs_i_blocks_write(struct inode *inode, blkcnt_t i_blocks) >> +{ >> + /* Avoid taking i_lock under 64-bits */ >> + if (sizeof(i_blocks) > sizeof(long)) >> + spin_lock(&inode->i_lock); >> + inode->i_blocks = i_blocks; >> + if (sizeof(i_blocks) > sizeof(long)) >> + spin_unlock(&inode->i_lock); >> +} >> #endif >> diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c >> index 72b779bc0942..f05ff3cfbfe7 100644 >> --- a/fs/9p/vfs_inode.c >> +++ b/fs/9p/vfs_inode.c >> @@ -1218,10 +1218,11 @@ v9fs_stat2inode(struct p9_wstat *stat, struct inode *inode, >> mode |= inode->i_mode & ~S_IALLUGO; >> inode->i_mode = mode; >> >> - if (!(flags & V9FS_STAT2INODE_KEEP_ISIZE)) >> + if (!(flags & V9FS_STAT2INODE_KEEP_ISIZE)) { >> v9fs_i_size_write(inode, stat->length); >> - /* not real number of blocks, but 512 byte ones ... */ >> - inode->i_blocks = (stat->length + 512 - 1) >> 9; >> + /* not real number of blocks, but 512 byte ones ... */ >> + v9fs_i_blocks_write(inode, (stat->length + 512 - 1) >> 9); > > hmm, part of me wants to abuse v9fs_i_size_write to also update i_blocks > as I think we'd usually want to keep both in sync.We can do that for v9fs_stat2inode(), but for v9fs_stat2inode_dotl() the updates of i_block & i_size are controlled by different flags, so separated updates are needed (maybe also add an extra flags ?). > > In v9fs_file_write_iter we do not update i_blocks directly but > inode_add_bytes basically does the same thing... Speaking of which it's > assuming inode->i_bytes is set correctly initially; I _think_ we're > supposed to call inode_set_bytes() instead of writing i_blocks directly > in stat2inode? > Sorry, I'm not too familiar with this code and there doesn't seem to be > many users of inode_set_bytes so it's a bit puzzling to me. > In my understanding, now only disk quota uses i_bytes, so for 9p there is not so much different between updating i_blocks directly and using inode_set_bytes(), but maybe using inode_set_bytes() will be appropriate. > I'm not using cache mode much but I have some very weird behaviour > currently if I mount something with cache=fscache : > - if I create a new file from 9p and stat on the same client, a size of > 0-511 has block = 0 then it's just always off by 1 Is this not the expected result ? v9fs_write_end() will call inode_add_bytes(inode,511) if we write 511 bytes to the new file and i_blocks will be 0. > - if I create a file outside of 9p and stat it, for example starting > with 400 bytes, it starts with 8 blocks then the number of blocks > increase by 1 everytime 512 bytes are written regardless of initial > size. > The initial i_blocks comes from the underlying filesystem, so I think 8 blocks just mean the block size of the underlying filesystem is 4KB. If another 512 bytes are written into the file, inode_add_bytes() will increase i_blocks by one instead of fetching i_blocks from server, though i_blocks in server side will still be 8. Maybe we should call v9fs_invalidate_inode_attr() in v9fs_write_end() ? > Let's try to go back to some decent behaviour there first, and then > we can decide if it still make sense to separate v9fs_i_blocks_write and > v9fs_i_size_write or not. > I don't follow that. Could you elaborate ? Regards, Tao