On 21 Jul 2016, at 9:20, Trond Myklebust wrote:
On Jul 21, 2016, at 09:05, Benjamin Coddington <bcodding@xxxxxxxxxx>
wrote:
So back to Christoph's point earlier:
On 17 Jul 2016, at 23:48, Christoph Hellwig wrote:
This one breaks xfstests generic/207 on block/scsi layout for me.
The
reason for that is that we need a layoutcommit after writing out all
data for the file for the file size to be updated on the server.
You responded:
On 18 Jul 2016, at 0:32, Trond Myklebust wrote:
I’m not understanding this argument. Why do we care if the file
size is up
to date on the server if we’re not sending an actual GETATTR on
the wire
to retrieve the file size?
I guess the answer might be because we can get it back from the last
LAYOUTCOMMIT.
The patch that I followed up with should now ensure that we do not
mark the attribute cache as up to date if there is a LAYOUTCOMMIT
pending.
IOW: when the pNFS write is done, it is expected to do 2 things:
1) mark the inode for LAYOUTCOMMIT
2) mark the attribute cache as invalid (because we know the change
attribute, mtime, ctime need to be updates)
In the case of blocks pNFS write:
The call to pnfs_set_layoutcommit() in pnfs_ld_write_done() should
take care of (1)
The call to nfs_writeback_update_inode() in nfs4_write_done_cb()
should take care of (2).
Provided that these 2 calls are performed in the above order, then any
call to nfs_getattr() which has not been preceded by a call to
nfs4_proc_layoutcommit() should trigger the call to
__nfs_revalidate_inode().
I think the problem is that a following nfs_getattr() will fail to
notice
the size change in the case of a write_completion and layoutcommit
occuring
after nfs_getattr() has done pnfs_sync_inode() but before it has done
nfs_update_inode().
In the failing case there are two threads one is doing writes, the other
doing lstat on aio_complete via io_getevents(2).
For each write completion the lstat thread tries to verify the file
size.
GETATTR Thread LAYOUTCOMMIT Thread
-------------- --------------------
write_completion sets LAYOUTCOMMIT
(4096@0)
--> nfs_getattr
__nfs_revalidate_inode
pnfs_sync_inode
getattr sees 4096
write_completion sets LAYOUTCOMMIT
(4096@4096)
sets LAYOUTCOMMITING
clears LAYOUTCOMMIT
clears LAYOUTCOMMITTING
nfs_refresh_inode
nfs_update_inode size is 4096
<-- nfs_getattr
At this point the cached attributes are seen as up to date, but
aio-dio-extend-stat program expects that second write_completion to
reflect
in the file size.
Ben
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html