On Thu, February 14, 2008 8:32 am, Peter Staubach wrote:
>
> I don't think that this is quite true.  If the file is changed
> when the NFS server is not running, then the value of i_version
> which is used when the NFS server starts up again must be
> different than the value which was previously used when the NFS
> server was previously running.

As I said, the "NFS has seen this i_version" flag needs to be on
stable storage, e.g. the lsb of the i_version.  This will ensure
that any change made after NFSD saw the i_version will cause the
i_version to be updated.  So I think it can provide correct
semantics.

Precise details:

NFSD, when reading i_version:
	take lock
	tmp = i_version
	i_version |= 1
	drop lock
	return tmp & ~1;

VFS, when making any change:
	take lock
	if (i_version & 1) { i_version++; changed = 1 }
	drop lock
	if changed, sync inode

>
> Is the perceived performance hit really going to be as large
> as suspected?  We already update the time fields fairly often
> and we don't pay a huge penalty for those, or at least not a
> penalty that we aren't willing to pay.  Has anyone measured
> the cost?

Correct NFS semantics require that the i_version be written to disk
before (or when) the change is committed.  That means lots more
inodes in the journal.  If you are already doing data=journal, the
hit probably isn't too high(?).

You are right: measuring the cost is important.  However, as we are
designing a generic filesystem interface, we need to understand the
cost on multiple filesystems in a variety of configurations ....
or give the filesystem complete information and let it decide the
optimal implementation.

Giving the filesystem full information means having an
inode_operation "nfsd_reads_version" which returns the number to be
used as the change_id.

NeilBrown
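
For illustration, a minimal user-space sketch of the scheme described
above, assuming a pthread mutex in place of the real inode lock and a
hypothetical sync_inode() stand-in for writing the inode to stable
storage; this is not kernel code, just the same logic in compilable
form.

	/*
	 * Sketch of the "NFSD has seen this i_version" scheme: the lsb of
	 * i_version records that NFSD read the value, so any change made
	 * after that read forces a new, synced version number.
	 * demo_inode, its lock, and sync_inode() are assumptions made for
	 * this example only.
	 */
	#include <stdint.h>
	#include <stdio.h>
	#include <pthread.h>

	struct demo_inode {
		pthread_mutex_t lock;
		uint64_t i_version;
	};

	/* Hypothetical stand-in for writing the inode to stable storage. */
	static void sync_inode(struct demo_inode *inode)
	{
		printf("sync inode: i_version now %llu\n",
		       (unsigned long long)inode->i_version);
	}

	/* NFSD side: read the version and mark it "seen" via the lsb. */
	static uint64_t nfsd_read_version(struct demo_inode *inode)
	{
		uint64_t tmp;

		pthread_mutex_lock(&inode->lock);
		tmp = inode->i_version;
		inode->i_version |= 1;		/* NFSD has seen this value */
		pthread_mutex_unlock(&inode->lock);

		return tmp & ~(uint64_t)1;	/* report the value without the flag */
	}

	/* VFS side: on any change, bump the version only if NFSD saw it. */
	static void vfs_note_change(struct demo_inode *inode)
	{
		int changed = 0;

		pthread_mutex_lock(&inode->lock);
		if (inode->i_version & 1) {
			inode->i_version++;	/* clears the lsb, yields a new value */
			changed = 1;
		}
		pthread_mutex_unlock(&inode->lock);

		if (changed)
			sync_inode(inode);	/* version must reach disk with the change */
	}

	int main(void)
	{
		struct demo_inode inode = { PTHREAD_MUTEX_INITIALIZER, 4 };

		printf("NFSD sees %llu\n",
		       (unsigned long long)nfsd_read_version(&inode));
		vfs_note_change(&inode);	/* first change after a read: synced */
		vfs_note_change(&inode);	/* later changes before next read: no-op */
		printf("NFSD sees %llu\n",
		       (unsigned long long)nfsd_read_version(&inode));
		return 0;
	}

Note the consequence of the flag: only the first change after NFSD
last read the version causes an extra inode sync; further changes
before the next read cost nothing until NFSD looks again.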