On Tue, Jan 05, 2016 at 06:36:04PM +0100, Jan Kara wrote:
>   Hi,
>
> On Mon 04-01-16 17:22:19, Dave Chinner wrote:
> > I've been looking at implementing the lazytime mount option for XFS,
> > and I'm struggling to work out what it is supposed to mean.
> >
> > AFAICT, on ext4, lazytime means that pure timestamp updates are not
> > journalled and they are only ever written back when the inode is
> > otherwise dirtied and written, or they are timestamp dirty for 24
> > hours which triggers writeback.
> >
> > This poses a couple of problems for XFS:
> >
> >     1. we log every timestamp change, so there is no mechanism
> >        for delayed/deferred update.
> >
> >     2. we track dirty metadata in the journal, not via the VFS
> >        dirty inode lists, so all the infrastructure written for
> >        ext4 to do periodic flushing is useless to us.
> >
> > These are solvable problems, but what I'm not sure about is exactly
> > what the intended semantics of lazytime durability are. That is,
> > exactly what guarantees are we giving userspace about timestamp
> > updates when lazytime is used? The guarantees we have to give will
> > greatly influence the XFS implementation, so I really need to nail
> > down what we are expected to provide userspace. Can we:
> >
> >     a) just ignore all durability concerns?
> >     b) if not, do we only need to care about the 24 hour
> >        writeback and unmount?
> >     c) if not, are fsync/sync/syncfs/freeze/unmount supposed
> >        to provide durability of all metadata changes?
> >     d) do we have to care about ordering - if we fsync one inode
> >        with 1 hour old timestamps, do we also need to guarantee
> >        that all the inodes with older dirty timestamps also get
> >        made durable?
>
> So the intended semantics is:
> 1) fsync / sync / freeze / unmount will write the timestamp updates even
> with lazytime. So unless crash happens, timestamps are guaranteed to be
> consistent. Also sync / fsync guarantees all changes to get to disk.
> 2) We periodically write back timestamps (once per 24 hours) to avoid too
> big timestamp inconsistencies in case of crash.

Ok, so it's supposed to be a delayed timestamp update mechanism
without any specific ordering guarantees, not an opportunistic
timestamp update mechanism. I can work with that.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
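
For illustration only, the semantics agreed on above can be modelled as a
toy in Python. This is a sketch under stated assumptions, not kernel code:
the names Inode, touch, flush, periodic_writeback, crash and DAY are all
hypothetical and correspond to no real VFS, ext4 or XFS interface. It only
shows the two rules from the thread: an explicit sync-type operation makes
the timestamp durable, and a timestamp that has been dirty for 24 hours is
written back.

```python
from dataclasses import dataclass
from typing import Optional

# Periodic writeback interval from the thread (24 hours), in seconds.
DAY = 24 * 60 * 60


@dataclass
class Inode:
    """Hypothetical model of one inode; not a kernel structure."""
    mem_time: int = 0                   # timestamp held in memory
    disk_time: int = 0                  # timestamp last made durable
    dirty_since: Optional[int] = None   # when the timestamp first became dirty

    def touch(self, now: int) -> None:
        # Pure timestamp update: only memory changes, nothing is written.
        self.mem_time = now
        if self.dirty_since is None:
            self.dirty_since = now

    def flush(self) -> None:
        # Stands in for fsync / sync / freeze / unmount on this inode.
        self.disk_time = self.mem_time
        self.dirty_since = None

    def periodic_writeback(self, now: int) -> None:
        # Write back only if the timestamp has been dirty for 24 hours.
        if self.dirty_since is not None and now - self.dirty_since >= DAY:
            self.flush()

    def crash(self) -> None:
        # After a crash only the on-disk timestamp survives.
        self.mem_time = self.disk_time
        self.dirty_since = None


def demo() -> list:
    ino = Inode()
    ino.touch(100)
    ino.crash()
    lost = ino.disk_time        # no flush before the crash: update is lost

    ino.touch(200)
    ino.flush()
    ino.crash()
    synced = ino.disk_time      # flushed before the crash: update survives

    ino.touch(300)
    ino.periodic_writeback(300 + DAY - 1)
    early = ino.disk_time       # dirty for less than 24 hours: not written
    ino.periodic_writeback(300 + DAY)
    late = ino.disk_time        # dirty for 24 hours: written back
    return [lost, synced, early, late]
```

Each Inode is flushed independently, so flushing one says nothing about
any other inode with older dirty timestamps, which matches the "no specific
ordering guarantees" reading above.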