On Thu, Sep 19 2024 at 18:50, Jeff Layton wrote:
> The fix for this is to establish a floor value for the coarse-grained
> clock. When stamping a file with a fine-grained timestamp, we update
> the floor value with the current monotonic time (using cmpxchg). Then
> later, when a coarse-grained timestamp is requested, check whether the
> floor is later than the current coarse-grained time. If it is, then the
> kernel will return the floor value (converted to realtime) instead of
> the current coarse-grained clock. That allows us to maintain the
> ordering guarantees.
>
> My original implementation of this tracked the floor value in
> fs/inode.c (also using cmpxchg), but that caused a performance
> regression, mostly due to multiple calls into the timekeeper functions
> with seqcount loops. By adding the floor to the timekeeper we can get
> that back down to 1 seqcount loop.
>
> Let me know if you have more questions about this, or suggestions about
> how to do this better. The timekeeping code is not my area of expertise
> (obviously) so I'm open to doing this a better way if there is one.

The comments I made about races and the clock_settime() inconsistency
vs. the change log aside, I don't see room for improvement there.

What worries me is the atomic_cmpxchg() under contention on large
machines, but as it is not a cmpxchg() loop it might be not completely
horrible.

Thanks,

        tglx
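
For readers following the thread, the floor scheme described in the quoted
text can be sketched in user-space C. This is only an illustration under
stated assumptions, not the code from the patch: the fake_* variables and the
names get_coarse_real_mg() and get_fine_real_mg() are hypothetical stand-ins,
C11 atomics replace the kernel's cmpxchg, and the seqcount protection of the
timekeeper is left out entirely.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for timekeeper state, all in nanoseconds. */
int64_t fake_mono_fine;     /* fine-grained monotonic clock reading */
int64_t fake_mono_coarse;   /* monotonic time as of the last tick */
int64_t fake_real_offset;   /* offset from monotonic to realtime */

/* Floor for coarse-grained timestamps, kept in monotonic time. */
_Atomic int64_t mg_floor;

/* Coarse-grained request: never report a time earlier than the floor. */
int64_t get_coarse_real_mg(void)
{
    int64_t floor = atomic_load(&mg_floor);
    int64_t coarse = fake_mono_coarse;
    int64_t mono = floor > coarse ? floor : coarse;

    return mono + fake_real_offset;   /* convert to realtime */
}

/* Fine-grained request: try once to advance the floor, no retry loop. */
int64_t get_fine_real_mg(void)
{
    int64_t old = atomic_load(&mg_floor);
    int64_t mono = fake_mono_fine;

    if (atomic_compare_exchange_strong(&mg_floor, &old, mono))
        return mono + fake_real_offset;

    /*
     * Lost the race: 'old' now holds the floor another thread installed.
     * Use that value instead of retrying the exchange.
     */
    return old + fake_real_offset;
}
```

With a realtime offset of 1000, a coarse clock at 100 and a fine clock at 150,
a coarse request first returns 1100. After one fine-grained stamp (1150), the
next coarse request returns 1150 instead of 1100, which is the ordering
guarantee the floor is meant to preserve. Once a tick moves the coarse clock
to 200, coarse requests return 1200 again. Because the exchange is attempted
only once, a thread that loses the race does not spin, which is the property
the reply refers to when it notes this is not a cmpxchg() loop.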