On Fri, 13 Dec 2019 at 15:59, Jan Kara <jack@xxxxxxx> wrote:
>
> Hello!
>
> On Tue 19-11-19 08:47:31, Paul Richards wrote:
> > I'm trying to understand the interaction between the ext4 `commit`
> > interval option, and the `vm.dirty_expire_centisecs` tuneable.
> >
> > The ext4 `commit` documentation says:
> >
> > > Ext4 can be told to sync all its data and metadata every 'nrsec' seconds. The default value is 5 seconds. This means that if you lose your power, you will lose as much as the latest 5 seconds of work (your filesystem will not be damaged though, thanks to the journaling).
> >
> > The `dirty_expire_centisecs` documentation says:
> >
> > > This tunable is used to define when dirty data is old enough to be eligible for writeout by the kernel flusher threads. It is expressed in 100'ths of a second. Data which has been dirty in-memory for longer than this interval will be written out next time a flusher thread wakes up.
> >
> >
> > Superficially these sound like they have a very similar effect. They
> > periodically flush out data that hasn't been explicitly fsync'd by the
> > application. I'd like to understand a bit more the interaction
> > between these.
>
> Yes, the effect is rather similar but not quite the same. The first thing
> to observe is the kind of obvious fact that ext4 commit interval influences
> just the particular filesystem while dirty_expire_centisecs influences
> behavior of global writeback over all filesystems.
>
> Secondly, commit interval is really the maximum age of ext4 transaction. So
> if there is metadata change pending in the journal, it will become
> persistent at latest after this time. So for say 'mkdir' that will be
> persistent at latest after this time. For data operations things are more
> complex. E.g. when delayed allocation is used (which is the default), the
> change gets logged in the journal only during writeback.
> So it can take up
> to dirty_expire_centisecs for data to be written back from page cache, that
> results in filesystem journalling block allocations etc. and then it can
> take up to commit interval for these changes to become persistent. So in
> this case the intervals add up. There are also other special cases
> somewhere in between but generally it is reasonable to assume that data gets
> automatically persistent in dirty_expire_centisecs + commit_interval time.
> Note both these times are actually times when writeback is triggered so
> if the disk gets too busy, the actual time when data is completely on disk
> may be much higher.
>

Thanks for taking the time to reply!

Since automatic persisting of data occurs only after
dirty_expire_centisecs + commit_interval, should the ext4 docs be
corrected? They currently state (for the commit interval option):

"The default value is 5 seconds. This means that if you lose your
power, you will lose as much as the latest 5 seconds of work"
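
To put rough numbers on that sum, here is a minimal sketch in Python. It
only does the arithmetic described above (dirty_expire_centisecs converted
to seconds, plus the commit interval). The function name
`worst_case_window_seconds` is my own, and the commit value of 5 passed in
is simply the documented default, not something read from the mount
options. As noted above, both intervals are times at which writeback is
triggered, so this is an estimate of when writeback starts, not a
guarantee of when data is on disk.

```python
# Rough estimate of how long buffered data may stay unpersisted, following
# the "dirty_expire_centisecs + commit interval" rule of thumb quoted above.
# Assumption: worst_case_window_seconds is a hypothetical helper, and the
# commit interval is supplied by the caller rather than read from the mount.
from pathlib import Path

DIRTY_EXPIRE = Path("/proc/sys/vm/dirty_expire_centisecs")


def worst_case_window_seconds(dirty_expire_centisecs: int, commit_seconds: int) -> float:
    """Return dirty_expire (converted from centiseconds) plus the commit interval."""
    return dirty_expire_centisecs / 100 + commit_seconds


if __name__ == "__main__":
    # Read the live tunable on Linux; on other systems the file will not exist.
    if DIRTY_EXPIRE.exists():
        expire = int(DIRTY_EXPIRE.read_text().strip())
        print(worst_case_window_seconds(expire, commit_seconds=5), "seconds")
```

For example, with dirty_expire_centisecs set to 3000 (the usual kernel
default, as far as I know) and commit=5, the function returns 35.0 rather
than the 5 seconds the ext4 docs suggest.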