On Tue 17-12-19 14:42:48, Paul Richards wrote:
> On Fri, 13 Dec 2019 at 15:59, Jan Kara <jack@xxxxxxx> wrote:
> >
> > Hello!
> >
> > On Tue 19-11-19 08:47:31, Paul Richards wrote:
> > > I'm trying to understand the interaction between the ext4 `commit`
> > > interval option, and the `vm.dirty_expire_centisecs` tuneable.
> > >
> > > The ext4 `commit` documentation says:
> > >
> > > > Ext4 can be told to sync all its data and metadata every 'nrsec'
> > > > seconds. The default value is 5 seconds. This means that if you
> > > > lose your power, you will lose as much as the latest 5 seconds of
> > > > work (your filesystem will not be damaged though, thanks to the
> > > > journaling).
> > >
> > > The `dirty_expire_centisecs` documentation says:
> > >
> > > > This tunable is used to define when dirty data is old enough to be
> > > > eligible for writeout by the kernel flusher threads. It is
> > > > expressed in 100'ths of a second. Data which has been dirty
> > > > in-memory for longer than this interval will be written out next
> > > > time a flusher thread wakes up.
> > >
> > >
> > > Superficially these sound like they have a very similar effect. They
> > > periodically flush out data that hasn't been explicitly fsync'd by the
> > > application. I'd like to understand a bit more the interaction
> > > between these.
> >
> > Yes, the effect is rather similar but not quite the same. The first thing
> > to observe is the kind of obvious fact that the ext4 commit interval
> > influences just the particular filesystem while dirty_expire_centisecs
> > influences the behavior of global writeback over all filesystems.
> >
> > Secondly, the commit interval is really the maximum age of an ext4
> > transaction. So if there is a metadata change pending in the journal, it
> > will become persistent at the latest after this time. So, say, a 'mkdir'
> > will be persistent at the latest after this time. For data operations
> > things are more complex. E.g. when delayed allocation is used (which is
> > the default), the change gets logged in the journal only during
> > writeback. So it can take up to dirty_expire_centisecs for data to be
> > written back from the page cache, which results in the filesystem
> > journalling block allocations etc., and then it can take up to the commit
> > interval for these changes to become persistent. So in this case the
> > intervals add up. There are also other special cases somewhere in between
> > but generally it is reasonable to assume that data gets automatically
> > persistent in dirty_expire_centisecs + commit_interval time. Note that
> > both of these times are actually times when writeback is triggered, so
> > if the disk gets too busy, the actual time when data is completely on
> > disk may be much higher.
> >
>
> Thanks for taking the time to reply!
>
> Since automatic persisting of data occurs only after
> dirty_expire_centisecs + commit_interval,
> should the ext4 docs be corrected? They currently state (for the
> commit interval option):
>
> "The default value is 5 seconds. This means that if you lose
> your power, you will lose as much as the latest 5 seconds of work"

Yes, probably that should be clarified. Where did you find this wording?
Because my ext4 manpage just states:

commit=nrsec
       Start a journal commit every nrsec seconds. The default value is
       5 seconds. Zero means default.

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
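
The additive case described above (dirty_expire_centisecs + commit_interval) can be sketched as a small calculation. This is only a rough illustration, not anything taken from the kernel or the ext4 tools: the function names are made up for the example, the 3000 centisecond fallback is assumed to be the usual kernel default for vm.dirty_expire_centisecs, and the result ignores the flusher wakeup interval, the "special cases somewhere in between", and any delay caused by a busy disk. It estimates when writeback and commit are triggered, not when the data is guaranteed to be on disk.

```python
from pathlib import Path

# Assumptions for this sketch: 3000 centiseconds (30 s) is taken as the usual
# kernel default for vm.dirty_expire_centisecs; 5 s is the ext4 commit default
# quoted in the thread.
DEFAULT_DIRTY_EXPIRE_CENTISECS = 3000
DEFAULT_COMMIT_SECS = 5


def read_dirty_expire_centisecs(path="/proc/sys/vm/dirty_expire_centisecs"):
    """Return the current tunable value, or the assumed default if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return DEFAULT_DIRTY_EXPIRE_CENTISECS


def persistence_trigger_bound_secs(dirty_expire_centisecs,
                                   commit_secs=DEFAULT_COMMIT_SECS):
    """Estimate, in seconds, how long until persistence is *triggered*.

    Models only the delayed-allocation case from the thread: data waits up to
    dirty_expire_centisecs in the page cache, then the resulting journal
    transaction waits up to the commit interval.
    """
    return dirty_expire_centisecs / 100 + commit_secs


if __name__ == "__main__":
    expire = read_dirty_expire_centisecs()
    bound = persistence_trigger_bound_secs(expire)
    print(f"dirty_expire_centisecs={expire}, commit={DEFAULT_COMMIT_SECS}s "
          f"-> about {bound:.1f}s until writeback and commit are triggered")
```

With the assumed defaults this gives 30 s + 5 s = 35 s, which is why the "latest 5 seconds of work" wording understates the window for data written through the page cache.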