On Sat, Feb 24, 2024 at 11:11:28AM -0800, Linus Torvalds wrote:
> But it is possible that this work never went anywhere exactly because
> this is such a rare case. That kind of "write so much that you want to
> do something special" is often such a special thing that using
> O_DIRECT is generally the trivial solution.

Well, actually there's a relatively common workload where we do this
exact same thing --- and that's when we run mkfs.ext[234] / mke2fs.  We
issue a huge number of buffered writes (at least, if the device doesn't
support a zeroing discard operation) to zero out the inode table.  We
rely on the mm subsystem putting mke2fs "into the penalty box", or else
some process (usually mke2fs) will get OOM-killed.

I don't consider it a "penalty" --- in fact, when write throttling
doesn't work, I've complained that it's an mm bug.  (Sometimes this has
broken when the mke2fs process runs out of physical memory, and
sometimes it has broken when mke2fs runs into the memory cgroup limit;
it's one of those things that seems to break every 3-5 years.)  But
still, it's something which *must* work, because it's really not
reasonable for userspace to know what is a reasonable rate at which to
self-throttle buffered writes --- it's something the kernel should do
for the userspace process.

Because this is something that has broken more than once, we have two
workarounds in mke2fs; one is that we can call fsync(2) every N block
groups' worth of inode tables, which is kind of a hack, and the other is
that we can use Direct I/O.  But using DIO has a worse user experience
(well, unless the alternative is mke2fs getting OOM-killed; admittedly
that's worse) than just using buffered I/O, since we generally don't
need to synchronously wait for the write requests to complete.

Neither is enabled by default, because in my view, this is something the
mm should just get right, darn it.

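To illustrate the first workaround, here is a minimal sketch of the "fsync every N chunks" pattern. It is written in Python for brevity and is not mke2fs's actual code; the function name zero_fill and its parameters chunk_bytes and fsync_every are hypothetical, with fsync_every standing in for "every N block groups' worth of inode tables".

```python
import os
import tempfile


def zero_fill(fd, total_bytes, chunk_bytes=1 << 20, fsync_every=8):
    """Write total_bytes of zeros through the page cache, calling
    fsync after every fsync_every chunks so that the amount of dirty
    memory this writer can accumulate stays bounded.

    Returns the number of fsync calls made.  (Hypothetical helper,
    for illustration only.)
    """
    zeros = bytes(chunk_bytes)
    written = 0
    chunks = 0
    syncs = 0
    while written < total_bytes:
        n = min(chunk_bytes, total_bytes - written)
        view = memoryview(zeros)[:n]
        # os.write may write fewer bytes than requested, so loop.
        while len(view) > 0:
            done = os.write(fd, view)
            view = view[done:]
        written += n
        chunks += 1
        if chunks % fsync_every == 0:
            # Self-throttle: block until the data written so far
            # has reached stable storage.
            os.fsync(fd)
            syncs += 1
    if chunks % fsync_every != 0:
        # Flush the final partial batch.
        os.fsync(fd)
        syncs += 1
    return syncs


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    try:
        print(zero_fill(fd, 10 * 4096, chunk_bytes=4096, fsync_every=4))
    finally:
        os.close(fd)
        os.unlink(path)
```

The periodic fsync makes the writer wait for its own writeback, which is exactly the rate limiting that the kernel's write throttling is supposed to provide without userspace having to guess a value for N.
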
In any case, I definitely don't consider write throttling to be a
performance "problem" --- it's actually a far worse problem when the
throttling doesn't happen, because it generally means someone is getting
OOM-killed.

					- Ted