On Mon, 30 Sept 2024 at 13:57, Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote: > > Could we break out if folio->mapping has changed? Clearly if it has, > we're no longer waiting for the folio we thought we were waiting for, > but for a folio which now belongs to a different file. Sounds like a sane check to me, but it's also not clear that this would make any difference. The most likely reason for starvation I can see is a slow thread (possibly due to cgroup throttling like Christian alluded to) would simply be continually unlucky, because every time it gets woken up, some other thread has already dirtied the data and caused writeback again. I would think that kind of behavior (perhaps some DB transaction header kind of folio) would be more likely than the mapping changing (and then remaining under writeback for some other mapping). But I really don't know. I would much prefer to limit the folio_wait_bit() loop based on something else. For example, the basic reason for that loop (unless there is some other hidden one) is that the folio writeback bit is not atomic wrt the wakeup. Maybe we could *make* it atomic, by simply taking the folio waitqueue lock before clearing the bit? (Only if it has the "waiters" bit set, of course!) Handwavy. Anyway, this writeback handling is nasty. folio_end_writeback() has a big comment about the subtle folio reference issue too, and ignoring that we also have this: if (__folio_end_writeback(folio)) folio_wake_bit(folio, PG_writeback); (which is the cause of the non-atomicity: __folio_end_writeback() will clear the bit, and return the "did we have waiters", and then folio_wake_bit() will get the waitqueue lock and wake people up). And notice how __folio_end_writeback() clears the bit with ret = folio_xor_flags_has_waiters(folio, 1 << PG_writeback); which does that "clear bit and look it it had waiters" atomically. But that function then has a comment that says * This must only be used for flags which are changed with the folio * lock held. For example, it is unsafe to use for PG_dirty as that * can be set without the folio lock held. [...] but the code that uses it here does *NOT* hold the folio lock. I think the comment is wrong, and the code is fine (the important point is that the folio lock _serialized_ the writers, and while clearing doesn't hold the folio lock, you can't clear it without setting it, and setting the writeback flag *does* hold the folio lock). So my point is not that this code is wrong, but that this code is all kinds of subtle and complex. I think it would be good to change the rules so that we serialize with waiters, but being complex and subtle means it sounds all kinds of nasty. Linus