Hi, On Wed, Apr 26, 2023 at 8:14 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote: > > I'm not generally a fan of lock-with-timeout approaches. I think the > rationale for this one makes sense, but we're going to see some people > try to use this for situations where it doesn't make sense. Although it won't completely prevent the issue, I could add a comment to the function (and the similar lock_buffer_timeout() added in patch #2 [1] at least warning people that it's discouraged to use the function without very careful consideration. [1] https://lore.kernel.org/r/20230421151135.v2.2.Ie146eec4d41480ebeb15f0cfdfb3bc9095e4ebd9@changeid/ > I almost > wonder if we shouldn't spin rather than sleep on this lock, since the > window of time we're willing to wait is so short. It doesn't feel like spinning is the right move in this particular case. While we want to enable kcompactd to move forward, it's not so urgent that it needs to take up lots of CPU cycles doing so. ...and, in fact, the cases I've seen us be blocked is when we're under memory pressure and spinning would be counterproductive to getting out of that pressure. > I'm certainly not > willing to NAK this patch since it's clearly fixing a real problem. > > Hm. If the problem is that we want to wait for the lock unless the > lock is being held for I/O, we can actually tell that in the caller. > > if (folio_test_uptodate(folio)) > folio_lock(folio); > else > folio_trylock(folio); > > (the folio lock isn't held for writeback, just taken and released; > if the folio is uptodate, the folio lock should only be taken for a > short time; if it's !uptodate then it's probably being read) The current place in patch #3 where I'm using folio_lock_timeout() only calls it if a folio_trylock() already failed [2]. So I guess the idea would be that if the trylock failed and folio_test_uptodate() returns 0 then we immediately fail, otherwise we call the unbounded folio_trylock()? I put some traces in and ran my test and it turns out that in every case (except one) where the tre initial folio_trylock() failed I saw folio_test_uptodate() return 0. Assuming my test case is typical, I think that means that coding it with folio_test_uptodate() is roughly the same as just never waiting at all for the folio lock in the SYNC_LIGHT case. In the original discussion of my v1 patch people didn't like that idea. ...so I think that for now I'm going to keep it with the timeout flow. -- After all of the discussion and continued digging into the code that I've done, I'm still of the opinion that this patch series is worthwhile and in the spirit of the SYNC_LIGHT mode of migration, but I also don't believe it's going to be a panacea for any particular case. Presumably even if kcompactd gets blocked for a second or two under a lot of memory pressure it won't be the absolute end of the world. In that spirit, I'll plan to post v3 in a day or two, but I'll also continue to say that if someone tells me that they really hate it that I can put it on the back burner. [2] https://lore.kernel.org/r/20230421151135.v2.3.Ia86ccac02a303154a0b8bc60567e7a95d34c96d3@changeid/