Doug Anderson <dianders@xxxxxxxxxxxx> writes: > Hi, > > On Thu, Apr 13, 2023 at 8:10 PM Huang, Ying <ying.huang@xxxxxxxxx> wrote: >> >> Douglas Anderson <dianders@xxxxxxxxxxxx> writes: >> >> > Currently when we try to do page migration and we're in "synchronous" >> > mode (and not doing direct compaction) then we'll wait an infinite >> > amount of time for a page lock. This does not appear to be a great >> > idea. >> > >> > One issue can be seen when I put a device under extreme memory >> > pressure. I took a sc7180-trogdor Chromebook (4GB RAM, 8GB zram >> > swap). I ran the browser along with Android (which runs from a >> > loopback mounted 128K block-size squashfs "disk"). I then manually ran >> > the mmm_donut memory pressure tool [1]. The system is completely >> > unusable both with and without this patch since there are 8 processes >> > completely thrashing memory, but it was still interesting to look at >> > how migration was behaving. I put some timing code in and I could see >> > that we sometimes waited over 25 seconds (in the context of >> > kcompactd0) for a page lock to become available. Although the 25 >> > seconds was the high mark, it was easy to see tens, hundreds, or >> > thousands of milliseconds spent waiting on the lock. >> > >> > Instead of waiting, if I bailed out right away (as this patch does), I >> > could see kcompactd0 move forward to successfully to migrate other >> > pages instead. This seems like a better use of kcompactd's time. >> > >> > Thus, even though this didn't make the system any more usable in my >> > absurd test case, it still seemed to make migration behave better and >> > that feels like a win. It also makes the code simpler since we have >> > one fewer special case. >> >> TBH, the test case is too extreme for me. > > That's fair. That being said, I guess the point I was trying to make > is that waiting for this lock could take an unbounded amount of time. > Other parts of the system sometimes hold a page lock and then do a > blocking operation. At least in the case of kcompactd there are better > uses of its time than waiting for any given page. > >> And, we have multiple "sync" mode to deal with latency requirement, for >> example, we use MIGRATE_SYNC_LIGHT for compaction to avoid too long >> latency. If you have latency requirement for some users, you may >> consider to add new "sync" mode. > > Sure. kcompactd_do_work() is currently using MIGRATE_SYNC_LIGHT. I > guess my first thought would be to avoid adding a new mode and make > MIGRATE_SYNC_LIGHT not block here. Then anyone that truly needs to > wait for all the pages to be migrated can use the heavier sync modes. > It seems to me like the current users of MIGRATE_SYNC_LIGHT would not > want to block for an unbounded amount of time here. What do you think? It appears that you can just use MIGRATE_ASYNC if you think the correct behavior is "NOT block at all". I found that there are more fine-grained controls on this in compaction code, please take a look at "enum compact_priority" and its comments. Best Regards, Huang, Ying