On 22/03/11 19:37, Andrea Arcangeli wrote:
> Hi Alex,
>
> could you also try to reverse the bit below (not the whole previous
> patch: only the bit quoted below) with "patch -p1 -R < thismail" on
> top of your current aa.git tree, and see if you notice any regression
> compared to the previous aa.git build that worked well?
>
> This is part of the fix, but I'd need to be sure this really makes a
> difference before sticking to it for long. I'm not concerned by
> keeping it, but it adds dirt, and the closer THP allocations are to
> any other high order allocation the better. So the less
> __GFP_NO_KSWAPD affects the better. The hint about not telling kswapd
> to insist in the background for order 9 allocations with fallback
> (like THP) is the maximum I consider clean, because there's khugepaged
> with its alloc_sleep_millisecs that replaces the kswapd task for THP
> allocations. So that is clean enough, but when __GFP_NO_KSWAPD starts
> to make compaction behave slightly differently from a SLUB order 2
> allocation I don't like it (especially because if you later enable
> SLUB or some driver you may run into the same compaction issue again
> if the below change is making a difference).
>
> If things work fine even after you reverse the below, we can safely
> undo this change and also feel safer for all other high order
> allocations, so it'll make life easier. (Plus we don't want
> unnecessary special changes; we need to be sure this makes a
> difference to keep it for long.)
>
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2085,7 +2085,7 @@ rebalance:
>  					sync_migration);
>  	if (page)
>  		goto got_pg;
> -	sync_migration = true;
> +	sync_migration = !(gfp_mask & __GFP_NO_KSWAPD);
>  
>  	/* Try direct reclaim and then allocating */
>  	page = __alloc_pages_direct_reclaim(gfp_mask, order,

I tried to reformat the stick as UDF to check whether the stall was filesystem-sensitive. Apparently it is. I managed to induce the freeze in Firefox while performing the same copy on the aa.git kernel.
Then I reformatted the stick as FAT32 and repeated the test. It also induced freezes, although they were a bit shorter and occurred late in the copy. I have attached the traces to the bug report. All of this is with the kernel before reversing the quoted patch.