Another long mail, sorry. On Wednesday 14 October 2009, Frans Pop wrote: > > There still has not been a mm-change identified that makes > > fragmentation significantly worse. > > My bisection shows a very clear point, even if not an individual commit, > in the 'akpm' merge where SKB errors suddenly become *much* more > frequent and easy to trigger. > I'm sorry to say this, but the fact that nothing has been identified yet > is IMO the result of a lack of effort, not because there is no such > change. I was wrong. It turns out that I was creating the variations in the test results around the akpm merge myself by tiny changes in the way I ran the tests. It took another round of about 30 compilations and tests purely in this range to show that, but those same tests also made me aware of other patterns I should look at. Until a few days ago I was concentrating on "do I see SKB allocation errors or not". Since then I've also been looking more consciously at when they happen, at disk access patterns and at desktop freeze patterns. I think I did mention before that this whole issue is rather subtle :-/ So, my apologies for finguering the wrong area for so long, but it looked solid given the info available at the time. On Thursday 15 October 2009, Mel Gorman wrote: > Outside the range of commits suspected of causing problems was the > following. It's extremely low probability > > Commit 8aa7e84 Fix congestion_wait() sync/async vs read/write confusion > This patch alters the call to congestion_wait() in the page > allocator. Frankly, I don't get the change but it might worth > checking if replacing BLK_RW_ASYNC with WRITE on top of 2.6.31 > makes any difference This is the real culprit. Mel: thanks very much for looking beyond the area I identified. Your overview of mm changes was exactly what I needed and really helped a lot during my later tests. This commit definitely causes most of the problems; confirmed by reverting it on top of 2.6.31 (also requires reverting 373c0a7e, which is a later build fix). The rest of this mail gives details on my tests and how I reached the above conclusion. TEST BASELINE (2.6.30) ====================== I mentioned in an earlier mail that I run three instances of gitk for my tests. Loading gitk seems to consist of 3 phases: 1) general initial scan of the repository (branches?) 2) reading commits: commit counter increases 3) reading references (including bisection good/bad points) and uncommitted changes Below times and comments per stage when the test is run with 2.6.30. As my test starts after a clean boot, buffers are mostly empty. 1st instance: 'gitk v2.6.29..master' (preparation) 1) ~20 seconds; user interface is mostly blank 2) ~5 seconds to read 35.000 commits; user interface is updated and counter increases steadily as they are read 3) ~10 seconds; "branch"/"follows"/"precedes" info and tags are filled in; fairly heavy disk activity 2st instance: 'gitk master' (preparation) 1) 0 seconds (because data is already buffered) 2) ~25 seconds to read 167500 commits; counter increases steadily 3) 1-2 seconds (because data is already buffered) 3st instance: 'gitk master' (the actual test) 1) 0 seconds because data is already buffered 2) ~55 seconds due to swapping overhead; minor music skip around commit 110.000; counter slower after 90.000, some short halts, but generally increases steadily; moderate disk activity 3) ~55-60 seconds; because buffers have been emptied data must by read again, with swapping; very heavy disk activity; fairly long music skip (15-20 seconds), but no SKB allocation errors So, the loading of the 3rd instance takes 1.5 minutes longer than the second because of the swapping. And phase 3) is most affected by it. AFTER WIRELESS CHANGE ===================== After commit 4752c93c30 ("iwlcore: Allow skb allocation from tasklet") I start getting the SKB errors. They can be triggered reliably if the whole test is repeated 1 or 2 times, but generally not the first time the test is run. Or so I thought for a long time. It turns out that I will get SKB errors during the first run if I'm "sloppy" in the test execution. For example if I wait too long before switching from the last gitk instance to konsole where I have a 'tail -f /var/log/kern.log' running. Another factor is the state of the repository: do I have master checked out, or an older branch, or am I in the middle of a bisection. This influences how data is read from the disk and thus the test results. A last factor may be the size of the kernel I'm using: my test/bisect kernel is significantly smaller than my regular kernel. If the test is run completely cleanly, I will not get SKB errors during the first run. Also, this change does not affect the timings of the test at all: the total load time of the 3rd instance is still ~1:55 and music skips happen in roughly the same places. The pattern of disk activity also remains unchanged. If I do *not* run the test cleanly, any SKB errors during the first test run will always be during phase 3), never during phase 2). This is what I saw during tests in the 'akpm' range, and explains the inconsistent results there. After discovering this I've made a copy of the git repo so that I always test using the exact same state and tightened my test procedure. AFTER congestion_wait CHANGE ============================ If I test commit 9f2d8be, which is just before the congestion_wait() change, I still get the same pattern as described above. But when I test with 8aa7e84 ("Fix congestion_wait() sync/async vs read/write confusion"), things change dramatically when the 3rd gitk instance is started. During the 2nd phase I see the first SKB allocation errors with a music skip between reading commits 95.000 and 110.000. About commit 115.000 there is a very long pause during which the counter does not increase, music stops and the desktop freezes completely. The first 30 seconds of that freeze there is only very low disk activity (which seems strange); the next 25 seconds there suddenly is very high disk activity during which things gradually unfreeze and more SKB errors are displayed. After that the commit counter runs up fairly steadily again. Phase 2) ends at ~1:45. Phase 3) (with more SKB errors) ends at ~2:05. So this change almost doubles the time needed for phase 2) and causes SKB allocation errors to occur during that phase. Also, before this commit the desktop freezes are much shorter and less severe. With this change the desktop is completely unusable for almost a minute during phase 2), with even the mouse pointer frozen solid. Note that phase 3) becomes shorter, but that the total time needed to load the 3rd instance increases by about 10-15 seconds. Note: -rc2 and -rc3 had broken NFS, so I had to cherry-pick 3 NFS commits from -rc4 on top of the commits I wanted to test. WITH congestion_wait CHANGE REVERTED ==================================== I've done quite a few tests of 2.6.31 with 373c0a7e and 8aa7e847 reverted to confirm that's really the culprit. I've done this for .31-rc3, .31-rc4, .31-rc5, .31 and .31.1. In all cases the huge freeze in phase 2) is gone and the general behavior and timings are again as it was after the wireless change. During most tests I did not get any SKB allocation errors during phase 2) or phase 3). However with .31-rc5, .31 and .31.1 I have had some tests where I would see a few SKB allocation errors during phase 3) (which is somewhat likely), but also during phase 2). At this point I'm unsure whether this is just noise, or maybe a minor influence from some change merged after .31-rc4. Looking through the commits there are several mm/page allocation changes. For now I suggest ignoring this though as the impact (if any) is very minor and it is not reproducible reliably enough. Next I'll retest Mel's patches and also test Reinette's patches. Cheers, FJP -- To unsubscribe from this list: send the line "unsubscribe kernel-testers" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html