On Thu, Nov 13, 2008 at 03:42:40PM -0500, Theodore Tso wrote:
> Hmm... how very strange.  Can you run the command:
>
> ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:16,comm
>
> during one of the quiescent periods, and see what is in the WCHAN
> field for the unlink command?

Mostly I see this:

  PID   TID CLS RTPRIO  NI PRI PSR %CPU STAT WCHAN            COMMAND
 5932  5932 TS       -   0  19   0  2.4 D+   sync_buffer      rm

I had been running these benchmarks over dm-crypt (which is the target
environment for which I was testing).  So I re-ran the tests both bare
and with dm-crypt to compare.  The stalls reported by vmstat did not
show up when running the test bare.

I also sat and watched the drive LED while the unlink test was running.
The drive LED showed the occasional 1/2 second stall, but at the same
time vmstat was showing 5-20 second stalls, so that would seem to point
to some kind of reporting problem.

For completeness, I re-ran the benchmark both with and without dm-crypt
to see if it was the cause of the problem, all times mounting with
barrier=0.  As expected, it did slow down the process, but the problem
remains:

ext4 128 plain unlink:  665.523 elapsed 0.276 user 34.882 sys  5.28%
ext4 128 crypt unlink:  907.934 elapsed 0.356 user 34.698 sys  3.86%
ext4 256 plain unlink: 1435.964 elapsed 0.248 user 40.319 sys  2.82%
ext4 256 crypt unlink: 1504.660 elapsed 0.304 user 35.186 sys  2.35%
ext3 128 plain unlink:  133.863 elapsed 0.248 user 24.618 sys 18.57%
ext3 128 crypt unlink:  133.092 elapsed 0.280 user 23.661 sys 17.98%
ext3 256 plain unlink:  309.635 elapsed 0.296 user 27.362 sys  8.93%
ext3 256 crypt unlink:  319.819 elapsed 0.268 user 23.713 sys  7.49%

Is there anything else I can try to see what's happening?
-- 
Bruce Guenter <bruce@xxxxxxxxxxxxxx>                http://untroubled.org/
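
To put a rough number on the gap between the two filesystems, here is a minimal Python sketch that works out how much longer each ext4 unlink run took than the matching ext3 run. The timing lines are copied from the results above; the names RESULTS, parse and slowdown are made up for this illustration and are not part of any tool used for the benchmark.

```python
# Timing lines copied verbatim from the benchmark results in this message.
RESULTS = """\
ext4 128 plain unlink: 665.523 elapsed 0.276 user 34.882 sys 5.28%
ext4 128 crypt unlink: 907.934 elapsed 0.356 user 34.698 sys 3.86%
ext4 256 plain unlink: 1435.964 elapsed 0.248 user 40.319 sys 2.82%
ext4 256 crypt unlink: 1504.660 elapsed 0.304 user 35.186 sys 2.35%
ext3 128 plain unlink: 133.863 elapsed 0.248 user 24.618 sys 18.57%
ext3 128 crypt unlink: 133.092 elapsed 0.280 user 23.661 sys 17.98%
ext3 256 plain unlink: 309.635 elapsed 0.296 user 27.362 sys 8.93%
ext3 256 crypt unlink: 319.819 elapsed 0.268 user 23.713 sys 7.49%
"""


def parse(text):
    """Map (filesystem, size, mode) to the elapsed time on each line.

    Hypothetical helper: it relies only on the whitespace-separated
    layout of the lines above (the fifth field is the elapsed time).
    """
    elapsed = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        fs, size, mode = fields[0], fields[1], fields[2]
        elapsed[(fs, size, mode)] = float(fields[4])
    return elapsed


def slowdown(elapsed):
    """Return ext4 elapsed / ext3 elapsed for each (size, mode) pair."""
    return {
        (size, mode): elapsed[("ext4", size, mode)] / elapsed[("ext3", size, mode)]
        for (fs, size, mode) in elapsed
        if fs == "ext4"
    }


if __name__ == "__main__":
    for (size, mode), ratio in sorted(slowdown(parse(RESULTS)).items()):
        print(f"{size} {mode}: ext4 took {ratio:.2f}x as long as ext3")
```

Run as a script, it prints one ratio per size and plain/crypt combination, which makes it easier to see that the gap is present with or without dm-crypt.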