Hi Nick,

[Context - directory ls taking 4-15 seconds; directory large, with long
filenames, but nowhere near as huge as Valdis' mail directory.]

I've now discovered a really bizarre pattern, and I'm inclined to stop
blaming the file system until some clarity develops. If I ever get it
to the point where I can produce a high quality bug report - with or
without patch - I will do so - but what I have now is anything but
clear and high quality.

On Jul 30 2014, Nick Krause wrote:
> On Wed, Jul 30, 2014 at 3:48 PM, <Valdis.Kletnieks@xxxxxx> wrote:
> > On Wed, 30 Jul 2014 10:38:13 -0700, Arlie Stephens said:
> >
> >> On the good side, Vladis' observations of his mail directory have been
> >> a great help.
> >
> > And remember, that's on a single laptop-class hard drive, no fancy raid or
> > anything. (Though it *is* a hybrid, with 32G of flash cache on the front end).
> >
> > You throw some *real* hardware at it, it of course would go even faster.
>
> Just send me the logs and anything else you think may help me.
> Please note cc the ext4 mailing list as this will also let the other
> ext4 developers and maintainers known about your problem.
> Cheers Nick

I'm now in a state of complete bafflement.

It turns out we have a whole collection of misbehaving directories,
making this testable without waiting for caches to clear. I have a
couple of strace's of fast ls's, and a function ftrace that captured
about half of a 7 second ls. (The latter is huge, and probably not
suitable for posting.)

I also have a really bizarre observation, the kind that makes you
wonder whether you are actually dreaming.

It appears that the misbehaviour is strongly influenced by the choice
of "time" function. The problem only occurs when using the shell
built-in. /usr/bin/time always produces a fast response. Stranger
still - flat out impossible, I'd have said before seeing it - a "fast"
ls, run with /usr/bin/time, can be followed *immediately* by a slow
"ls", run with bash's time.
It's as if the first one doesn't warm the cache, which is completely
absurd - except I've been able to make this happen 5 times in a row,
first with strace and then without.

# with /usr/bin/time the ls is fast
$ time -p ls bad_dir
...
real 0.21
user 0.00
sys 0.00

# with the builtin time, right *after* the strace run, the time can be
# horrible.
$ time -p ls bad_dir
...
real 5.60
user 0.00
sys 0.17

# run it again, and the directory is in cache as expected.
$ time -p ls bad_dir
...
real 0.11
user 0.00
sys 0.02

This is not an artefact of one or the other time reporting incorrectly -
I'm noticing a long pause before output occurs, but only on the middle
test of the three.

I can't imagine any sane way for this to be happening, short of
coincidence or user error - and I've now seen this sequence 5 times in
a row, on 5 different directories created and populated by the same
app. (Three times with strace, twice without.)

--
Arlie

(Arlie Stephens arlie@xxxxxxxxxxxx)
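
P.S. For anyone who wants to repeat the comparison without relying on
which "time" a bare command line happens to pick up, here is a minimal
sketch. It assumes bash is installed and, for the second variant, that
an external /usr/bin/time supporting -p exists. The name time_ls is
hypothetical, made up for this illustration; it is not what produced
the numbers above.

```shell
# time_ls VARIANT DIR
#   VARIANT = keyword   -> bash's own "time" keyword
#   VARIANT = external  -> the /usr/bin/time binary
# Prints only the wall-clock ("real") seconds for one "ls DIR".
time_ls() {
    case $1 in
        keyword)
            # Run in an explicit bash so the keyword is used regardless of
            # the calling shell. TIMEFORMAT=%R limits the report to real
            # seconds; the keyword writes it to the shell's stderr.
            bash -c 'TIMEFORMAT=%R; { time ls "$1" >/dev/null 2>&1; } 2>&1' _ "$2"
            ;;
        external)
            [ -x /usr/bin/time ] || { echo "no /usr/bin/time" >&2; return 2; }
            # -p selects the POSIX "real/user/sys" format, written to stderr.
            /usr/bin/time -p ls "$2" 2>&1 >/dev/null |
                awk '$1 == "real" { print $2 }'
            ;;
        *)
            echo "usage: time_ls keyword|external DIR" >&2
            return 1
            ;;
    esac
}
```

Typical use would be "time_ls external bad_dir" followed immediately by
"time_ls keyword bad_dir", then the same pair in the opposite order on
a different, still-cold directory, since whichever run goes first
would normally be the one that pays for the cold cache. In bash,
"type -a time" lists both the keyword and any external binary on the
PATH.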