On Sun, 7 Jan 2007, Christoph Hellwig wrote: > > On Sun, Jan 07, 2007 at 10:03:36AM +0100, Willy Tarreau wrote: > > The problem is that I have no sufficient FS knowledge to argument why > > it helps here. It was a desperate attempt to fix the problem for us > > and it definitely worked well. > > XFS does rather efficient btree directories, and it does sophisticated > readahead for directories. I suspect that's what is helping you there. The sad part is that this is a long-standing issue, and the directory reading code in ext3 really _should_ be able to do ok. A year or two ago I did a totally half-assed code for the non-hashed readdir that improved performance by an order of magnitude for ext3 for a test-case of mine, but it was subtly buggy and didn't do the hashed case AT ALL. Andrew fixed it up so that it at least wasn't subtly buggy any more, but in the process it also lost all capability of doing fragmented directories (so it doesn't help very much any more under exactly the situation that is the worst case), and it still doesn't do the hashed directory case. It's my personal pet peeve with ext3 (as Andrew can attest). And it's really sad, because I don't think it is fundamental per se, but the way the directory handling and jdb are done, it's apparently very hard to fix. (It's clearly not _impossible_ to do: I think that it should be possible to treat ext3 directories the same way we treat files, except they would always be in "data=journal" mode. But I understand ext2, not ext3 (and absolutely not jbd), so I'm not going to be able to do anything about it personally). Anyway, I think that disabling hashing can actually help. And I suspect that even with hashing enabled, there should be some quick hack for making the directory reading at least be able to do multiple outstanding reads in parallel, instead of reading the blocks totally synchronously ("read five blocks, then wait for the one we care" rather than the current "read one block at a time, wait for it, read the next one, wait for it.." situation). Linus - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html