Re: How git affects kernel.org performance

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On Sun, 7 Jan 2007, Christoph Hellwig wrote:
>
> On Sun, Jan 07, 2007 at 10:03:36AM +0100, Willy Tarreau wrote:
> > The problem is that I have no sufficient FS knowledge to argument why
> > it helps here. It was a desperate attempt to fix the problem for us
> > and it definitely worked well.
> 
> XFS does rather efficient btree directories, and it does sophisticated
> readahead for directories.  I suspect that's what is helping you there.

The sad part is that this is a long-standing issue, and the directory 
reading code in ext3 really _should_ be able to do ok. 

A year or two ago I did a totally half-assed code for the non-hashed 
readdir that improved performance by an order of magnitude for ext3 for a 
test-case of mine, but it was subtly buggy and didn't do the hashed case 
AT ALL. Andrew fixed it up so that it at least wasn't subtly buggy any 
more, but in the process it also lost all capability of doing fragmented 
directories (so it doesn't help very much any more under exactly the 
situation that is the worst case), and it still doesn't do the hashed 
directory case.

It's my personal pet peeve with ext3 (as Andrew can attest). And it's 
really sad, because I don't think it is fundamental per se, but the way 
the directory handling and jdb are done, it's apparently very hard to fix.

(It's clearly not _impossible_ to do: I think that it should be possible 
to treat ext3 directories the same way we treat files, except they would 
always be in "data=journal" mode. But I understand ext2, not ext3 (and 
absolutely not jbd), so I'm not going to be able to do anything about it 
personally).

Anyway, I think that disabling hashing can actually help. And I suspect 
that even with hashing enabled, there should be some quick hack for making 
the directory reading at least be able to do multiple outstanding reads in 
parallel, instead of reading the blocks totally synchronously ("read five 
blocks, then wait for the one we care" rather than the current "read one 
block at a time, wait for it, read the next one, wait for it.." 
situation).

			Linus
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]