Re: git-daemon memory usage, disconnection.

On Wed, 19 Apr 2006, David Woodhouse wrote:

> On Wed, 2006-04-19 at 07:59 -0700, Linus Torvalds wrote:
> > Well, you've probably got two issues: 
> > 
> >  - it looks like you aren't packing your archives (which explains why the 
> >    disk accesses are horrid, which in turn explains the "D" part).
> 
> Hm, good point. They're fairly new trees -- I had foolishly assumed that
> they would at least start off packed. That isn't the case though --
> perhaps it should be? Did the original clone receive a pack on the wire
> and then _split_ it?

For old versions of git, yes.

> If the tools would automatically pack when the number of unpacked
> objects reaches a threshold, that would be useful.

Well, packing is still best done in the background: you don't generally 
want the tools to just stop for a minute to repack while you're doing 
something. You'd normally want to do a cron run at 4AM or something, see 
if there is lots to pack, and repack that.

The one exception is probably a large conversion process (from CVS, SVN, 
whatever). The conversion process itself probably takes ages, and it will 
be even slower if it were to keep the potentially huge result unpacked all 
the time.

But for normal ops, you really don't want to repack synchronously.

> Since this repo is only available through git:// and git+ssh:// URLs, I
> can safely use git-repack's '-a -d' options, right?

Yes.

> I'll do 'git-repack -l' nightly and 'git-repack -a -d -l' weekly -- does
> that seem sane?

Absolutely. The one exception might be trees that really don't change very 
much (which is quite common), so you might make it conditional on seeing 
if there are _any_ objects at all in .git/objects/00/, for example. Not 
that repack will be very expensive, but still..
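
A minimal sketch of that conditional check, written in Python purely for illustration (it is not from this thread and not part of git). The helper names has_loose_objects and repack_if_needed are made up here, and the sketch assumes that one non-empty fan-out directory (.git/objects/00/) is a good enough hint that there is something to pack:

```python
import os
import subprocess


def has_loose_objects(git_dir, fanout="00"):
    """Return True if <git_dir>/objects/<fanout>/ holds at least one entry.

    Hypothetical helper: it looks at a single fan-out directory as a cheap
    proxy for "are there any unpacked objects at all".
    """
    path = os.path.join(git_dir, "objects", fanout)
    try:
        with os.scandir(path) as entries:
            return any(True for _ in entries)
    except FileNotFoundError:
        # No such directory means nothing loose under this prefix.
        return False


def repack_if_needed(git_dir, full=False):
    """Run an incremental (or, with full=True, a full) repack only if needed.

    Returns True if a repack was started, False if it was skipped.
    """
    if not has_loose_objects(git_dir):
        return False
    cmd = ["git", f"--git-dir={git_dir}", "repack", "-l"]
    if full:
        # Same options as the weekly run discussed above.
        cmd += ["-a", "-d"]
    subprocess.run(cmd, check=True)
    return True
```

A nightly cron job could call repack_if_needed(path) for each repository and the weekly one repack_if_needed(path, full=True); trees that have not changed are then skipped without ever starting git.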

> Well, it does that with SIGALRM happening periodically, theoretically
> for the purpose of providing progress output. Perhaps we could do a
> getpeername() or something else to check on the output fd each time?

Yes, that's possibly a good idea. Of course, for git-rev-list, it's just a 
pipe, and it's hard to do that check at least portably. On Linux, if you do 
a "poll()" on a pipe for writing, newer kernels will give you a POLLERR if 
the other side has hung up, but that's by no means portable.

(On some other systems, doing a zero-sized write() _might_ do it, but at 
least Linux will happily say "ok, wrote 0 bytes" even if the other end 
isn't listening).
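
For illustration only, the Linux poll() behaviour described above can be sketched in Python. This is an assumption-laden example, not code from git: the function name reader_hung_up is made up, and it relies on the Linux-specific behaviour of reporting POLLERR on the write end of a pipe whose read end has been closed.

```python
import select


def reader_hung_up(write_fd):
    """Return True if poll() reports POLLERR for the write end of a pipe.

    Linux-specific assumption: POLLERR on the write end means the reading
    side has gone away. Other systems may report nothing useful here.
    """
    poller = select.poll()
    # We ask for POLLOUT; POLLERR is reported whether or not it is requested.
    poller.register(write_fd, select.POLLOUT)
    for fd, events in poller.poll(0):  # timeout 0: check and return at once
        if fd == write_fd and events & select.POLLERR:
            return True
    return False
```

A writer could call this from its periodic progress handler and stop generating output once it returns True, instead of finding out only when a later write fails.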

And git-rev-list isn't doing the SIGALRM anyway.

In other words, to do this, we'd have to change send-pack to use the 
revision library. Which, as mentioned, is worthwhile anyway, but it's not 
totally trivial.

		Linus