Re: performance on repack


On Sun, 12 Aug 2007, David Kastrup wrote:

> "Jon Smirl" <jonsmirl@xxxxxxxxx> writes:
> 
> > If anyone is bored and looking for something to do, making the delta
> > code in git repack multithreaded would help.
> 
> I severely doubt that.  It is like the "coding stuff in assembly
> language will make it faster" myth.  The problem is that of manageable
> complexity.  Making the stuff multithreaded or coded in assembly means
> that it becomes inaccessible for a sound algorithmic redesign.

I have to admit that I'm not a huge fan of threading: the complexity and
locking often kill you, if memory bandwidth constraints do not, and the
end result is often really really hard to debug.

That said, I suspect we could do some *simple* form of this by just
partitioning the problem space up - we could have a MT repack that
generates four *different* packs on four different CPUs: each thread
taking one quarter of the objects. At that point, you wouldn't even need
threads, you could do it with regular processes, since the problem set is
fully partitioned once you've generated the list of objects!

Then, after you've generated four different packs, doing a "git gc" 
(without any threading) will repack them into one big pack, and mostly 
just re-use the existing deltas.
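
A rough sketch of that partitioning, not a tested tool: it assumes a
non-bare repository, and the helper names (partition, list_objects,
pack_slice, partitioned_repack) are made up for illustration. Splitting
the object list into contiguous slices is also just an assumption; how
you divide the objects decides which ones can delta against each other,
so a smarter split would likely pack better.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def partition(items, parts):
    """Split items into `parts` contiguous slices of near-equal size."""
    size, extra = divmod(len(items), parts)
    slices, start = [], 0
    for i in range(parts):
        # The first `extra` slices each take one leftover item.
        end = start + size + (1 if i < extra else 0)
        slices.append(items[start:end])
        start = end
    return slices


def list_objects(repo):
    """Return the "<id> [<path>]" lines for every object reachable from a ref."""
    out = subprocess.run(
        ["git", "-C", repo, "rev-list", "--objects", "--all"],
        check=True, capture_output=True, text=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def pack_slice(repo, lines):
    """Run one independent `git pack-objects` process over one slice."""
    subprocess.run(
        # --no-reuse-delta forces deltas to be recomputed, as in a full repack.
        ["git", "-C", repo, "pack-objects", "--no-reuse-delta",
         ".git/objects/pack/pack"],
        input="\n".join(lines) + "\n",
        check=True, capture_output=True, text=True,
    )


def partitioned_repack(repo, parts=4):
    """Write up to `parts` packs, one per slice of the object list."""
    slices = [s for s in partition(list_objects(repo), parts) if s]
    # The threads only wait on child processes; the delta search itself
    # runs in the separate git processes, so nothing is shared or locked.
    with ThreadPoolExecutor(max_workers=parts) as pool:
        # list() waits for every process and re-raises any failure.
        list(pool.map(lambda s: pack_slice(repo, s), slices))
```

Calling partitioned_repack("/path/to/repo") would leave several new packs
in .git/objects/pack; a plain "git gc" afterwards is what folds them into
one.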

So this would not be a generic thing, but it could be something that is
useful for the forced full-repack after importing a large repository with
fast-import, for example.

So while I agree with David in general about the problem of threading, I 
think that we can possibly simplify the special case of repacking into 
something less complicated than a "real" multi-threading problem.

		Linus