On Wed, 16 May 2007, Shawn O. Pearce wrote:

> Johannes Schindelin <Johannes.Schindelin@xxxxxx> wrote:
> > Don't forget that those 10% probably do not do you the favour to be in
> > large chunks. Chances are that _every_ _single_ wanted object is separate
> > from the others.
>
> That's completely possible. Assuming the objects even are packed
> in the first place. It's very unlikely that you would be able to
> fetch very large of a range from an existing packfile, you would be
> submitting most of your range requests for very very small sections.

Well, in the commit objects case you're likely to have a bunch of them
all contiguous. For tree and blob objects it is less likely. And of
course there is the question of deltas, for which you might or might not
have the base object locally already.

Still... I wonder if this could actually be workable. A typical daily
update on the Linux kernel repository might consist of a couple hundred
or a few thousand objects. It could still be faster to fetch parts of a
pack than the whole pack if the size difference is above a certain
threshold. It is certainly not worse than fetching loose objects.

Things would be pretty horrid if you think of fetching a commit object,
parsing it to find out what tree object to fetch, then parsing that tree
object to find out what other objects to fetch, and so on.

But if you only take the approach of fetching the pack index files,
finding out which objects the remote has that are not available
locally, and then fetching all those objects from within pack files
without even looking at them (except for deltas), then it should be
possible to issue a couple of requests in parallel and possibly get
decent performance. And if it turns out that more than, say, 70% of a
particular pack is to be fetched (you can determine that up front), then
it might be decided to fetch the whole pack.
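
To make the planning step above a bit more concrete, here is a rough
sketch in Python. It is only an illustration, not anything that exists
in git: the names plan_fetch and range_header are made up, the pack
index is assumed to be already parsed into (offset, object id) pairs,
an object is assumed to end where the next one begins (with a 20-byte
checksum trailer closing the pack), and deltas are ignored entirely.

```python
from typing import Iterable, List, Set, Tuple

# Assumption: the pack ends with a 20-byte SHA-1 checksum trailer.
PACK_TRAILER_LEN = 20
# The "say, 70%" figure from the message; purely a tunable guess.
WHOLE_PACK_THRESHOLD = 0.70

Range = Tuple[int, int]  # half-open byte range [start, end)


def plan_fetch(index_entries: Iterable[Tuple[int, str]],
               pack_size: int,
               have: Set[str]) -> Tuple[str, List[Range]]:
    """Decide what to download from one remote pack.

    index_entries: (offset, object_id) pairs taken from the remote
                   pack index (hypothetical pre-parsed form).
    pack_size:     total size of the remote pack file in bytes.
    have:          object ids already available locally.

    Returns ("whole", [(0, pack_size)]) or ("ranges", [...]).
    """
    entries = sorted(index_entries)  # order by offset within the pack
    ranges: List[Range] = []
    wanted_bytes = 0

    for i, (offset, oid) in enumerate(entries):
        # An object ends where the next one starts, or at the trailer.
        if i + 1 < len(entries):
            end = entries[i + 1][0]
        else:
            end = pack_size - PACK_TRAILER_LEN
        if oid in have:
            continue
        wanted_bytes += end - offset
        # Merge with the previous range when the objects are contiguous.
        if ranges and ranges[-1][1] == offset:
            ranges[-1] = (ranges[-1][0], end)
        else:
            ranges.append((offset, end))

    if pack_size > 0 and wanted_bytes / pack_size > WHOLE_PACK_THRESHOLD:
        return "whole", [(0, pack_size)]
    return "ranges", ranges


def range_header(start: int, end: int) -> str:
    """HTTP Range header value; HTTP byte ranges have an inclusive end."""
    return f"bytes={start}-{end - 1}"
```

Each resulting range could then be requested independently, several at
a time. Merging contiguous objects into one range is what would make
the commit objects case cheap, since those tend to sit next to each
other in the pack.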

There is no way to sensibly keep those objects packed on the receiving
end of course, but storing them as loose objects and repacking them
afterwards should be just fine.

Of course you'll get objects from branches in the remote repository you
might not be interested in, but that's a price to pay for such a hack.
On average the overhead shouldn't be that big anyway if branches within
a repository are somewhat related.

I think this is something worth experimenting with.


Nicolas