Re: win2k/cygwin cannot handle even moderately sized packs

Alex Riesen <fork0@xxxxxxxxxxx> wrote:
> Shawn Pearce, Wed, Nov 08, 2006 18:11:31 +0100:
> > The garbage creation is to account for the 2-4 windows required
> > by most applications.  Most of the time each window is unused;
> > we really only have two windows in use during delta decompression,
> > at all other times we really only have 1 window in use.  The commit
> > parsing applications don't keep the commit window in use when they
> > go access a tree or a blob.
> 
> So they actually can call unuse_pack to unmap the window,
> but it's kept for caching reasons?

Actually very few parts of the code even know about the windows.
Really the only parts that know it are the ones that directly
access the pack file, which is mostly restricted to sha1_file.c.

Since all access goes through the more public interfaces, the
application code never holds onto a window.  We do use_pack/unuse_pack
on every object access, so a window is almost never in use.  If we
didn't hang onto unused windows in an LRU, we would be in a world of
hurt performance-wise.
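
To make the caching point concrete, here is a rough toy model of it.
Everything in it (toy_use_pack, toy_unuse_pack, struct window, the
sizes) is made up for illustration and is not the real sha1_file.c
code; "mapping" is only simulated by bumping a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names are git's real ones. */
#define MAX_WINDOWS 4
#define WINDOW_SIZE 1024

struct window {
	size_t offset;           /* pack offset where the window starts */
	size_t len;              /* number of bytes covered */
	unsigned inuse;          /* pin count, nonzero while a caller holds it */
	unsigned long last_used; /* LRU timestamp */
	int valid;               /* slot currently holds a mapping */
};

static struct window windows[MAX_WINDOWS];
static unsigned long lru_clock;
static unsigned long map_count; /* how often we had to (re)map */

/* Pin the window covering 'offset', mapping one only on a cache miss. */
static struct window *toy_use_pack(size_t offset)
{
	struct window *victim = NULL;
	int i;

	/* Hit: an unused window is still cached, so no remap is needed. */
	for (i = 0; i < MAX_WINDOWS; i++) {
		struct window *w = &windows[i];
		if (w->valid && offset >= w->offset &&
		    offset - w->offset < w->len) {
			w->inuse++;
			w->last_used = ++lru_clock;
			return w;
		}
	}

	/* Miss: take a free slot, else evict the least recently used
	 * window that nobody has pinned. */
	for (i = 0; i < MAX_WINDOWS; i++) {
		struct window *w = &windows[i];
		if (w->inuse)
			continue;
		if (!w->valid) {
			victim = w;
			break;
		}
		if (!victim || w->last_used < victim->last_used)
			victim = w;
	}
	if (!victim)
		return NULL; /* every window is pinned */

	victim->offset = offset - offset % WINDOW_SIZE;
	victim->len = WINDOW_SIZE;
	victim->valid = 1;
	victim->inuse = 1;
	victim->last_used = ++lru_clock;
	map_count++; /* a real implementation would mmap here */
	return victim;
}

/* Unpin only; the window stays cached for the next access. */
static void toy_unuse_pack(struct window *w)
{
	assert(w && w->inuse > 0);
	w->inuse--;
}
```

In this model two back-to-back accesses that land in the same window
cost one map, even though the first caller already did its unuse
before the second one came along.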

> > I could be wrong.  It may not matter.  But I think it's crazy to
> > unmap otherwise valid mappings just because 2 bytes are on the
> > wrong side of an arbitrary boundary.
> 
> You're right, would be unfortunate to remap too often.
> 
> use_pack always maps at least 20 bytes, if I understand in_window and
> its use correctly. Actually, now I'm staring at it longer, I think the
> interface I suggested does almost the same, just allows to configure
> (well, hint at) the amount of bytes to be mapped in.

True; but if you look, nobody wants more than 20 bytes.  They either
want <20 for the object header or 20 for the base object id in
a delta.  Otherwise they are shoving the data into zlib, which
doesn't care.  No need to configure it, just shove it in.
 
> I still can't let go of the idea to get as much data as possible with
> just one call to sliding window code. Calling use_pack for every byte
> just does not seem right.

True.  But the only other idea I have is to copy the data into a
buffer for the caller, which we would use only for the header section,
since it's small...  We already copy the delta base (20 bytes)
onto the stack during decompression.  Might as well copy the header
too before decoding it.  Then you can batch up the range checks to at
worst 2 per header.
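
Something along these lines, as an untested sketch.  The function
names and HDR_BUF are invented here, and the pack is modelled as one
flat array rather than real sliding windows, so only the single clamp
before the memcpy shows up; I'd assume the second check comes in only
when a header straddles two windows.  The header layout (type in bits
4-6 of the first byte, size in 4 bits and then 7 bits per continuation
byte) is the usual pack object header encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HDR_BUF 20 /* hypothetical: enough for a header or a base id */

/*
 * Copy up to HDR_BUF bytes starting at 'offset' into 'buf'.
 * This is the one place the range is checked against the pack.
 * Returns the number of bytes copied (0 if offset is out of range).
 */
static size_t copy_header_bytes(const unsigned char *pack, size_t pack_len,
				size_t offset, unsigned char *buf)
{
	size_t avail;

	assert(pack && buf);
	if (offset >= pack_len)
		return 0;
	avail = pack_len - offset;
	if (avail > HDR_BUF)
		avail = HDR_BUF;
	memcpy(buf, pack + offset, avail);
	return avail;
}

/*
 * Decode type and size from the private copy.  Only the local buffer
 * length is consulted here, never the window.
 * Returns the header length in bytes, or 0 if it is truncated.
 */
static size_t parse_object_header(const unsigned char *buf, size_t n,
				  unsigned *type, unsigned long *size)
{
	size_t used = 0;
	unsigned shift = 4;
	unsigned char c;

	if (!n)
		return 0;
	c = buf[used++];
	*type = (c >> 4) & 7;
	*size = c & 15;
	while (c & 0x80) {
		if (used >= n || shift >= sizeof(unsigned long) * 8)
			return 0;
		c = buf[used++];
		*size += (unsigned long)(c & 0x7f) << shift;
		shift += 7;
	}
	return used;
}
```

The caller would then hand everything after the returned header length
straight to zlib, as before.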

-- 
Shawn.