Shawn Pearce, Wed, Nov 08, 2006 18:11:31 +0100:
> > >All true. However what happens when the header spans two windows?
> > >Lets say I have the first 4 MiB mapped and the next 4 MiB mapped in
> > >a different window; these are not necessarily at the same locations
> > >within memory. Now if an object header is split over these two
> > >then some bytes are at the end of the first window and the rest
> > >are at the start of the next window.
> >
> > Assuming these are adjacent windows, we can just increment counters on the
> > all touched pages (at least the two together) and return the pointer into
> > the lowest page. Otherwise - time for garbage collection (why produce the
> > garbage at all, btw?) and remap.
>
> They are adjacent in the pack file but not necessarily in virtual memory!

Oh, right! Don't know why I thought the mapped regions would be connected.

> The garbage creation is to account for the 2-4 windows required
> by most applications. Most of the time each window is unused;
> we really only have two windows in use during delta decompression,
> at all other times we really only have 1 window in use. The commit
> parsing applications don't keep the commit window in use when they
> go access a tree or a blob.

So they actually can call unuse_pack to unmap the window, but it's
kept for caching reasons?

> Consequently we want the garbage there. Actually I shouldn't have
> used garbage: the correct term would be LRU managed cache. :-)
> When we need a new window and we would exceed our maximum limit
> (128 MiB in my implementation) we unmap the least recently used
> window which is not currently in use.

Yep, noticed that :) Just wondered why.

> I could be wrong. It may not matter. But I think its crazy to
> unmap otherwise valid mappings just because 2 bytes are on the
> wrong side of an arbitrary boundary.

You're right, would be unfortunate to remap too often.
use_pack always maps at least 20 bytes, if I understand in_window and
its use correctly.
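
To check my understanding of the LRU scheme quoted above, here is a
minimal sketch of it. This is not the actual patch: every name in it
(struct window_cache, map_window, unuse_window, MAX_WINDOWS) is made up
for illustration, and no real mmap()/munmap() is done, only the
bookkeeping. The assumption is that a window released by its user stays
mapped as cache and is unmapped only when a new window would push the
total over the limit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_WINDOWS 8

/* All names here are hypothetical; this is not the actual patch. */
struct window {
	size_t offset;           /* start of the window within the pack file */
	size_t len;              /* number of bytes mapped */
	unsigned inuse_cnt;      /* > 0 while a caller holds a pointer into it */
	unsigned long last_used; /* LRU stamp, larger means more recent */
	int mapped;              /* slot currently holds a mapping */
};

struct window_cache {
	struct window win[MAX_WINDOWS];
	size_t mapped_bytes;     /* total bytes currently mapped */
	size_t limit;            /* maximum total, e.g. 128 MiB */
	unsigned long clock;     /* source of LRU stamps */
};

static void cache_init(struct window_cache *c, size_t limit)
{
	memset(c, 0, sizeof(*c));
	c->limit = limit;
}

/* Least recently used window that nobody is using, or NULL if none. */
static struct window *find_lru_victim(struct window_cache *c)
{
	struct window *victim = NULL;
	int i;

	for (i = 0; i < MAX_WINDOWS; i++) {
		struct window *w = &c->win[i];
		if (!w->mapped || w->inuse_cnt)
			continue;
		if (!victim || w->last_used < victim->last_used)
			victim = w;
	}
	return victim;
}

static void evict(struct window_cache *c, struct window *w)
{
	assert(w->mapped && !w->inuse_cnt);
	c->mapped_bytes -= w->len; /* real code would munmap() here */
	w->mapped = 0;
}

/* Map a new window and mark it in use; NULL if nothing can be evicted. */
static struct window *map_window(struct window_cache *c, size_t offset, size_t len)
{
	struct window *slot = NULL;
	int i;

	if (len > c->limit)
		return NULL;
	/* Evict idle windows, oldest first, until the new one fits. */
	while (c->mapped_bytes + len > c->limit) {
		struct window *victim = find_lru_victim(c);
		if (!victim)
			return NULL; /* every mapped window is in use */
		evict(c, victim);
	}
	for (i = 0; i < MAX_WINDOWS && !slot; i++)
		if (!c->win[i].mapped)
			slot = &c->win[i];
	if (!slot) { /* under the byte limit, but out of slots */
		slot = find_lru_victim(c);
		if (!slot)
			return NULL;
		evict(c, slot);
	}
	slot->offset = offset;
	slot->len = len;
	slot->inuse_cnt = 1;
	slot->last_used = ++c->clock;
	slot->mapped = 1;
	c->mapped_bytes += len;
	return slot;
}

/* Release a window; it stays mapped as cache until it gets evicted. */
static void unuse_window(struct window_cache *c, struct window *w)
{
	assert(w->inuse_cnt > 0);
	w->inuse_cnt--;
	w->last_used = ++c->clock;
}
```

With a limit of two windows' worth of bytes, a third map_window() call
succeeds only if one of the first two has been released, and it fails
outright while both are still in use.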

Actually, now that I've stared at it longer, I think the interface I
suggested does almost the same thing; it just lets the caller configure
(well, hint at) the number of bytes to be mapped in. I still can't let
go of the idea of getting as much data as possible with a single call
into the sliding window code. Calling use_pack for every byte just does
not seem right.
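
Roughly what I have in mind, as a sketch only. use_bytes() below is a
hypothetical stand-in, not the real use_pack; the "pack" is a plain
in-memory buffer cut into fixed 8-byte windows so that a value can
straddle a boundary, and the length hint is left out because it would
have no effect on such a fixed layout. The value decoded is a generic
7-bits-per-byte varint, which I am not claiming matches the object
header layout exactly. The point is only that the caller gets told how
many contiguous bytes are available and goes back to the window code
once per window, not once per byte.

```c
#include <assert.h>
#include <stddef.h>

#define WINDOW_SIZE 8 /* tiny on purpose, so a value can cross a boundary */

/* Hypothetical stand-in for a pack file that would be mapped piecewise. */
struct pack {
	const unsigned char *data;
	size_t size;
};

/*
 * Hypothetical interface: return a pointer to the byte at `offset` and
 * store in *avail how many contiguous bytes can be read from it before
 * the current window ends. Returns NULL past the end of the pack.
 */
static const unsigned char *use_bytes(const struct pack *p, size_t offset,
				      size_t *avail)
{
	size_t win_end;

	if (offset >= p->size) {
		*avail = 0;
		return NULL;
	}
	win_end = (offset / WINDOW_SIZE + 1) * WINDOW_SIZE;
	if (win_end > p->size)
		win_end = p->size;
	*avail = win_end - offset;
	return p->data + offset;
}

/*
 * Decode a varint (7 data bits per byte, least significant group first,
 * high bit set means "more bytes follow") starting at `offset`.
 * Returns the number of bytes consumed, or 0 if the data is truncated
 * or the value does not fit. *calls counts trips into use_bytes().
 */
static size_t read_varint(const struct pack *p, size_t offset,
			  unsigned long *value, unsigned *calls)
{
	unsigned long v = 0;
	unsigned shift = 0;
	size_t used = 0;

	for (;;) {
		size_t avail, i;
		const unsigned char *in = use_bytes(p, offset + used, &avail);

		(*calls)++;
		if (!in)
			return 0; /* ran off the end of the pack */
		/* Consume everything this window offers before asking again. */
		for (i = 0; i < avail; i++) {
			unsigned char ch = in[i];

			if (shift >= sizeof(v) * 8)
				return 0; /* value too large */
			v |= (unsigned long)(ch & 0x7f) << shift;
			shift += 7;
			used++;
			if (!(ch & 0x80)) {
				*value = v;
				return used;
			}
		}
	}
}
```

A three-byte value that starts two bytes before a window boundary costs
two calls here, where a per-byte interface would need three.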