Re: [PATCH] pack-objects: re-validate data we copy from elsewhere.

Junio C Hamano <junkio@xxxxxxx> wrote:
> It might be worthwhile to disable revalidating reused objects
> individually and instead scan and checksum the entire .pack file
> when the number of objects being reused exceeds a certain
> threshold, relative to the number of objects in the existing pack,
> perhaps.

Correct me if I'm wrong, but wasn't this revalidation check added
because the SHA1 of the pack was correct but there was a bad bit
in the zlib stream?

If we are trying to detect such an error before removing the possibly
valid pack, how are we supposed to do that if we bypass the check
on larger packs?


I think the better thing to do here is to not repack objects which
are already contained in very large packs.  Just leave them be.

If the pack you are about to copy an object out of is over 25 MiB,
you aren't outputting to stdout, and the object isn't needed as a
delta base in the new pack, then don't copy it.  Introduce a new
flag to git-pack-objects, such as "--max-source-pack-size=100"
(a value in MiB), which can be used to change this 25 MiB threshold;
setting it to 0 would act as "-a" does today.
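
To make the rule above concrete, here is a minimal sketch of the
decision in Python.  It is only an illustration of the proposal:
the function name, its parameters and the constants are hypothetical,
and --max-source-pack-size is the flag suggested in this message,
not something git-pack-objects has today.

```python
MIB = 1024 * 1024

# Proposed default threshold from this message, in MiB (an assumption,
# not an existing git setting).
DEFAULT_MAX_SOURCE_PACK_MIB = 25


def should_copy_object(source_pack_bytes, to_stdout, needed_as_delta_base,
                       max_source_pack_mib=DEFAULT_MAX_SOURCE_PACK_MIB):
    """Return True if the object should be copied into the new pack."""
    # A threshold of 0 disables the size check, i.e. behaves like "-a".
    if max_source_pack_mib == 0:
        return True
    # Output to stdout, or an object needed as a delta base in the new
    # pack, is always copied regardless of the source pack size.
    if to_stdout or needed_as_delta_base:
        return True
    # Otherwise leave objects alone when their source pack is over the
    # threshold; copy them only from packs at or below it.
    return source_pack_bytes <= max_source_pack_mib * MIB
```

With the default, an object sitting in a 30 MiB pack is left where
it is unless it is needed as a delta base or the output goes to
stdout; raising the threshold to 100 would copy it.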

This way users can run 'git repack -a -d' as though it were free
and, much less frequently (such as once a year), combine their
medium-sized packs using a larger maximum threshold, while still
leaving their really large historical packs alone.

Note that you are never bypassing the deflate validation; before
copying an object you *always* validate it is correct, even if the
source pack SHA1 is correct.  But this time-consuming validation
should not be a big issue, as users shouldn't repack very large
packs very frequently with this strategy.  E.g. some kernel devs
might repack once a year with --max-source-pack-size=512 (512 MiB)
but during normal use accept the 25 MiB default and the slightly
larger number of small packs that result.
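
The "always validate before copying" step can be sketched roughly
as follows.  This is not how git-pack-objects implements it; it is
a small Python illustration using the standard zlib module, and the
function names and the expected_size parameter are made up for the
example.

```python
import zlib


def deflate_stream_is_valid(deflated, expected_size):
    """Inflate the stream fully and check it yields expected_size bytes."""
    try:
        # zlib.decompress raises zlib.error on a damaged or truncated
        # stream, including a mismatch in the trailing Adler-32 check.
        inflated = zlib.decompress(deflated)
    except zlib.error:
        return False
    return len(inflated) == expected_size


def copy_if_valid(deflated, expected_size, out):
    """Copy the deflated bytes verbatim, but only after validating them."""
    if not deflate_stream_is_valid(deflated, expected_size):
        raise ValueError("corrupt deflate stream; refusing to copy")
    out.write(deflated)
```

The point of the sketch is that the check runs on every copied
object, independent of whether the checksum over the whole source
pack matched.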

-- 
Shawn.
