Re: [RFC Patch] Preventing corrupt objects from entering the repository

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Feb 13, 2008 at 02:42:09AM -0500, Shawn O. Pearce wrote:
> Nicolas Pitre <nico@xxxxxxx> wrote:
> > I think this is a good idea to always have some sanity checks on any 
> > incoming objects so to make sure they're well formed and valid before 
> > giving them a SHA1 value, and bail out as soon as any error is found.  
> 
> When we get the raw data for an object so we can compute its SHA-1
> and/or write its loose object to disk we should first verify its
> content is sane, then do the SHA-1/store loose.
> 
[...]
> > As to making sure those objects are well connected... well this is a 
> > technically different issue entirely, and I wonder if a special mode to 
> > fsck might not be a better solution.
> 
> Nah, just do what quickfetch does in builtin-fetch.c, but run it
> in receive-pack, between unpack() and execute_commands():
> 
> 	rev-list --quiet --objects $new... --not --all
> 
> If it aborts, reachability testing failed and the push is rejected
> without updating any refs.  Yes your repository now has objects
> that are missing things, but none of those are considered to be
> reachable, so this isn't a big deal.  They will get cleaned up on
> the next `gc --prune`, whenever that is.

This would mean, that we must make git-rev-list and git-pack-objects
not segfault on incorrect links between objects.

> In this configuration (--quiet) rev-list tries to be pretty low
> on its memory usage, it doesn't save buffers, etc.  Further since
> everything that is already considered reachable is not interesting,
> we are only doing a walk over the objects that we just received,
> not our entire ODB.  Its also after index-pack exited, so we just
> freed up a good chunk of memory.
> 
> Rememeber we are talking about receive-pack here.  The cost on
> the to perform the rev-list is lower than the cost will be to pack
> these objects for distribution back to just one client.  Since this
> is a server of some sorts (otherwise why did you push here?), odds
> are its going to be doing a lot of packing requests for clients to
> receive these newly uploaded objects by the native git protocol.
> This new rev-list is nothing compared to that already existing load.
> And if your OS is any good the just created .idx and .pack is still
> in OS buffer cache, so there shouldn't be any additional disk IO.
> 
> Yes, we could make this optional in receive-pack, but really I don't
> see a reason to.  Just run it.  The client shouldn't be giving us
> unreachable crap.

Looks sane to me.

mfg Martin Kögler
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux