On Wed, Feb 13, 2008 at 02:42:09AM -0500, Shawn O. Pearce wrote: > Nicolas Pitre <nico@xxxxxxx> wrote: > > I think this is a good idea to always have some sanity checks on any > > incoming objects so to make sure they're well formed and valid before > > giving them a SHA1 value, and bail out as soon as any error is found. > > When we get the raw data for an object so we can compute its SHA-1 > and/or write its loose object to disk we should first verify its > content is sane, then do the SHA-1/store loose. > [...] > > As to making sure those objects are well connected... well this is a > > technically different issue entirely, and I wonder if a special mode to > > fsck might not be a better solution. > > Nah, just do what quickfetch does in builtin-fetch.c, but run it > in receive-pack, between unpack() and execute_commands(): > > rev-list --quiet --objects $new... --not --all > > If it aborts, reachability testing failed and the push is rejected > without updating any refs. Yes your repository now has objects > that are missing things, but none of those are considered to be > reachable, so this isn't a big deal. They will get cleaned up on > the next `gc --prune`, whenever that is. This would mean, that we must make git-rev-list and git-pack-objects not segfault on incorrect links between objects. > In this configuration (--quiet) rev-list tries to be pretty low > on its memory usage, it doesn't save buffers, etc. Further since > everything that is already considered reachable is not interesting, > we are only doing a walk over the objects that we just received, > not our entire ODB. Its also after index-pack exited, so we just > freed up a good chunk of memory. > > Rememeber we are talking about receive-pack here. The cost on > the to perform the rev-list is lower than the cost will be to pack > these objects for distribution back to just one client. Since this > is a server of some sorts (otherwise why did you push here?), odds > are its going to be doing a lot of packing requests for clients to > receive these newly uploaded objects by the native git protocol. > This new rev-list is nothing compared to that already existing load. > And if your OS is any good the just created .idx and .pack is still > in OS buffer cache, so there shouldn't be any additional disk IO. > > Yes, we could make this optional in receive-pack, but really I don't > see a reason to. Just run it. The client shouldn't be giving us > unreachable crap. Looks sane to me. mfg Martin Kögler - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html