Re: [PATCH 4/6] receive-pack: quarantine objects until pre-receive accepts

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Oct 1, 2016 at 11:12 AM, Jeff King <peff@xxxxxxxx> wrote:
> On Fri, Sep 30, 2016 at 03:36:30PM -0400, Jeff King wrote:
>
>> @@ -1639,6 +1666,18 @@ static const char *unpack(int err_fd, struct shallow_info *si)
>>               argv_array_push(&child.args, alt_shallow_file);
>>       }
>>
>> +     tmp_objdir = tmp_objdir_create();
>> +     if (!tmp_objdir)
>> +             return "unable to create temporary object directory";
>> +     child.env = tmp_objdir_env(tmp_objdir);
>
> One thing to note here: this new code kicks in all the time. My
> reasoning was that there's basically no time you _wouldn't_ want it to,
> and certainly that was the case for us when I wrote it. But I tried to
> think of user-visible changes. Here's what I came up with:
>
>   - we currently leave the tmp_pack_* for a failed push sitting around
>     (e.g., if the client hangs up halfway through, or index-pack rejects
>     the pack for some reason). But with this series, it would always be
>     cleaned up. That's a very good thing if you're running a git hosting
>     site. It might make things harder if you're debugging.
>
>     I don't think it's a good reason not to enable this by default, but
>     it _could_ be a reason to have a config switch to turn it off
>     temporarily (or just leave the "incoming-*" directory in place).
>
>   - the environment that pre-receive pack runs in has access to objects
>     that the rest of the repository doesn't. So if you were to do
>     something silly in your pre-receive like:
>
>       # reject the push, but log a copy of the objects
>       git update-ref refs/rejected/$(date +%s) $new_sha1
>       exit 1
>
>     Then your ref-update would succeed (you have $new_sha1), but the
>     objects would be deleted immediately afterwards. I find this a
>     somewhat questionable pattern, and I have no idea if anybody else
>     has thought of it. But it _does_ work today, and not with this
>     series.
>
> I don't think it would be too hard to put a config conditional around
> this tmp_objdir_create(). And then all of the tmp_objdir_env() calls
> would just return NULL, and effectively become noops.

Very late reply, but I have a question about this. Is there anything
you can do on the plumbing level to figure out which area an object is
in (of course that's not mutually exclusive).

The use-case for that is e.g. having a hook that rejects large
binaries, but has an exception for a binary that's been added in the
past, before your change there's no optimal way to find that out,
you'd need to go over the whole history and list all blobs that have
ever been added, with your change it would be trivial if something
could give me "this object is in the quarantine area", but AFAICT
there's no API that'll show me that.

Also, I think this whole thing could really do with some documentation
in githooks(5). E.g. what hooks does it apply for? The test is just
for pre-receive but the patch changes run_update_hook(), does it also
take effect for update? Ditto the caveat about update-ref.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]