Re: [PATCH v2] receive-pack: not receive pack file with large object

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Sep 30 2021, Han Xin wrote:

> From: Han Xin <hanxin.hx@xxxxxxxxxxxxxxx>
>
> In addition to using 'receive.maxInputSize' to limit the overall size
> of the received packfile, a new config variable
> 'receive.maxInputObjectSize' is added to limit the push of a single
> object larger than this threshold.

Maybe an unfair knee-jerk reaction: I think we should really be pushing
this sort of thing into pre-receive hooks and/or the proc-receive hook,
i.e. see 15d3af5e22e (receive-pack: add new proc-receive hook,
2020-08-27).

The latter pre-dates c08db5a2d0d (receive-pack: allow a maximum input
size to be specified, 2016-08-24), which may or may not be relevant
here.

Anyway, I think there may be dragons here that you haven't
considered. Is the "size" here the absolute size on disk, or the delta
size (I'm offhand not familiar enough with unpack-objects.c to
know). Does this have the same semantics no matter the
transfer.unpackLimit?

Either way, if it's the absolute size you may have a 100MB object that's
a fantastic delta candidate, so it may only add a few bytes to your
repo, or it's /dev/urandom output and really is adding 100MB.

If we're relying on deltas there's no guarantee that what the client is
sending is delta-ing against something we can delta against, although
admittedly this is also an issue with receive.maxInputSize.

Anyway, all of this is stuff that people "in the wild" don't consider
already, so maybe I'm being too much of a curmudgeon here :)

At an ex-job I re-wrote some "big push" blocking hook that had had N
iterations of other implementations getting it subtly wrong. I.e. if you
define "size" the wrong way you end up blocking things like the revert
that reverts "master" to yesterday's commit.

That's somehing that takes git almost no size at all to store, but which
depending on the stupidity of the hook that's on the other end may be
blocked as a "big push".

So I think if we're going to expand on the existing
"receive.maxInputSize" we should really be considering the edge cases
carefully.

But even then I'm somewhat skeptical of the benefit of doing this in
git's own guts v.s. a hook, or expanding the hook infrastructure to
accommodate it, we have "receive.maxInputSize", now maybe
"receive.maxInputObjectSize", then after that perhaps
"receive.maxInputBinaryObjectSize",
"receive.maxInputObjectSizeGlob=*.mp3" etc.

I'm not saying our hook infrastructure is good enough for this right
now, but surely anything that would be wanted as a check in this area
could be done by something that "git cat-file --batch"-style feeds OIDs
to a new hook over stdin from "index-pack" and "unpack-objects"? With
that hook aware of the temporary staged objects in the incoming-* area
of the store?. I.e. if the performance aspect of not waiting until
"pre-receive" time is a concern...



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux