Re: [PATCH 7/7] commit-graph.c: introduce '--[no-]check-oids'

On Fri, Apr 24, 2020 at 12:59:57PM +0200, SZEDER Gábor wrote:
> On Wed, Apr 22, 2020 at 05:39:30PM -0600, Taylor Blau wrote:
> > > > diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt
> > > > index 46f7f7c573..91e8027b86 100644
> > > > --- a/Documentation/git-commit-graph.txt
> > > > +++ b/Documentation/git-commit-graph.txt
> > > > @@ -82,6 +82,11 @@ tip with the previous tip.
> > > >  Finally, if `--expire-time=<datetime>` is not specified, let `datetime`
> > > >  be the current time. After writing the split commit-graph, delete all
> > > >  unused commit-graph whose modified times are older than `datetime`.
> > > > ++
> > > > +The `--[no-]check-oids` option decides whether or not OIDs are required
> > > > +to be commits. By default, `--check-oids` is implied, generating an
> > > > +error on non-commit objects. If `--no-check-oids` is given, non-commits
> > > > +are silently discarded.
> > >
> > > What happens with OIDs of tags, in particular with OIDs of tags that
> > > can be peeled down to commit objects?  According to (my (too
> > > pedantic?) interpretation of) the above description they will trigger
> > > an error with '--check-oids' or will be ignored with
> > > '--no-check-oids'.  The implementation, however, accepts those oids
> > > and peels them down to commit objects; I think this is the right
> > > behaviour.
> >
> > That's right, and certainly merits a mention in the documentation. I've
> > added that...
> >
> > > What happens with OIDs that name non-existing objects?
> >
> > ...these are silently discarded. I think that you could make a
> > compelling argument in either direction on this one, but I'm slightly
> > swayed towards "discard these, too", since '--no-check-oids' is
> > literally saying "don't check these".
>
> I don't want to argue either way, but I'd argue for making a conscious
> decision that is justified in the commit message and documented in the
> docs.

Me neither. I very much welcome your consistently thoughtful replies
(even if my extreme delay in responding to this one would suggest
otherwise... ;)).

> So, the option is '--stdin-commits' or '--input=stdin-commits', but
> it's not only about commits.  Now, allowing OIDs of tags pointing to
> commits and peeling them makes obvious sense, because we want commits
> reachable from those tags included in the commit-graph file.  Allowing
> OIDs of tags pointing to non-commits and silently ignoring them makes
> sense (to better support your 'git f-e-r ... | git c-g write ...' use
> case), though it's not that obvious (after all I managed to overlook
> it back then, that's why we are now here discussing these
> '--check-oids' patches).
>
> But I'm not sure about silently ignoring OIDs pointing to non-existent
> objects, because those might be a sign of some issues in whatever is
> generating the list of objects to be fed to 'git c-g write'.  E.g. there
> could be a race between 'git for-each-ref' listing OIDs and some other
> processes pruning them.  Does this worth worrying about?  Dunno...
> but at least let's think it through, and record in the commit message
> why we made that decision, whatever that decision might be.

Yeah, I think the most reasonable behavior is for 'git commit-graph
write --stdin-commits' to always complain about non-existent objects,
whether or not '--[no-]check-oids' is given.

But, let's step back for a minute. What are we actually hoping to
accomplish with '--check-oids'? I wrote this patch because I wanted a
way to support 'git for-each-ref' piping into 'git commit-graph write'
without having to juggle which tags peel down to commits and which
don't.

Now, I figured that it would be unreasonable to change the default
behavior of 'git commit-graph write --stdin-commits' (which is to
complain and error out in this circumstance), so I added
'--no-check-oids' as an opt-out for callers that don't want that
behavior.

But, are there ever any callers that *wouldn't* want this behavior? As
far as I can tell, probably not. We're only going to be permitting
*more* inputs to 'git commit-graph write', and I seriously doubt that
anybody is depending on the above behavior. (Of course, if that's not
the case, I'd love for somebody to speak up here, and we can stay the
course on this patch.)

So, I propose the following:

  * We drop the idea of '--[no-]{check,verify}-oids', and always
    silently ignore non-commit inputs, retaining the existing behavior
    of always complaining about things that aren't valid hex OIDs, such
    as "HEAD".

  * We always error out on missing or corrupt commit OIDs, including
    valid OIDs that don't resolve to any object, or resolve to a tag
    that can't be fully peeled.
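
To make the two bullets concrete, here is a rough per-line sketch of the
proposed behavior, in Python rather than C for brevity. It is only a
model, not the implementation: `resolve_input`, `InputError`, and the
in-memory `objects` dict standing in for the object store are all
hypothetical names, and it assumes 40-character SHA-1 hex OIDs.

```python
import re

# Assumption: SHA-1 sized hex object IDs (40 lowercase hex digits).
HEX_OID = re.compile(r"[0-9a-f]{40}")


class InputError(Exception):
    """Hypothetical error for input that should make the write fail."""


def resolve_input(line, objects):
    """Return the commit OID to include, or None to silently ignore.

    `objects` is a hypothetical stand-in for the object store. It maps
    an OID to a (type, target) pair: ("commit", None), ("tree", None),
    ("blob", None), or ("tag", oid_of_tagged_object).
    """
    oid = line.strip()

    # Things that aren't valid hex OIDs (e.g. "HEAD") are always an error.
    if not HEX_OID.fullmatch(oid):
        raise InputError(f"invalid hex OID: {oid!r}")

    while True:
        entry = objects.get(oid)
        if entry is None:
            # Missing object, or a tag that can't be fully peeled.
            raise InputError(f"missing object: {oid}")
        kind, target = entry
        if kind == "commit":
            return oid
        if kind != "tag":
            # An existing non-commit (tree or blob): silently ignored.
            return None
        # Peel one level of tag and look again.
        oid = target
```

Under this model, a tag that peels to a commit contributes that commit,
a tag that peels to a tree or blob is dropped without complaint, and
only malformed or unresolvable input stops the write.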

Does that seem reasonable?

> > I guess that pushes us into the territory of whether or not "check" is
> > the right verb. "verify"?
>
> Oh, I didn't think about this, but now that you mention it we have
> '--verify' in 'git rev-parse', 'git tag' and elsewhere, and we have
> 'verify-commit', 'verify-path' and 'verify-tag' commands.  So
> '--verify-oids' might be more consistent.  I kind of like the 'oids'
> suffix in the option name, though I don't know what else we might want
> to verify in this command in the future...
>
> > "scrutinize" :)?
>
> Huhh, erm, no ;)
>
> > If you're otherwise satisfied with this series, here's the updated
> > patch.
>
> I haven't yet looked closely at the rest of the series...  The
> documentation update in the updated patch below looks good to me,
> thanks.
>
> > -- >8 --
> >
> > Subject: [PATCH] commit-graph.c: introduce '--[no-]check-oids'
> >
> > When operating on a stream of commit OIDs on stdin, 'git commit-graph
> > write' checks that each OID refers to an object that is indeed a commit.
> > This is convenient to make sure that the given input is well-formed, but
> > can sometimes be undesirable.
> >
> > For example, server operators may wish to feed the full commit object
> > IDs pointed to by refs that were updated during a push to 'git
> > commit-graph write --input=stdin-commits', and silently discard any
> > input that doesn't point at a commit. This can be done by combing the
> > output of 'git for-each-ref' with '--format %(*objecttype)', but this
> > requires opening up a potentially large number of objects.  Instead, it
> > is more convenient to feed the updated object IDs to the commit-graph
> > machinery, and let it throw out whatever remains.  to commits.
>
> Either the bulk of a sentence is missing, or there is a stray(?) "to
> commits." at the end of this line.
>
> > Introduce '--[no-]check-oids' to make such a behavior possible. With
> > '--check-oids' (the default behavior to retain backwards compatibility),
> > 'git commit-graph write' will barf on a non-commit line in its input.
> > With '--no-check-oids', such lines will be silently ignored, making the
> > above possible by specifying this option.
> >
> > No matter which is supplied, 'git commit-graph write' retains the
> > behavior from the previous commit of rejecting non-OID inputs like
> > "HEAD" and "refs/heads/foo" as before.
> >
> > Signed-off-by: Taylor Blau <me@xxxxxxxxxxxx>
> > ---
> >  Documentation/git-commit-graph.txt |  6 ++++++
> >  builtin/commit-graph.c             | 11 ++++++++---
> >  t/t5318-commit-graph.sh            | 28 ++++++++++++++++++++++++++++
> >  3 files changed, 42 insertions(+), 3 deletions(-)
> >
> > diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt
> > index 46f7f7c573..6bdbe42766 100644
> > --- a/Documentation/git-commit-graph.txt
> > +++ b/Documentation/git-commit-graph.txt
> > @@ -82,6 +82,12 @@ tip with the previous tip.
> >  Finally, if `--expire-time=<datetime>` is not specified, let `datetime`
> >  be the current time. After writing the split commit-graph, delete all
> >  unused commit-graph whose modified times are older than `datetime`.
> > ++
> > +The `--[no-]check-oids` option decides whether or not OIDs are required
> > +to be commits. By default, `--check-oids` is implied, generating an
> > +error on non-commit objects. If `--no-check-oids` is given, non-commits
> > +and non-existent objects are silently discarded. In either case, tags
> > +are peeled down to the object they reference.

Thanks,
Taylor


