Re: [PATCH 3/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits'

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Apr 03, 2020 at 09:38:42PM +0200, SZEDER Gábor wrote:
> On Fri, Apr 03, 2020 at 12:49:33PM -0600, Taylor Blau wrote:
> > On Fri, Apr 03, 2020 at 02:30:57PM -0400, Jeff King wrote:
> > > On Mon, Aug 05, 2019 at 10:02:40AM +0200, SZEDER Gábor wrote:
> > >
> > > > While 'git commit-graph write --stdin-commits' expects commit object
> > > > ids as input, it accepts and silently skips over any invalid commit
> > > > object ids, and still exits with success:
> > > >
> > > >   # nonsense
> > > >   $ echo not-a-commit-oid | git commit-graph write --stdin-commits
> > > >   $ echo $?
> > > >   0
> > > >   # sometimes I forgot that refs are not good...
> > > >   $ echo HEAD | git commit-graph write --stdin-commits
> > > >   $ echo $?
> > > >   0
> > > >   # valid tree OID, but not a commit OID
> > > >   $ git rev-parse HEAD^{tree} | git commit-graph write --stdin-commits
> > > >   $ echo $?
> > > >   0
> > > >   $ ls -l .git/objects/info/commit-graph
> > > >   ls: cannot access '.git/objects/info/commit-graph': No such file or directory
> > > >
> > > > Check that all input records are indeed valid commit object ids and
> > > > return with error otherwise, the same way '--stdin-packs' handles
> > > > invalid input; see e103f7276f (commit-graph: return with errors during
> > > > write, 2019-06-12).
> > >
> > > Can you explain more why the old behavior is a problem?
>
> Because when I do:
>
>    # sometimes I forgot that refs are not good...
>    $ echo HEAD | git commit-graph write --stdin-commits
>
> then I get _nothing_: neither an error, nor a commit-graph.
>
> > > For reasons (see
> > > below), we want to do something like:
> > >
> > >   git for-each-ref --format='%(objectname)' |
> > >   git commit-graph write --stdin-commits
> > >
> > > In v2.23 and earlier, that worked exactly like --reachable, but now it
> > > will blow up if there are any refs that point to a non-commit (e.g., a
> > > tag of a blob).
> > >
> > > It can be worked around by asking for %(objecttype) and %(*objecttype)
> > > and grepping the result, but that's awkward and much less efficient
> > > (especially if you have a lot of annotated tags, as we may have to open
> > > and parse each one).
> > >
> > > Now obviously you could just use --reachable for the code above. But
> > > here are two plausible cases where you might not want to do that:
> > >
> > >  - you're limiting the graph to only a subset of refs (e.g., you want to
> > >    graph refs/heads/ and refs/tags, but not refs/some-other-weird-area/).
> > >
> > >  - you're generating an incremental graph update. You know somehow that
> > >    a few refs were updated, and you want to feed those tips to generate
> > >    the incremental, but not the rest of the refs (not because it would
> > >    be wrong to do so, but in the name of keeping it O(size of change)
> > >    and not O(number of refs in the repo).
> > >
> > > The latter is the actual case that bit us. I suppose one could do
> > > something like:
> > >
> > >   git rev-list --no-walk <maybe-commits |
> > >   git commit-graph write --stdin-commits
> > >
> > > to use rev-list as a filter, but that feels kind of baroque.
> > >
> > > Normally I'm in favor of more error checking instead of less, but in
> > > this case it feels like it's putting scripted use at a disadvantage
> > > versus the internal code (e.g., the auto-write for git-fetch uses the
> > > "--reachable" semantics for its internal invocation).
> >
> > For what it's worth, (and in case it wasn't obvious) this came about
> > because we feed '--stdin-commits' at GitHub, and observed exactly this
> > error case. I wasn't sure what approach would be more palatable, so I
> > prepared both in my fork at https://github.com/ttaylorr/git:
> >
> >   - Branch 'tb/commit-graph-dont-check-oids' drops this checking
> >     entirely.
>
> Oh, no :)

Heh, I figured as much when Peff pointed me towards this thread ;-).

> >   - Branch 'tb/commit-graph-check-oids-option' adds a
> >     '--[no-]check-oids', in case that this is generally desirable
> >     behavior, by offering an opt-out of this OID checking.
> >
> > Please let me know if you find either of these to be good options, and
> > I'll happily send one of them to the list. Thanks.
>
> Or introduce 'git commit-graph write --stdin-refs'?  Or teach
> '--stdin-commits' to DWIM and accept and parse refs?  Though the
> question still remains what to do with refs that can't be peeled back
> to commits

How would '--stdin-refs' behave differently? I assume that it'd handle
going from 'refs/heads/blah' to an OID, but I think that's only kicking
the problem a little further down the road. Of course, you note the same
problem which is: what do we do when we can't resolve down to a commit
object.

I think that it seems the '--[no-]check-oids' may be the best of both
worlds, which allows for (mostly ;)) easy scripting around 'git
for-each-ref'.

> > > -Peff
> > >
> > > PS As an aside, I think the internal git-fetch write could benefit from
> > >    this same trick: feed the set of newly-stored ref tips to the
> > >    commit-graph machinery, rather than using for_each_ref().
> >
> > Thanks,
> > Taylor
Thanks,
Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux