Re: [PATCH v3 00/33] nd/sha1-name-c-wo-the-repository updates

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Apr 13, 2019 at 7:14 PM SZEDER Gábor <szeder.dev@xxxxxxxxx> wrote:
>
> On Fri, Apr 12, 2019 at 04:25:08PM +0200, Johannes Schindelin wrote:
> > Hi,
> >
> > On Fri, 12 Apr 2019, SZEDER Gábor wrote:
> >
> > > On Thu, Apr 11, 2019 at 10:58:57PM +0200, SZEDER Gábor wrote:
> > > > On Thu, Apr 11, 2019 at 10:51:46PM +0200, SZEDER Gábor wrote:
> > > > > On Wed, Apr 10, 2019 at 10:56:52PM +0200, Johannes Schindelin wrote:
> > > > > > >         ...>}
> > > > > > >
> > > > > > >       @@
> > > > > > >     + expression c;
> > > > > > >     ++expression r;
> > > > > > >     + expression s;
> > > > > > >     + @@
> > > > > > >     +-- get_commit_tree(c) = s
> > > > > > >     ++- repo_get_commit_tree(r, c) = s
> > > > > > >     + + c->maybe_tree = s
> > > > > >
> > > > > > I think this is wrong, and admittedly I had the very same version
> > > > > > originally.
> > > > > >
> > > > > > When you have an arbitrary `r` in any `repo_get_commit_tree(r, c)` (as
> > > > > > opposed to `the_repository`), the conversion to `c->maybe_tree` is most
> > > > > > likely incorrect.
> > > > > >
> > > > > > Therefore, I don't think that we can do that.
> > > > >
> > > > > So, as far as I understand, the goal of these 'c->maybe_tree'-related
> > > > > semantic patches is to prevent "generic" parts of Git from accessing
> > > > > this field directly, as it might not be initialized in a
> > > > > commit-graph-enabled repository.
> > > > >
> > > > > Only three functions are explicitly exempt, while this last semantic
> > > > > patch in question implicitly allows a few more that assign a value to
> > > > > 'c->maybe_tree'.  These functions are release_commit_memory() and
> > > > > parse_commit_buffer() in 'commit.c' and fill_commit_in_graph() in
> > > > > 'commit-graph.c',
> > >
> > > ... and make_virtual_commit() in 'merge-recursive.c'.
> > >
> > > > and after a quick look these functions seem to be
> > > > > rather fundamenal in the life-cycle of a commit object.
> > > >
> > > > Erm, not "commit object"; I meant the life-cycle of a 'struct commit'
> > > > instance.
> > > >
> > > > > I think they deserve to be explicitly exempted, too, and then we could
> > > > > remove this last semantic patch altogether.
> > >
> > > And it would look like this.  Yeah, that's a very long line there, but
> > > I don't think we can break it up.
> > >
> > >   -- >8 --
> > >
> > > diff --git a/contrib/coccinelle/commit.cocci b/contrib/coccinelle/commit.cocci
> > > index 57c8f71479..fe814f313e 100644
> > > --- a/contrib/coccinelle/commit.cocci
> > > +++ b/contrib/coccinelle/commit.cocci
> > > @@ -10,20 +10,15 @@ expression c;
> > >  - c->maybe_tree->object.oid.hash
> > >  + get_commit_tree_oid(c)->hash
> > >
> > > -// These excluded functions must access c->maybe_tree direcly.
> > > +// These excluded functions must access/modify c->maybe_tree direcly.
> > > +// Note that if c->maybe_tree is written somewhere outside of these
> > > +// functions, then the recommended transformation will be bogus with
> > > +// repo_get_commit_tree() on the LHS.
> > >  @@
> > > -identifier f !~ "^(repo_get_commit_tree|get_commit_tree_in_graph_one|load_tree_for_commit)$";
> > > -expression c;
> > > +identifier f !~ "^(repo_get_commit_tree|get_commit_tree_in_graph_one|load_tree_for_commit|fill_commit_in_graph|parse_commit_buffer|release_commit_memory|make_virtual_commit)$";
> >
> > Hahahaha! That's *really* long.
> >
> > And a good indicator that this should be hidden in a single helper
> > function (`set_commit_tree()`, file-local of course) that is exempted in
> > the cocci patch.
>
> Note that this is not only about line length, and consider the slight
> differences between the three approaches:
>
>   - Currently only direct read accesses to 'c->maybe_tree' outside of
>     the listed functions are forbidden and transformed, but still any
>     function can set this field directly (thanks to the last semantic
>     patch in 'commit.cocci').
>
>   - Encapsulating writes in set_commit_tree() and adding this function
>     to that list would prevent other functions from setting
>     'c->maybe_tree' directly, but still any function could set it
>     indirectly by calling set_commit_tree().
>
>   - With that loooong line only those few listed special functions
>     would be able read or write 'c->maybe_tree'.
>
> Does the additional restriction of the long line variant bring us
> benefits?  Well, not sure.
>
> The root tree of a commit is needed in many places, and in the past we
> got used to it being always initialized in any 'struct commit'
> instance.  However, with the commit graph that's not the case anymore,
> and any read accesses to the uninitialized root tree object would have
> bad consequences.  That's why there is get_commit_tree() helper
> performing lazy-initialization, and the protection from direct reads
> in the form of the semantic patch.

And I think set_commit_tree() is a good way to go. Basically
"maybe_tree" is not safe to be manipulated directly because of traps
and pitfalls. If we have one entry point to update maybe_tree, we can
handle all that in one place (even though I don't foresee any special
handling when updating it). Keeping commit.cocci shorter is just a
side bonus.
-- 
Duy




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux