On Sat, Apr 13, 2019 at 7:14 PM SZEDER Gábor <szeder.dev@xxxxxxxxx> wrote: > > On Fri, Apr 12, 2019 at 04:25:08PM +0200, Johannes Schindelin wrote: > > Hi, > > > > On Fri, 12 Apr 2019, SZEDER Gábor wrote: > > > > > On Thu, Apr 11, 2019 at 10:58:57PM +0200, SZEDER Gábor wrote: > > > > On Thu, Apr 11, 2019 at 10:51:46PM +0200, SZEDER Gábor wrote: > > > > > On Wed, Apr 10, 2019 at 10:56:52PM +0200, Johannes Schindelin wrote: > > > > > > > ...>} > > > > > > > > > > > > > > @@ > > > > > > > + expression c; > > > > > > > ++expression r; > > > > > > > + expression s; > > > > > > > + @@ > > > > > > > +-- get_commit_tree(c) = s > > > > > > > ++- repo_get_commit_tree(r, c) = s > > > > > > > + + c->maybe_tree = s > > > > > > > > > > > > I think this is wrong, and admittedly I had the very same version > > > > > > originally. > > > > > > > > > > > > When you have an arbitrary `r` in any `repo_get_commit_tree(r, c)` (as > > > > > > opposed to `the_repository`), the conversion to `c->maybe_tree` is most > > > > > > likely incorrect. > > > > > > > > > > > > Therefore, I don't think that we can do that. > > > > > > > > > > So, as far as I understand, the goal of these 'c->maybe_tree'-related > > > > > semantic patches is to prevent "generic" parts of Git from accessing > > > > > this field directly, as it might not be initialized in a > > > > > commit-graph-enabled repository. > > > > > > > > > > Only three functions are explicitly exempt, while this last semantic > > > > > patch in question implicitly allows a few more that assign a value to > > > > > 'c->maybe_tree'. These functions are release_commit_memory() and > > > > > parse_commit_buffer() in 'commit.c' and fill_commit_in_graph() in > > > > > 'commit-graph.c', > > > > > > ... and make_virtual_commit() in 'merge-recursive.c'. > > > > > > > and after a quick look these functions seem to be > > > > > rather fundamenal in the life-cycle of a commit object. > > > > > > > > Erm, not "commit object"; I meant the life-cycle of a 'struct commit' > > > > instance. > > > > > > > > > I think they deserve to be explicitly exempted, too, and then we could > > > > > remove this last semantic patch altogether. > > > > > > And it would look like this. Yeah, that's a very long line there, but > > > I don't think we can break it up. > > > > > > -- >8 -- > > > > > > diff --git a/contrib/coccinelle/commit.cocci b/contrib/coccinelle/commit.cocci > > > index 57c8f71479..fe814f313e 100644 > > > --- a/contrib/coccinelle/commit.cocci > > > +++ b/contrib/coccinelle/commit.cocci > > > @@ -10,20 +10,15 @@ expression c; > > > - c->maybe_tree->object.oid.hash > > > + get_commit_tree_oid(c)->hash > > > > > > -// These excluded functions must access c->maybe_tree direcly. > > > +// These excluded functions must access/modify c->maybe_tree direcly. > > > +// Note that if c->maybe_tree is written somewhere outside of these > > > +// functions, then the recommended transformation will be bogus with > > > +// repo_get_commit_tree() on the LHS. > > > @@ > > > -identifier f !~ "^(repo_get_commit_tree|get_commit_tree_in_graph_one|load_tree_for_commit)$"; > > > -expression c; > > > +identifier f !~ "^(repo_get_commit_tree|get_commit_tree_in_graph_one|load_tree_for_commit|fill_commit_in_graph|parse_commit_buffer|release_commit_memory|make_virtual_commit)$"; > > > > Hahahaha! That's *really* long. > > > > And a good indicator that this should be hidden in a single helper > > function (`set_commit_tree()`, file-local of course) that is exempted in > > the cocci patch. > > Note that this is not only about line length, and consider the slight > differences between the three approaches: > > - Currently only direct read accesses to 'c->maybe_tree' outside of > the listed functions are forbidden and transformed, but still any > function can set this field directly (thanks to the last semantic > patch in 'commit.cocci'). > > - Encapsulating writes in set_commit_tree() and adding this function > to that list would prevent other functions from setting > 'c->maybe_tree' directly, but still any function could set it > indirectly by calling set_commit_tree(). > > - With that loooong line only those few listed special functions > would be able read or write 'c->maybe_tree'. > > Does the additional restriction of the long line variant bring us > benefits? Well, not sure. > > The root tree of a commit is needed in many places, and in the past we > got used to it being always initialized in any 'struct commit' > instance. However, with the commit graph that's not the case anymore, > and any read accesses to the uninitialized root tree object would have > bad consequences. That's why there is get_commit_tree() helper > performing lazy-initialization, and the protection from direct reads > in the form of the semantic patch. And I think set_commit_tree() is a good way to go. Basically "maybe_tree" is not safe to be manipulated directly because of traps and pitfalls. If we have one entry point to update maybe_tree, we can handle all that in one place (even though I don't foresee any special handling when updating it). Keeping commit.cocci shorter is just a side bonus. -- Duy