On Tue, Sep 28, 2021 at 07:40:14PM -0400, Jeff King wrote: > On Mon, Sep 27, 2021 at 11:46:40PM -0700, Elijah Newren wrote: > > > > * en/remerge-diff (2021-08-31) 7 commits > > > - doc/diff-options: explain the new --remerge-diff option > > > - show, log: provide a --remerge-diff capability > > > - tmp-objdir: new API for creating and removing primary object dirs > > > - merge-ort: capture and print ll-merge warnings in our preferred fashion > > > - ll-merge: add API for capturing warnings in a strbuf instead of stderr > > > - merge-ort: add ability to record conflict messages in a file > > > - merge-ort: mark a few more conflict messages as omittable > > > > > > A new presentation for two-parent merge "--remerge-diff" can be > > > used to show the difference between mechanical (and possibly > > > conflicted) merge results and the recorded resolution. > > > > > > Will merge to 'next'? > > > > It has been a month that it's been cooking with no issues brought up, > > and it's been in production for nearly a year... > > > > But just this morning I pinged peff and jrnieder if they might have > > time to respectively look at the tmp-objdir stuff (patch 5, plus its > > integration into log-tree.c in patch 7) and the ll-merge.[ch] changes > > (patch 3). I don't know if either will have time to do it, but > > perhaps wait half a week or so to see if they'll mention they have > > time? Otherwise, yeah, it's probably time to merge this down. > > Sorry to take so long. I think this is a very exciting topic, and I > appreciate being called into to look at tmp-objdir stuff, because it can > be quite subtle. > > I just left a rather long-ish mail in the thread, but the gist of it is > that I'm worried that there's some possibility of corruption there if > the diff code writes objects. I didn't do a proof-of-concept there, but > I worked one up just now. Try this: > > # make a repo > git init repo > cd repo > echo base >file > git add file > git commit -m base > > # two diverging branches > echo main >file > git commit -am main > git checkout -b side HEAD^ > echo side >file > git commit -am side > > # we get a conflict, and we resolve > git merge main > echo resolved >file > git commit -am merged > > # now imagine we had a file with a diff driver. I stuffed it > # in here after the fact, but it could have been here all along, > # or come as part of the merge, etc. > echo whatever >unrelated > echo "unrelated diff=foo" >.gitattributes > git add . > git commit --amend --no-edit > > # set up the diff driver not just to do a textconv, but to cache the > # result. This will require writing out new objects for the cache > # as part of the diff operation. > git config diff.foo.textconv "$PWD/upcase" > git config diff.foo.cachetextconv true > cat >upcase <<\EOF && > #!/bin/sh > tr a-z A-Z <$1 > EOF > chmod +x upcase > > # now show the diff > git log -1 --remerge-diff > > # and make sure the repo is still OK > git fsck > > The remerge diff will correctly show the textconv'd content (because > it's not in either parent, and hence an evil merge): > > diff --git a/unrelated b/unrelated > new file mode 100644 > index 0000000..982793c > --- /dev/null > +++ b/unrelated > @@ -0,0 +1 @@ > +WHATEVER > > but then fsck shows that our cache is corrupted: > > Checking object directories: 100% (256/256), done. > error: refs/notes/textconv/foo: invalid sha1 pointer 1e9b3ecffca73411001043fe914a7ec0e7291043 > error: refs/notes/textconv/foo: invalid reflog entry 1e9b3ecffca73411001043fe914a7ec0e7291043 > dangling tree 569b6e414203d2f50bdaafc1585eae11e28afc37 > > Now I'll admit the textconv-cache is a pretty seldom-used feature. And > that there _probably_ aren't a lot of other ways that the diff code > would try to write objects or references. But I think it speaks to the > kind of subtle interactions we should worry about (and certainly > corrupting the repository is a pretty bad outcome, though at least in > this case it's "just" a cache and we could blow away that ref manually). > > -Peff It seems to me that one problem is that the new "primary" objdir code doesn't disable ref updates since the GIT_QUARANTINE_ENVIRONMENT setting isn't set. If we fix that we should be forbidding ref updates. I've included a path that fixes my test case. This is on top of: https://lore.kernel.org/git/CANQDOddqwVtWfC4eEP3fJB4sUiszGX8bLqoEVLcMf=v+jzx19g@xxxxxxxxxxxxxx/ >From 38be7f6d31da8df3f205b35d25dd0a505aa75a8a Mon Sep 17 00:00:00 2001 From: Neeraj Singh <neerajsi@xxxxxxxxxxxxx> Date: Wed, 29 Sep 2021 11:24:05 -0700 Subject: [PATCH] tmp-objdir: disable ref updates when replacing the primary odb When creating a subprocess with a temporary ODB, we set the GIT_QUARANTINE_ENVIRONMENT env var to tell child Git processes not to update refs, since the tmp-objdir may go away. Introduce a similar mechanism for in-process temporary ODBs when we call tmp_objdir_replace_primary_odb. Peff's test case was invoking ref updates via the cachetextconv setting. That particular code silently does nothing when a ref update is forbidden Reported-by: Jeff King <peff@xxxxxxxx> Signed-off-by: Neeraj Singh <neerajsi@xxxxxxxxxxxxx> --- object-file.c | 7 +++++++ refs.c | 22 +++++++++++++++++++++- refs.h | 5 +++++ 3 files changed, 33 insertions(+), 1 deletion(-) diff --git a/object-file.c b/object-file.c index 990381abee..e3e5cec3c8 100644 --- a/object-file.c +++ b/object-file.c @@ -770,6 +770,12 @@ struct object_directory *set_temporary_primary_odb(const char *dir, int will_des new_odb->will_destroy = will_destroy; new_odb->next = the_repository->objects->odb; the_repository->objects->odb = new_odb; + + /* + * Disable ref updates while a temporary odb is active, since + * the objects in the database may roll back. + */ + refs_disable_updates(); return new_odb->next; } @@ -786,6 +792,7 @@ void restore_primary_odb(struct object_directory *restore_odb, const char *old_p the_repository->objects->odb = restore_odb; free_object_directory(cur_odb); + refs_enable_updates(); } /* diff --git a/refs.c b/refs.c index 8b9f7c3a80..98026b7341 100644 --- a/refs.c +++ b/refs.c @@ -19,6 +19,8 @@ #include "repository.h" #include "sigchain.h" +static int ref_update_disabled_count; + /* * List of all available backends */ @@ -1045,6 +1047,24 @@ void ref_transaction_free(struct ref_transaction *transaction) free(transaction); } +void refs_disable_updates(void) +{ + ++ref_update_disabled_count; +} + +void refs_enable_updates(void) +{ + if (!ref_update_disabled_count) + BUG("ref updates are not disabled"); + + --ref_update_disabled_count; +} + +int ref_updates_are_enabled(void) +{ + return !ref_update_disabled_count; +} + struct ref_update *ref_transaction_add_update( struct ref_transaction *transaction, const char *refname, unsigned int flags, @@ -2126,7 +2146,7 @@ int ref_transaction_prepare(struct ref_transaction *transaction, break; } - if (getenv(GIT_QUARANTINE_ENVIRONMENT)) { + if (getenv(GIT_QUARANTINE_ENVIRONMENT) || !ref_updates_are_enabled()) { strbuf_addstr(err, _("ref updates forbidden inside quarantine environment")); return -1; diff --git a/refs.h b/refs.h index 48970dfc7e..acd7e275c5 100644 --- a/refs.h +++ b/refs.h @@ -840,6 +840,11 @@ int ref_storage_backend_exists(const char *name); struct ref_store *get_main_ref_store(struct repository *r); +/* Helpers to mark updates to the refs as forbidden */ +void refs_disable_updates(void); +void refs_enable_updates(void); +int ref_updates_are_enabled(void); + /** * Submodules * ---------- -- 2.25.1