Re: What's cooking in git.git (Sep 2021, #08; Mon, 27)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Sep 30, 2021 at 1:16 AM Jeff King <peff@xxxxxxxx> wrote:
>
> On Wed, Sep 29, 2021 at 11:43:39AM -0700, Neeraj Singh wrote:
>
> > It seems to me that one problem is that the new "primary" objdir code doesn't
> > disable ref updates since the GIT_QUARANTINE_ENVIRONMENT setting isn't set.
> > If we fix that we should be forbidding ref updates.
> >
> > I've included a path that fixes my test case. This is on top of:
> > https://lore.kernel.org/git/CANQDOddqwVtWfC4eEP3fJB4sUiszGX8bLqoEVLcMf=v+jzx19g@xxxxxxxxxxxxxx/
>
> Ah, right, I forgot we had that "forbid ref updates in quarantine" magic
> (despite being the one who added it).
>
> I do think this improves the safety, but things are still a bit more
> dangerous because we're handling it all in a single process, which sees
> both the quarantine and non-quarantine states. I wrote more details in
> this reply a few minutes ago:
>
>   https://lore.kernel.org/git/YVVmssXlaFM6yD5W@xxxxxxxxxxxxxxxxxxxxxxx/
>
> (sorry, it's long; search for the paragraph starting with "Whereas doing
> it in-process").
>
> > When creating a subprocess with a temporary ODB, we set the
> > GIT_QUARANTINE_ENVIRONMENT env var to tell child Git processes not
> > to update refs, since the tmp-objdir may go away.
> >
> > Introduce a similar mechanism for in-process temporary ODBs when
> > we call tmp_objdir_replace_primary_odb.
> >
> > Peff's test case was invoking ref updates via the cachetextconv
> > setting. That particular code silently does nothing when a ref
> > update is forbidden
>
> Oh good. I was worried that it would generate ugly errors, rather than
> silently skipping the update.
>
> > @@ -2126,7 +2146,7 @@ int ref_transaction_prepare(struct ref_transaction *transaction,
> >               break;
> >       }
> >
> > -     if (getenv(GIT_QUARANTINE_ENVIRONMENT)) {
> > +     if (getenv(GIT_QUARANTINE_ENVIRONMENT) || !ref_updates_are_enabled()) {
> >               strbuf_addstr(err,
> >                             _("ref updates forbidden inside quarantine environment"));
> >               return -1;
>
> I think the overall goal of this patch makes sense, but this
> conditional, along with tmp-objdir reaching out to the refs code, feels
> funny. Should we perhaps have a single:
>
>   int tmp_objdir_is_primary(struct repository *r)
>   {
>         if (the_tmp_objdir &&
>             !strcmp(the_tmp_objdir->path.buf, r->objects->odb->path))
>                 return 1; /* our temporary is the primary */
>
>         if (getenv(GIT_QUARANTINE_PATH))
>                 return 1; /* our caller put us in quarantine */
>
>         return 0;
>   }
>
> Then it's all nicely abstracted and the ref code does not have to know
> the details of GIT_QUARANTINE_PATH in the first place.
>
> If you do got that route, the strcmp() might need to be a little more
> careful about whether r->objects can be NULL (I didn't check).
> Alternatively, I kind of wonder if "struct object_directory" should just
> have a flag that says "is_temporary".

Actually, wouldn't this be the safest approach, for my particular
usecase?  By having the quarantine in place, no refs will be updated,
which removes the risk of new refs mentioning objects that will go
away.  The only issue that could arise would be from new objects
referencing objects that will go away.  But the new objects are also
written to the temporary object store when it remains the primary, and
thus those new objects will also be deleted when the temporary object
store is.

In contrast, moving the temporary object store to the end of the
alternates list would allow new objects to be written that reference
the objects that will go away.  Using pretend_object_file() will also
allow new objects to be written that reference the objects that will
go away.

Said another way, I don't think anything should be writing a critical
file that needs to be durable when we're in the middle of a
"read-only" process like git-log.  The only things written should be
temporary stuff, like the automatic remerge calculation from
merge-ort, the textconv cache optimization stuff, or perhaps future
gitattributes transformation caching.  All that stuff can safely be
blown away at the completion of each merge.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux