Re: [PATCH 1/1] unpack-trees: exit check_updates() early if updates are not wanted

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Junio,

On Sat, Jan 4, 2020 at 3:48 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
>
> > +     if (!o->update || o->dry_run) {
> > +             remove_marked_cache_entries(index, 0);
> > +             trace_performance_leave("check_updates");
> > +             return 0;
> > +     }
>
> OK, so the idea is that bunch of codepaths we see later are
> protected by "o->update && !o->dry_run" in the original and are not
> executed, so we can leave early here.
>
> So the primary task of the reviewers of this patch is to see if the
> remainder of the function has some code that were *not* no-op when
> (!o->update || o->dry_run) is true in the current code, which would
> indicate possible behaviour change, and assess if the change in
> behaviour matters in the real world if there is.
>
> >       if (o->clone)
> >               setup_collided_checkout_detection(&state, index);
>
> If there were such a thing as dry-run of a clone, we will stop
> calling the report based on the thing we are setting up here?
> Presumably that does not happen in the current code---is that
> something we guarantee in the future evolution of the code, though?

setup_collided_checkout_detection() clears the CE_MATCHED flag for
each cache entry, which can be updated via way of checkout_entry()'s
call to mark_colliding_entries().  Then the trailing
report_collided_checkout() at the end of the function will report any
paths with CE_MATCHED set.  However, when we're in a dry-run, all
checkout_entry() calls are skipped, so we won't detect ever detect any
collided entries.  As such, there's no point in calling either the
setup_collided_checkout_detection() or report_collided_checkout()
funtions.

Sorry for forgetting to include this in the commit message.

> >       progress = get_progress(o);
>
> get_progress() would yield NULL when !o->update, but o->dry_run does
> not influence it, so we would have called the progress API in
> today's code, if o->dry_run and o->update are both true.
>
> Presumably, o->update and o->dry_run are not true at the same time
> in the current code---but even if in the future we start supporting
> the combination, dry-run should be skipping the filesystem update
> (which is the slow part) and lack of progress may not matter, I
> guess?

The point of the progress display was to show how long it takes to do
the updates.  For a dry-run, that time is zero because we skip the
updates.  And if we're skipping the updates, shouldn't we also skip
measuring how long it takes to do them?

Yeah, I should have included this in the commit message too.

> It seems to me that unpack_trees_options.dry_run can become true
> only in "git read-tree" when the -n (or --dry-run) option is given
> and no other codepath touches it.  Am I reading the code correctly?

I hadn't checked previously, but that looks right to me.

> Similarly, unpack_trees_options.clone is turned on only in the
> builtin/clone.c::checkout().  It _might_ occur to future developers
> that "git clone --no-checkout" is better implemented by actually
> going through builtin/clone.c::checkout() with .dry_run turned on,
> instead of leaving that function early.  That would for example
> allow collided_checkout() detection to still be done---in such a
> future, is the optimization this patch makes still safe, I wonder?

If it is intended that a report of colliding paths be given when doing
a dry-run using unpack_trees, then the current code is quite buggy
because it'll never report anything.  In order to report anything, the
mechanism for detecting them would need to be drastically reworked.
As such, I'd view the optimization as at least making it more obvious
why the code won't report collisions with dry-run rather than
pretending it will.  Or are you saying that I've just uncovered a bug
and it should really be fixed rather than exploited for an
optimization?

> > -     if (o->update)
> > -             git_attr_set_direction(GIT_ATTR_CHECKOUT);
> > +     git_attr_set_direction(GIT_ATTR_CHECKOUT);
> >
> > -     if (should_update_submodules() && o->update && !o->dry_run)
> > +     if (should_update_submodules())
> >               load_gitmodules_file(index, NULL);
>
> Good (no behaviour change because this wouldn't have been done under
> the early-exit condition anyway).
> >
> >       for (i = 0; i < index->cache_nr; i++) {
> > @@ -388,18 +393,18 @@ static int check_updates(struct unpack_trees_options *o)
> >
> >               if (ce->ce_flags & CE_WT_REMOVE) {
> >                       display_progress(progress, ++cnt);
> > -                     if (o->update && !o->dry_run)
> > -                             unlink_entry(ce);
> > +                     unlink_entry(ce);
>
> Good (no behaviour change because this wouldn't have been done under
> the early-exit condition anyway).
>
> >               }
> >       }
> > +
> >       remove_marked_cache_entries(index, 0);
> >       remove_scheduled_dirs();
> >
> > -     if (should_update_submodules() && o->update && !o->dry_run)
> > +     if (should_update_submodules())
> >               load_gitmodules_file(index, &state);
>
> Good (no behaviour change because this wouldn't have been done under
> the early-exit condition anyway).
> >
> >       enable_delayed_checkout(&state);
> > -     if (has_promisor_remote() && o->update && !o->dry_run) {
> > +     if (has_promisor_remote()) {
>
> Good (no behaviour change because this wouldn't have been done under
> the early-exit condition anyway).
>
> >               /*
> >                * Prefetch the objects that are to be checked out in the loop
> >                * below.
> > @@ -431,15 +436,12 @@ static int check_updates(struct unpack_trees_options *o)
> >                                   ce->name);
> >                       display_progress(progress, ++cnt);
> >                       ce->ce_flags &= ~CE_UPDATE;
> > -                     if (o->update && !o->dry_run) {
> > -                             errs |= checkout_entry(ce, &state, NULL, NULL);
> > -                     }
> > +                     errs |= checkout_entry(ce, &state, NULL, NULL);
>
> Good (no behaviour change because this wouldn't have been done under
> the early-exit condition anyway).
>
> >               }
> >       }
> >       stop_progress(&progress);
> >       errs |= finish_delayed_checkout(&state, NULL);
> > -     if (o->update)
> > -             git_attr_set_direction(GIT_ATTR_CHECKIN);
> > +     git_attr_set_direction(GIT_ATTR_CHECKIN);
> >
> >       if (o->clone)
> >               report_collided_checkout(index);
>
> Behaviour around this one (and the corresponding setup) may make a
> difference before and after the patch to future developers (who may
> need to revert this change to achieve what they want to do), but I
> think it is a no-op clean-up for today's code.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux