Re: [PATCH V2] git-apply: Allow simultaneous --cached and --3way options

Jerry Zhang <jerry@xxxxxxxxxx> · Tue, 6 Apr 2021 14:56:42 -0700

On Mon, Apr 5, 2021 at 10:52 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> Jerry Zhang <jerry@xxxxxxxxxx> writes:
>
> > Thanks for the comments! I've updated v3 with the changes. Let me know
> > if you have any more thoughts on whether to block / warn the user
> > before clobbering their cache.
>
> Please do not top-post on this list.
>
> I've already said that I think we should ensure the index is clean
> by default, because, unlike the case where the application is done
> on the working tree files, the use of "--cached" is a sign that the
> next step is likely to write a tree out.  As I've already said so in
> earlier reviews, there is nothing more from me to add on that issue.
Understood, but please bear with me while I explain the risks a bit more.
I'm having some difficulty coming up with a name and explanation for the
flags for this case, because I don't completely understand the safety issue
we are trying to mitigate.

Let me enumerate the behaviors in 3 different cases where the user
has "file.txt" changes staged in the index, so the index differs from HEAD.

"git apply --cached" would either 1. combine the patch and the cached
version and put the result in the cache or 2. do nothing (the patch failed).
In case 2 nothing happened, so the user's changes are safe. In case 1 the
user's changes may be gone, but since the user was forewarned, this is
presumably what they wanted.

"git apply --3way" would either 1. apply cleanly to the working dir or
2. conflict, in which case the user's changes would be moved to stage #2
in the cache. In case 1 the user's changes are in the cache, so they can
check that out to restore the original state, since this invocation
requires the cache and working dir to match. In case 2 the user's changes
are moved to stage #2 in the cache. Although the changes are preserved,
there doesn't seem to be any atomic way to move a cache entry from stage #2
to stage #0. Something like "git restore --staged --ours file.txt" seems
like it should work, but "git restore" doesn't allow combining those flags.
The non-atomic way to do it is "git checkout --ours file.txt &&
git add file.txt", which is ok in this case since we've required the index
and working tree to match.
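
To make case 2 concrete, here is a minimal sketch of the non-atomic
recovery in a throwaway repository. The repository layout, the file
contents and the variable names (repo, patch_file) are made up for
illustration; only "git apply --3way", "git checkout --ours" and "git add"
come from the discussion above.

```shell
set -e

# Throwaway repository; every name and content here is illustrative only.
repo=$(mktemp -d)
patch_file=$(mktemp)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

# Commit a base version of file.txt.
printf 'base\n' > file.txt
git add file.txt
git commit -q -m "base"

# Record a patch that rewrites the base line, then undo it in the worktree.
printf 'theirs\n' > file.txt
git diff --full-index > "$patch_file"
git checkout -- file.txt

# Stage a different change, so the index differs from HEAD but still
# matches the working tree, as "git apply --3way" requires.
printf 'ours\n' > file.txt
git add file.txt

# The patch cannot apply cleanly, so this is expected to leave file.txt
# unmerged (stages #1, #2, #3) and to exit with non-zero status.
git apply --3way "$patch_file" || true
git ls-files --stage -- file.txt

# Non-atomic recovery: write stage #2 to the worktree, then re-add it.
git checkout --ours -- file.txt
git add file.txt
git ls-files --stage -- file.txt
```

After the last two commands the staged content should again be the version
the user had in the index before running "git apply --3way".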

"git apply --3way --cached" would either 1. apply cleanly to the cache or
2. conflict, and the user's changes are moved to stage #2. In case 1, the
user's changes are lost because they're combined with the patch, but this is
the same as the "--cached" case by itself. In case 2, the user's changes are
preserved in stage #2, similar to "--3way" by itself. What's somewhat
tricky here is restoring them to stage #0, since we can't use the working
tree, but I think that is more of a limitation in "git restore", since
moving a cache entry from stage #2 to stage #0 is a conceptually simple
operation.
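
As a rough sketch of why I call this conceptually simple: the move can be
done with plumbing, without touching the working tree. This relies on my
understanding that inserting a stage #0 entry with "git update-index
--cacheinfo" replaces the unmerged entries for the same path. The
repository, the contents and the variable names (repo, patch_file, entry)
are made up for illustration, and the conflict is produced with plain
"--3way" only to get an unmerged entry to work with.

```shell
set -e

# Throwaway repository; every name and content here is illustrative only.
repo=$(mktemp -d)
patch_file=$(mktemp)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

printf 'base\n' > file.txt
git add file.txt
git commit -q -m "base"

# Record a patch against the base version, then undo it in the worktree.
printf 'theirs\n' > file.txt
git diff --full-index > "$patch_file"
git checkout -- file.txt

# Stage the user's own change so the patch will conflict with it.
printf 'ours\n' > file.txt
git add file.txt

# Expected to conflict and leave file.txt at stages #1, #2 and #3.
git apply --3way "$patch_file" || true

# Pick up "<mode>,<object>" of the stage #2 entry from the index.
entry=$(git ls-files --stage -- file.txt | awk '$3 == 2 { print $1 "," $2 }')

# Re-insert that blob at stage #0; the working tree is not consulted.
git update-index --cacheinfo "$entry,file.txt"
git ls-files --stage -- file.txt
```

Note that the working tree file still holds the conflict markers written by
"git apply --3way" here; only the index entry is restored.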

In summary, it seems to me that, merge or no merge, the safety semantics
for "--3way" + "--cached" as it is are pretty similar to the existing
semantics for those options individually. The user could be preparing to
write a tree out in either the "--cached" or the "--cached --3way" operation,
so I don't understand why those must differ in safety. In addition,
both "--3way" and "--3way --cached" perform mergey operations that
change the stages of a file in the cache, so I don't understand why those
must differ in safety either.

>
> >> Give an order to the codebase to "be like so".  Here is my attempt.
> >>
> >>     Teach "git apply" to accept "--cached" and "--3way" at the same
> >>     time.  Only when all changes to all paths involved in the
> >>     application auto-resolve cleanly, the result is placed in the
> >>     index at stage #0 and the command exits with 0 status.  If there
> >>     is any path whose conflict cannot be cleanly auto-resolved, the
> >>     original contents from common ancestor (stage #1), our version
> >>     (stage #2) and the contents from the patch (stage #3) for the
> >>     conflicted paths are left at separate stages without any attempt
> >>     to resolve the conflict at the content level, and the command
> >>     exits with non-zero status, because there is no place (like the
> >>     working tree files) to leave a half-resolved conflicted merge
> >>     result to ask the end-user to resolve.
>
> I wrote the above as an example to illustrate the tone and the level
> of details expected in our proposed commit log message.  The
> behaviour it describes may not necessarily match what you have
> implemented in the patch.
>
> For example, imagine that we are applying a patch for two paths,
> where one auto-resolves cleanly and the other does not.  The above
> description expects both paths will leave the higher stages (instead
> of recording the auto-resolved path at stage #0, and leaving the
> other path that cannot be auto-resolved at higher stages) and the
> command exits with non-zero status, which may not be what you
> implemented.  As an illustration, I didn't necessarily mean such an
> all-or-none behaviour wrt resolving should be what we implement---I
> do not want to choose, as this is your itch and I want _you_ with
> the itch to think long and hard before deciding what the best design
> for end-users would be, and present it as a proposed solution.  An
> obvious alternative is to record auto-resolved paths at stage #0 and
> leave only the paths for which auto-resolution failed in conflicted
> state.
I missed the "all changes to all paths" requirement in that description.
I'll update it to be more consistent with what the patch actually does. As
you say, entries are left at higher stages only for conflicting paths, not
for all paths.
>
> Thanks.