On 6/9/2021 4:11 AM, Elijah Newren wrote: > On Tue, Jun 8, 2021 at 11:32 PM Junio C Hamano <gitster@xxxxxxxxx> wrote: >> >> Elijah Newren <newren@xxxxxxxxx> writes: >> >>> On Mon, Jun 7, 2021 at 5:34 AM Derrick Stolee via GitGitGadget >>> <gitgitgadget@xxxxxxxxx> wrote: >>>> >>>> From: Derrick Stolee <dstolee@xxxxxxxxxxxxx> >>>> >>>> While comparing an index to a tree, we may see a sparse directory entry. >>>> In this case, we should compare that portion of the tree to the tree >>>> represented by that entry. This could include a new tree which needs to >>>> be expanded to a full list of added files. It could also include an >>>> existing tree, in which case all of the changes inside are important to >>>> describe, including the modifications, additions, and deletions. Note >>>> that the case where the tree has a path and the index does not remains >>>> identical to before: the lack of a cache entry is the same with a sparse >>>> index. >>>> >>>> In the case where a tree is modified, we need to expand the tree >>>> recursively, and start comparing each contained entry as either an >>>> addition, deletion, or modification. This causes an interesting >>>> recursion that did not exist before. >>> >>> So, I haven't read through this in detail yet...but there's a big >>> question I'm curious about: >>> >>> Git already has code for comparing an index to a tree, a tree to a >>> tree, or a tree to the working directory, right? So, when comparing a >>> sparse-index to a tree...can't we re-use the compare a tree to a tree >>> code when we hit a sparse directory? >> >> Offhand I do not think of a reason why that cannot work. >> >> The tree-diff machinery takes two trees, walks them in parallel and >> repeatedly calls either diff_addremove() or diff_change(), which >> appends diff_filepair() to the diff_queue[] structure. If you see >> an unexpanded tree on the index side, you should be able to pass >> that tree with the subtree you are comparing against to the tree-diff >> machinery to come up with a series of filepairs, and then tweak the >> pathnames of these filepairs (as such a two-tree comparison would be >> comparing two trees representing a single subdirectory of two different >> vintages) before adding them to the diff_queue[] you are collecting >> the index-vs-tree diff, for example. > > Good to know it seems my idea might be reasonable. I agree that this is reasonable. I just didn't look hard enough to find existing code for this, since I found traverse_trees and thought that _was_ the library for this. >> But if a part of the index is represented as a tree because it is >> outside the cone of interest, should we even be showing the >> difference in that part of the tree? If t/ directory is outside the >> cone of interest, should "git diff HEAD~100 HEAD t/" show anything >> to begin with (the same question for "git diff --cached HEAD t/")? > > Excellent question...and not just for diff, but log, grep with > revisions, and other commands. We discussed this a while back[1] and > we seemed to lean towards eventually adding a flag because there are > usecases both for (1) viewing full history while having sparsity paths > restrict just the working copy, and (2) also restricting the view of > history to the sparsity paths. > > [1] It's been discussed a few times, but there's a relatively > comprehensive discussion at the "Commands that would change for > behavior A" section from > https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@xxxxxxxxxxxxxx/ Yes, we could investigate this behavior change in the future. The good thing is that these points that handle sparse directory entries create clear branching points for that future behavior change. Thanks, -Stolee