On Tue, Jun 8, 2021 at 11:32 PM Junio C Hamano <gitster@xxxxxxxxx> wrote: > > Elijah Newren <newren@xxxxxxxxx> writes: > > > On Mon, Jun 7, 2021 at 5:34 AM Derrick Stolee via GitGitGadget > > <gitgitgadget@xxxxxxxxx> wrote: > >> > >> From: Derrick Stolee <dstolee@xxxxxxxxxxxxx> > >> > >> While comparing an index to a tree, we may see a sparse directory entry. > >> In this case, we should compare that portion of the tree to the tree > >> represented by that entry. This could include a new tree which needs to > >> be expanded to a full list of added files. It could also include an > >> existing tree, in which case all of the changes inside are important to > >> describe, including the modifications, additions, and deletions. Note > >> that the case where the tree has a path and the index does not remains > >> identical to before: the lack of a cache entry is the same with a sparse > >> index. > >> > >> In the case where a tree is modified, we need to expand the tree > >> recursively, and start comparing each contained entry as either an > >> addition, deletion, or modification. This causes an interesting > >> recursion that did not exist before. > > > > So, I haven't read through this in detail yet...but there's a big > > question I'm curious about: > > > > Git already has code for comparing an index to a tree, a tree to a > > tree, or a tree to the working directory, right? So, when comparing a > > sparse-index to a tree...can't we re-use the compare a tree to a tree > > code when we hit a sparse directory? > > Offhand I do not think of a reason why that cannot work. > > The tree-diff machinery takes two trees, walks them in parallel and > repeatedly calls either diff_addremove() or diff_change(), which > appends diff_filepair() to the diff_queue[] structure. If you see > an unexpanded tree on the index side, you should be able to pass > that tree with the subtree you are comparing against to the tree-diff > machinery to come up with a series of filepairs, and then tweak the > pathnames of these filepairs (as such a two-tree comparison would be > comparing two trees representing a single subdirectory of two different > vintages) before adding them to the diff_queue[] you are collecting > the index-vs-tree diff, for example. Good to know it seems my idea might be reasonable. > But if a part of the index is represented as a tree because it is > outside the cone of interest, should we even be showing the > difference in that part of the tree? If t/ directory is outside the > cone of interest, should "git diff HEAD~100 HEAD t/" show anything > to begin with (the same question for "git diff --cached HEAD t/")? Excellent question...and not just for diff, but log, grep with revisions, and other commands. We discussed this a while back[1] and we seemed to lean towards eventually adding a flag because there are usecases both for (1) viewing full history while having sparsity paths restrict just the working copy, and (2) also restricting the view of history to the sparsity paths. [1] It's been discussed a few times, but there's a relatively comprehensive discussion at the "Commands that would change for behavior A" section from https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@xxxxxxxxxxxxxx/