On Tue, May 18, 2021 at 7:57 AM Derrick Stolee <stolee@xxxxxxxxx> wrote: > > On 5/17/2021 9:33 PM, Elijah Newren wrote: > > On Fri, May 14, 2021 at 11:31 AM Derrick Stolee via GitGitGadget > > <gitgitgadget@xxxxxxxxx> wrote: > >> > >> From: Derrick Stolee <dstolee@xxxxxxxxxxxxx> > >> > >> When creating a full index from a sparse one, we create cache entries > >> for every blob within a given sparse directory entry. These are > >> correctly marked with the CE_SKIP_WORKTREE flag, but they must also be > >> marked with the CE_EXTENDED flag to ensure that the skip-worktree bit is > >> correctly written to disk in the case that the index is not converted > >> back down to a sparse-index. > > > > This seems odd to me. When sparse-index is not involved and we are > > just doing simple sparse checkouts, do we mark CE_SKIP_WORKTREE > > entries with CE_EXTENDED? I can't find any code that does so. > > > > Is it possible that the setting of CE_EXTENDED is just a workaround > > that happens to force the index to be written in cases where the logic > > is otherwise thinking it can get away without one? Or is there > > something I'm missing about why the CE_EXTENDED flag is actually > > needed here? > > This is happening within the context of ensure_full_index(), so we > are creating new cache entries and want to mimic what they would > look like on-disk. Something within do_write_index() discovers that > since CE_SKIP_WORKTREE is set, then also CE_EXTENDED should be set > in order to ensure that the on-disk representation has enough room > for the CE_SKIP_WORKTREE bit. Yeah, I think it's this part: /* reduce extended entries if possible */ cache[i]->ce_flags &= ~CE_EXTENDED; if (cache[i]->ce_flags & CE_EXTENDED_FLAGS) { extended++; cache[i]->ce_flags |= CE_EXTENDED; } > > I suppose this might not have a meaningful purpose other than when > I compare a full index against an expanded sparse-index and check > if their flags match. Ah, you're just setting this flag in advance of do_write_index() being called so that you can compare in memory values and check they match without doing a write-to-disk-and-read-back cycle. Makes sense, but it'd be nice to see this in the commit message.