On 2/1/2021 3:22 PM, Elijah Newren wrote: > On Mon, Jan 25, 2021 at 9:42 AM Derrick Stolee via GitGitGadget > <gitgitgadget@xxxxxxxxx> wrote: >> >> From: Derrick Stolee <dstolee@xxxxxxxxxxxxx> >> >> This giant patch is not intended for actual review. I have a branch that >> has these changes split out in a sane way with some commentary in each >> file that is modified. >> >> The idea here is to guard certain portions of the codebase that do not >> know how to handle sparse indexes by ensuring that the index is expanded >> to a full index before proceeding with the logic. >> >> This also provides a good mechanism for testing which code needs >> updating to enable the sparse index in a Git builtin. The builtin can >> set the_repository->settings.command_requires_full_index to zero and >> then we can debug the command with a breakpoint on ensure_full_index(). >> That identifies the portion of code that needs adjusting before enabling >> sparse indexes for that command. >> >> Some index operations must be changed to operate on a non-const pointer, >> since ensuring a full index will modify the index itself. >> >> There are likely some gaps to these protections, which is why it will be >> important to carefully test each scenario as we relax the requirements. >> I expect that to be a long effort. > > I think the idea makes sense; it provides a way for us to > incrementally build support for this new feature. > > I skimmed over the code and noticed various interesting places that > had the ensure_full_index() call (e.g. > read_skip_worktree_file_from_index() -- whose existence comes from > sparsity; what irony...). Better breakouts would be great, so I'll > defer commenting much until then. But, just to verify I'm > understanding: the primary defence is the command_requires_full_index > setting, and you have added several ensure_full_index() calls > throughout the code in places you believe would need to be fixed up in > case someone switches the command_requires_full_index setting. Is > that correct? And your comment on the gaps is just that there may be > other places that are missing the secondary protection (as opposed to > my first reading of that paragraph as suggesting we aren't sure if we > have enough protections yet and need to add more before this moves out > of RFC); is that right? Yes, the idea is that we can incrementally enable command_requires_full_index for some builtins and be confident that corner cases will be protected by ensure_full_index(). Further, we can test whether ensure_full_index() was called using test_region in test scripts to demonstrate that a command is truly "sparse aware" or if it is converting to full and back to sparse. There is also the case that when we write the index into a sparse format, the in-memory structure is modified. If the index is re-used afterwards, then we must expand to full again for these code paths. unpack_trees() already has one of these calls because it was necessary for the sparse-index write to work. The ensure_full_index() pattern also works when updating a builtin to work with the sparse-index because of the breakpoint trick. When I submit this as a full series, this patch will be one full patch series submission with careful comments about why each of these is added on a file-by-file basis. Thanks, -Stolee