Re: [PATCH 00/10] [RFC] In-tree sparse-checkout definitions

Derrick Stolee <stolee@xxxxxxxxx> · Wed, 17 Jun 2020 21:42:54 -0400

On 6/17/2020 7:14 PM, Elijah Newren wrote:
> Hi,
> 
> Another late addition...
> 
> On Thu, May 7, 2020 at 6:20 AM Derrick Stolee via GitGitGadget
> <gitgitgadget@xxxxxxxxx> wrote:
> 
>> IN-TREE SPARSE-CHECKOUT DEFINITIONS
>> ===================================
>>
>> Minh's idea was simple: have sparse-checkout files in the working directory
>> and use config to point to them. As these in-tree files update, we can
>> automatically update the sparse-checkout definition accordingly. Now, the
>> only thing to do would be to ensure that the sparse-checkout files are
>> updated when someone updates the build definitions. This requires some extra
>> build validation, but would not require special tools built on every client.
> 
> "In-tree" still bugs me after a few weeks; the wording seems slightly
> awkward.  I don't have a good suggestion, but I'm curious if there's a
> better term.

I am open to suggestions. It reminds me of the two hardest problems
in software engineering:

	1. concurrency
	2. naming things
	3. off-by-one errors

> But I really came here to comment on another issue I think I glossed
> over the first time around.  I'm curious if all module definition
> files have to exist in the working directory, as possibly suggested
> above, or if we can allow them to just exist in the index.  To give
> you a flavor for what I mean, with my sparsify tool people can do
> things like:
>     ./sparsify --modules MODULE_A
> which provides MODULE_A and its dependencies while removing all other
> directories.  If MODULE_B is not a dependency (direct or transitive)
> of MODULE_A, it will not exist in the working directory after this
> step.  Our equivalent of the "in-tree" definition of MODULE_B exists
> *in* the directory for MODULE_B, because it seems to make sense for
> us.  I want people to be able to do
>     ./sparsify --modules MODULE_B
> and have it correctly check out all the necessary files even though
> the definition of MODULE_B wasn't even in the working directory at the
> time the command ran.  (The sparsify script knows to check the working
> directory first, then fall back to the index).

I think one tricky part of my RFC is that it _only_ looks at the
index. This allows us to read the contents even when the files are
not part of the current sparse-checkout definition.

You mentioned in another thread that it is a bit unwieldy for a user
to rely on a committed (or staged?) file, so adding the ability to
check the working directory first is interesting. I wonder how the
timing comes into play when HEAD moves to a new commit. It seems
tricky, but solvable.
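
To make the lookup order concrete, here is a rough sketch of "working
directory first, then fall back to the index". It is not code from the
RFC or from the sparsify tool. The function names, the injectable
read_index parameter, and the "MODULE_B/deps" path in the usage note are
all made up for illustration. The only Git behavior it relies on is that
`git cat-file blob :<path>` prints the staged copy of a file, with the
path taken relative to the top of the repository.

```python
import os
import subprocess


def read_from_index(repo_dir, path):
    """Return the staged (index) contents of `path`, or None if absent.

    `:<path>` names the index entry for that path, so this works even
    when the file is excluded from the current sparse-checkout.
    """
    result = subprocess.run(
        ["git", "-C", repo_dir, "cat-file", "blob", ":" + path],
        capture_output=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout


def read_definition(path, repo_dir=".", read_index=read_from_index):
    """Hypothetical helper: prefer the working directory, else the index.

    `read_index` is injectable only so the fallback can be exercised
    without a real repository; that is an assumption of this sketch.
    """
    worktree_path = os.path.join(repo_dir, path)
    if os.path.isfile(worktree_path):
        # The file is checked out, so uncommitted edits are honored.
        with open(worktree_path, "rb") as f:
            return f.read()
    # Not in the working directory (e.g. outside the sparse cone).
    return read_index(repo_dir, path)
```

With something like this, read_definition("MODULE_B/deps") would return
the staged contents even when MODULE_B is not checked out, and would
pick up local edits when it is. It does not address the timing question
around moving HEAD.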

Thanks,
-Stolee