On 12/11/2019 4:37 PM, Junio C Hamano wrote: > Derrick Stolee <stolee@xxxxxxxxx> writes: > >> The specific example I have in my mind is that the filesystem can have >> "README.TXT" as what it would report by readdir(), but a user can type >> >> git add readme.txt >> >> Without core.ignoreCase, the index will store the path as "readme.txt", >> possibly adding a new entry in the index next to "README.TXT". If >> core.ignoreCase is enabled, then the case reported by readdir() is used. >> (This works both for a tracked and untracked path.) > > Yes, but which one is "correct"? readme.txt read from the index or > README.TXT read from the filesystem? Presumably, when the paths got > first checked out of the index, it would have been in "readme.txt" > on the filesystem, so the user must have done on purpose something > to cause the file to be named in all uppercase since then. Ah. The issue here is that with core.ignoreCase=true, the index and filesystem agree. It's just the user input that differs. >> This is a case that is difficult to control for. We have no source >> of truth (filesystem or index) at the time of the 'git sparse-checkout >> set' command. I cannot think of a solution to this specific issue >> without doing the more-costly approach on every unpack_trees() call. >> >> I believe this case may be narrow enough to "document and don't-fix", >> but please speak up if anyone thinks this _must_ be fixed. > > I do not think it "must be" fixed in the sense that "if it hurts, > don't do so" is a sufficient remedy. But then for exactly the same > reason, I do not think it is worth doing what you do in this patch. > > On the other hand, I think runtime case folding, just like what we > do when "git add" is told to handle a path in different case, would > be the only "right" way to match end-user expectations on a case > insensitive system, even though that is a "nice to do so" and > certainly not a "must do so" thing. > >> The "git add" behavior made me think there was sufficient precedent >> in Git to try this change. > > Since I view that the "git add" behaviour is a "nice to do so" > runtime conversion, I would actually think the approach you > discarded would be more in line with the precedent. > >> I'm happy to follow the community's opinion in this matter, including >> a hard "we will not correct for case in this feature" or "if you want >> this then you'll pay for it in performance." In the latter case I'd >> prefer a new config option to toggle the sparse-checkout case >> insensitivity. That way users could have core.ignoreCase behavior without >> the perf hit if they use clean input to the sparse-checkout feature. > > I value correctness more---a system that does incorrect thing > quickly is not all that interesting. > > Assuming that your users are sensible and feed good data, how much > "performance hit" are we talking about? I'll need to build a prototype to test the performance hit. Maybe I'm overestimating the effect of using a case-insensitive hash. > Wouldn't this be something > we can make into a "if you have the paths in the correct case, we'll > see the match just as quickly as in the case sensitive system, so > most of the time there is no penalty, but for correctness we would > also fall back to try different cases"? I think we would want the config option present to say "my data may not be in the proper case, please do extra work for me". I can't think of a way to do the fallback in real-time. I'll try again with the case-insensitive hash algorithm and see how that shakes out. Thanks, -Stolee