On Wed, Mar 18, 2020 at 12:55 PM Junio C Hamano <gitster@xxxxxxxxx> wrote: > > Jonathan Tan <jonathantanmy@xxxxxxxxxx> writes: > > >> Hmph, what was it called earlier? My gut reaction without much > >> thinking finds --no-skip-* a bit confusing double-negation and > >> suspect "--[no-]detect-cherry-pick" (which defaults to true for > >> backward compatibility) may feel more natural, but I suspect (I do > >> not recall details of the discussion on v1) it has been already > >> discussed and people found --no-skip-* is OK (in which case I won't > >> object)? > > > > It was earlier called "--{,no-}skip-already-present" (with the opposite > > meaning, and thus, --skip-already-present is the default), so the double > > negative has always existed. "--detect-cherry-pick" might be a better > > idea...I'll wait to see if anybody else has an opinion. > > While "--[no-]detect-cherry-pick" is much better in avoiding double > negation, it is a horrible name---we do not tell the users what we > do after we detect cherry pick ("--[no-]skip-cherry-pick-detection" > does not tell us, either). I like --[no-]detect-cherry-pick. I'm on board with using "keep" instead of "skip" to avoid double negation. > Compared to them, "--[no-]skip-already-present" is much better, even > though there is double negation. This one seems especially bad from a discoverability and understandability viewpoint though. It's certainly nice if options are fully self-documenting, but sometimes that would require full paragraphs for the option name. Focusing on what is done with the option at the expense of discovering which options are relevant to your case or at the expense of enabling users to create a mental model for when options might be meaningful is something that I think is very detrimental to usability. I think users who want such an option would have a very hard time finding this based on its name, and people who want completely unrelated features would be confused enough by it that they feel compelled to read its description and attempt to parse it and guess how it's related. In contrast, --[no-]detect-cherry-pick is a bit clearer to both groups of people for whether it is useful, and the group who wants it can read up the description to get the details. > How about a name along the lines of "--[no-]keep-duplicate", then? This name is much better than --[no-]keep-already-present would be because "duplicate" is a far better indicator than "already-present" of the intended meaning. But I'm still worried the name "duplicate" isn't going to be enough of a clue to individuals about whether they will need this options or not. Perhaps --[no-]keep-cherry-pick? > >> I also wonder if --detect-cherry-pick=(yes|no|auto) may give a > >> better end-user experience, with "auto" meaning "do run patch-ID > >> based filtering, but if we know it will be expensive (e.g. the > >> repository is sparsely cloned), please skip it". That way, there > >> may appear other reasons that makes patch-ID computation expensive > >> now or in the fiture, and the users are automatically covered. > > > > It might be better to have predictability, and for "auto", I don't know > > if we can have a simple and explainable set of rules as to when to use > > patch-ID-based filtering - for example, in a partial clone with no > > blobs, I would normally want no patch-ID-based filtering, but in a > > partial clone with only a blob size limit, I probably will still want > > patch-ID-based filtering. > > Perhaps. You could have something more specific than "auto". The > main point was instead of "--[no-]$knob", "--$knob=(yes|no|...)" is > much easier to extend. I simply do not know if we will see need to > extend the vocabulary in the near future (to which you guys who are > more interested in sparse clones would have much better insight than > I do). I also struggle to understand when auto would be used. But beyond that, I'm still a little uneasy with where we seem to be ending up (even if no fault of this patch): 1) Behavior has long been --keep-cherry-pick, and in various cases that behavior can reduce conflicts users have to deal with. 2) Both Junio and I independently guessed that the cherry-pick detection logic is poorly performing and could be improved; Jonathan confirmed with some investigative work. We've all suggested punting for now, though. 3) I think we can make the sequencer machinery fast enough that the cherry-pick detection is going to be the slowest part by a good margin even in normal cases, not just sparse clones or the cases Taylor or I had in mind. So I think it's going to stick out like a sore thumb for a lot more people (though maybe they're all happy because it's faster overall?). 4) Jonathan provided some good examples of cases where the --keep-cherry-pick behavior isn't just slow, but leads to actually wrong answers (a revert followed by an un-revert). I particularly don't like the idea of something being the default when it can both cause wrong behavior and present a huge performance problem that folks have to learn to workaround, especially when based only on the tradeoff of sometimes reducing the amount of work we push back on the user. Maybe that's just inevitable, but does anyone have any words that will make me feel better about this? Elijah