On Fri, Mar 15, 2024 at 08:16:53AM +0100, Kristoffer Haugsbakk wrote:

> > Mostly I was worried that people would take "char" in the name to assume
> > it could only be a single byte (I had originally even started the new
> > sentence with "Despite the word 'char' in the name, this option
> > can..."). And that is not just history, but a name we are stuck with
> > forever[1].
>
> Missing footnote or referring to my footnote?
>
> My suggestion was to use a `core.commentString` alias. Which might
> matter for new answers to questions about its use. It might not matter
> if in practice most people get their config tips from 1500 point
> StackOverflow question about how git-commit(1) keeps swallowing their
> GitHub issue numbers (due to automatic linewrap) from 2011.

Heh, missing footnote. I was going to say "we could introduce
core.commentStr or similar", but after your comment I searched in the
archive and see that you did indeed already suggest it.

I'm not sure if it would make things more or less confusing to have two
related values. One nice side effect is that the new variable would be
ignored by older versions of Git (whereas by extending core.commentChar,
you end up with config that causes older versions to barf). That
probably doesn't matter that much for most users, but as somebody who
works on Git I frequently run old versions for bug testing, bisection,
and so forth.

> > I actually do think the "string" nature is mostly uninteresting, and I'd
> > be OK leaving it as an easter egg.
>
> To my mind a string subsumes a char (multi- or not). Like in programming
> languages: some might be used to single-char `#`, but I don’t think they
> do a double take when they see languages with `//` or `--`.

Hmm, good point. I was mostly focused on UTF-8 characters, but "//" is
quite a reasonable thing for people to try. It is probably a better
example than "foo".

> > What your suggestion doesn't say is that multi-byte characters are
> > OK. But if we think people will just assume that in a modern UTF-8
> > world, then maybe we don't need to say anything at all?
>
> Given that we’re mostly in the context of a commit message, an
> ASCII-only restriction would feel archaic.
>
> I guess it depends on what the *normal* is in the documentation at
> large. As a user I’m used to Git handling the text that I give it.

Right, that's what I was asking. To me "character" means an ASCII byte,
but I think I might be archaic myself. ;) If most of our readers would
just assume that multi-byte characters work, perhaps it is confusing
things to even mention it.

> > It actually does not have to be UTF-8.
>
> Good point. Unicode is more appropriate.

I think other Unicode encodings are likely to have problems (because
they embed NULs). Specifically I was thinking that you could probably
get away with latin1 or other 8-bit encodings. But again, I really hope
nobody is doing that anymore.

So anyway, adapting your original suggestion based on discussion in the
thread, maybe squash in (to the final patch):

diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
index c86b8c8408..c5a8033df9 100644
--- a/Documentation/config/core.txt
+++ b/Documentation/config/core.txt
@@ -523,9 +523,8 @@ core.commentChar::
 	Commands such as `commit` and `tag` that let you edit
 	messages consider a line that begins with this character
 	commented, and removes them after the editor returns
-	(default '#'). Note that this option can take values larger than
-	a byte (whether a single multi-byte character, or you
-	could even go wild with a multi-character sequence).
+	(default '#'). Note that this variable can be a string like
+	`//` or `⁑⁕⁑`; it doesn't have to be a single ASCII character.
 +
 If set to "auto", `git-commit` would select a character that is not
 the beginning character of any line in existing commit messages.

That's assuming we don't want to go the commentString route, which would
require a bit more re-working of the patch. I'm also open to a more
clever or pretty multi-byte example if we have one. ;)

-Peff
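
For anyone skimming the thread, the behaviour under discussion can be
sketched roughly in Python. This is only an illustration and not Git's
actual code: `strip_comment_lines` is a hypothetical helper, and it
ignores everything else Git's message cleanup does (whitespace
trimming, blank-line collapsing, and so on). It simply drops any line
that begins with the configured comment string, whether that is `#`,
`//`, or a multi-byte sequence like `⁑⁕⁑`.

```python
def strip_comment_lines(message: str, comment_prefix: str = "#") -> str:
    """Hypothetical helper: drop every line that begins with comment_prefix.

    The prefix is treated as an arbitrary string, so it may be a single
    ASCII character, a multi-byte character, or a multi-character sequence.
    """
    kept = [
        line
        for line in message.splitlines()
        if not line.startswith(comment_prefix)
    ]
    return "\n".join(kept)


# A wrapped line that happens to start with an issue number is swallowed
# when the comment string is "#", but survives when it is "//".
wrapped = "Fix the wrapping bug reported in\n#123 by the bug tracker"
with_hash = strip_comment_lines(wrapped + "\n# editor hint", "#")
with_slashes = strip_comment_lines(wrapped + "\n// editor hint", "//")
```

With `#` only the first line of the message is kept, which is the
"swallowed issue number" problem mentioned above; with `//` both lines
of the message are kept and only the editor hint is removed.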