On Thu, Mar 31 2022, Phillip Wood wrote: [tl;dr: Reply below, but this whole thing should be addressed by the v4 I sent last night: https://lore.kernel.org/git/cover-v4-0.6-00000000000-20220331T014349Z-avarab@xxxxxxxxx/ I.e. the controversial patch has been ejected]. > On 28/03/2022 13:16, Ævar Arnfjörð Bjarmason wrote: >> On Mon, Mar 28 2022, Johannes Sixt wrote: >> >>> Am 27.03.22 um 13:15 schrieb Ævar Arnfjörð Bjarmason: >>>> >>>> On Sun, Mar 27 2022, Johannes Sixt wrote: >>>> >>>>> Am 26.03.22 um 18:14 schrieb Ævar Arnfjörð Bjarmason: >>>>>> Partially revert d323c6b6410 (i18n: git-sh-setup.sh: mark strings for >>>>>> translation, 2016-06-17). >>>>>> >>>>>> These strings are no longer used in-tree, and we shouldn't be wasting >>>>>> translator time on them for the benefit of a hypothetical out-of-tree >>>>>> user of git-sh-setup.sh. > > The out of tree users of git-sh-setup.sh are not hypothetical, they > exist and objected when you recently tried to remove these functions > entirely[1]. I see that what I wrote there is ambiguous, but I'm aware of that & remember that thread. I meant to say the hypothetical user that cares about the i18n these functions exposed. >>>>> There is public documentation for these functions. Hence, you must >>>>> assume that they are used in scripts outside of Git. Castrating their >>>>> functionality like this patch does is unacceptable. >>>> >>>> For require_clean_work_tree() the public documentation for this function >>>> states that it will emit a specific error message in English, and you're >>>> expected to Lego-interpolate a string that we'll concatenate with it: > > The documentation does not say whether the message is translated or > not, probably because it was not updated when the translations were > added six years ago. It does say it. It uses the word "Cannot" at the beginning, and promises to emit that specific string. Yes we didn't update it at the time for i18n, and probably should. But to the extent that the gordian knot in making any changes to these whatsoever is because they've been publicly documented I don't think anyone using these has been promised different behavior. So it's highly relevant here. >>>> [...]It emits an error message of the form `Cannot >>>> <action>: <reason>. <hint>`, and dies. Example: > > This is not a promising a "specific error message in English" It really is. You cannot use this API to produce sensible output in any other language. It was used like this: require_clean_work_tree "pull with rebase" "Please commit or stash them." For which we'd emit: Cannot pull with rebase: You have unstaged changes. Please commit or stash them. You can see e.g. in the Bulgarian translation that this was dealt with by putting the interpolated string in double-quotes. >>>> + >>>> ---------------- >>>> require_clean_work_tree rebase "Please commit or stash them." > > This is an example message you cannot use that to argue that we will > always show a message in English I'm saying that the documentation says it emits English, that it didn't always do that, and now does so again. And that to get it to emit anything sensible in cases where we're not under LC_ALL=C would have required 1=1 matching the behavior of whatever shellscript is using this to what git-sh-i18n in picking the locale. I don't think it's plausible that there's an out-of-tree user maintaining their own set of i18n'd po/ files which expect to interact with our translations in this way. Any out-of-tree user of this (if they're using this at all) will either not care, or they'll see more sensible output again. >>>> So I think that marking it for translation like this as d323c6b6410 was >>>> always broken in that it broke that documented promise. >>> >>> I can buy this argument. But then this must be a separate commit with >>> this justification. >> Sure, I can elaborate on that point & split it up. >> >>>> But that's just an argument for keeping the require_clean_work_tree() >>>> part of this patch, not require_work_tree_exists(). >>>> >>>> For that one and others in git-sh-setup we've never said that we'd >>>> provide these translated (and to the extent we've implied anything about >>>> the rest, have strongly implied the opposite with >>>> require_clean_work_tree()'s docs). >>>> >>>> Nothing will break for out-of-tree users as a result of this >>>> change. > > The strings the user sees will change Yes, and I'll admit that "nothing will break here" on my part isn't the same as saying "there will be no observable change whatsoever". Sorry about being unclear there. As a general matter we don't promise that such strings won't change, even for die(), error() etc. messages emitted by plumbing commands. Except in some rare cases where they've been known to be used out of tree extensively, e.g. the human-readable "merge" messages where we have/had no other API to expose the same information. Or, in the case of plumbing output where such strings are part of the API contract. But for these commands in the "Internal helper commands" category I think this fall squarely in the category of changing a random error(), die() etc. in the C code (which we do quite freely). >>>> When we added these functions and their documentation their >>>> output wouldn't be translated, > > Where does the documentation say "the output will not be translated"? I think this was covered above, it's sufficient that it didn't promise that it would be, and in the one case where we discuss it in passing with an example we imply that it won't be. >>>> then sometimes it was, now it's not >>>> again. >>> >>> This does not sound convincing at all, but rather like "I want the code >>> to be so, and here is some handwaving to justify it". What is wrong with >>> the status quo? >> The larger context for why I was looking at this again is that I'm >> trying to slowly get us to the point where we can remove the >> i18n-in-shellscript entirtely. >> But I thought that was a rather large digression for the commit >> message, >> and that these being both unused, and not something the "public" API >> affected ever promised it would do was sufficient. > > I think if that is what you want to do then you should propose a > series that does just that and explains why it is desirable, rather > than coming up with other reasons to justify the changes you want. Just because I start looking at some code for reason X that doesn't mean that submitting a patch with rationale Y isn't a sufficient reason to make that change. I still think that in this case that they're not used by our own i18n effort is a perfectly sufficient reason to make the change, as we won't waste translator time in it. I.e. I'll still stand behind the stated rationale. But aside from that most changes I made to git are with an eye to some larger semi-related goal. I do have some WIP changes to tear down most of the *.sh and *.perl i18n infrastructure (the parts still in use would still have translations), and IIRC it's at least a 2k line negative diffstat, and enables us to do more interesting things in i18n (e.g. getting rid of the libintl dependency). But I also don't think that such a series is probably not possible in the near term if we're going to insist that all shellscript output must byte-for-byte be the same (for boring reasons I won't go into, but it's mainly to do with sh-i18n--envsubst.c). So it's also a bit of a chicke & egg problem. I wanted to send any such UI changes in first, to see if it was even worth finishing up that work, or if the whole thing would stall on not being able to change some output someone somewhere might have relied on being byte-for-byte the same. >>>> We need also need to be mindful of translator time, it's a *lot* of >>>> strings to go through, and e.g. I've commented in the past on patches >>>> that marked stuff in t/helper/ for translation. >>> >>> Translator's time is your concern? No translator sifts through 5000 >>> strings on every release. There are tools that show only new strings to >>> translate. >> Yes, I'm the person who added this entire i18n infrastructure to >> git, I >> know how it works :) >> >>> A text is translated once and then it lies under the radar >>> until someone changes it. Don't tell me that is time-consuming. >> Yes, individual orphaned strings aren't, but they add up. >> Just like having that "USE_PIC" comment in configure.ac isn't much >> of a >> big deal, but it makes sense to clean up unused code, just as we're >> adding new code. >> I will say that your implicit proposal of keeping this forever >> instead >> is assuming that we won't have more translations for git, and every new >> translator will look at this. >> Context is critical for translators, so even if it's one string it's >> a >> string you'll quickly grep for and find ... no uses for, and then likely >> go hunting around for where it's used only to (hopefully, in that case) >> find this thread. Better not to have it. >> >>> On the other hand, there is a lot of *reviewer* time that you are >>> spending with changes like this. *That* should be your concern. >> I'd think most of the that time, if any, will be spent on this >> sub-thread you started, so ... :) > > This sub-tread exists because you posted this patch to the mailing > list. Blaming reviewers for asking perfectly reasonable questions is > neither fair nor helpful. I didn't mean any offense there, but did mean to suggest (smiley an all) that a mountain was being made out a molehill in this case. Yes translator time is my concern. I started the i18n effort in git, and I think it's really important. We currently have 18 translations of git in the po/ directory, 16 if you leave out "dialects". Which if you compare it with https://en.wikipedia.org/wiki/List_of_languages_by_number_of_native_speakers is quite bad. For comparison I worked extensively on MediaWiki in a past life, which at the time had at least 100 such translations. I looked again and it's up to around 600 (many incomplete, to be fair). Is that our fault as project? No, but we could definitely help it along. I value the scarcity of translator time (including future translations) much more than concerns that there *may be* someone somewhere who's got a reliance on this particular output. > This patch does not remove dead code as the rest of the series does > but instead changes user facing messages in code that we recently > established is part of the public api[2]. Nothing has changed since > that recent discussion so I'm confused as to why you are proposing to > modify the api again so soon. As noted above I don't think that previous discussion applies to these changes as you describe, but in any case, ~8 hours before you sent this reply I sent a v4 re-roll which left out this change: https://lore.kernel.org/git/cover-v4-0.6-00000000000-20220331T014349Z-avarab@xxxxxxxxx/ Which I hope will address your & Johannes Sixt's concerns here. Does the rest of this series look good to you?