Re: pre-v2.34.0-rc0 regressions: 'git log' has a noisy iconv() warning

On Wed, Oct 27, 2021 at 11:04:50AM -0700, Junio C Hamano wrote:

> Jeff King <peff@xxxxxxxx> writes:
> 
> > The firehose of warnings for "git log --encoding=nonsense" was known and
> > discussed in fd680bc558 (logmsg_reencode(): warn when iconv() fails,
> > 2021-08-27). It's ugly for sure, but I'm still OK with it for the
> > reasoning there: your next step is to fix the --encoding argument you
> > gave. Whether you saw one line of warning or many is not that important,
> > IMHO. Giving a single more sensible warning ("your encoding 'nonsense'
> > isn't valid") would be better, but I think it's hard to do without
> > creating other problems.
> >
> > But the most compelling argument against warning at all is the case you
> > gave earlier: that there may be historical garbage commits, and you
> > can't get rid of them, so being warned constantly that we're not going
> > to show or grep them correctly is just annoying. And that is true
> > whether the user sees one warning or a hundred.
> 
> Is it really a "firehose"?  I won't use the word for one warning
> message per commit in the output of "git log --encoding=nonsense".
> 
> If you are running "git log --oneline", it may indeed be annoying to
> double the number of lines shown, and indeed
> 
>     $ git log --oneline --encoding=US-ASCII -4 ab/doc-lint
>     warning: unable to reencode commit to 'US-ASCII'
>     414abf159f docs: fix linting issues due to incorrect relative section order
>     warning: unable to reencode commit to 'US-ASCII'
>     ea8b9271b1 doc lint: lint relative section order
>     warning: unable to reencode commit to 'US-ASCII'
>     cafd9828e8 doc lint: lint and fix missing "GIT" end sections
>     warning: unable to reencode commit to 'US-ASCII'
>     d2c9908076 doc lint: fix bugs in, simplify and improve lint script

It's a bit more than that. You get similar warnings for commits which we
--grep but don't show (and which _might_ have been shown if the encoding
conversion had been different). Try "git log --grep=foo --encoding=bar".
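
To make the --grep point concrete, here is a toy model in Python. It is
not Git's actual code path: the function names, the commit list, and the
oids are made up for illustration. It only assumes that every commit we
look at gets re-encoded, whether or not it ends up being shown.

```python
import sys


def reencode(message: str, encoding: str):
    """Return the re-encoded bytes, or None (after warning) on failure."""
    try:
        return message.encode(encoding)
    except (LookupError, UnicodeEncodeError):
        # Same wording as the warning quoted above; note it carries no oid.
        print(f"warning: unable to reencode commit to '{encoding}'",
              file=sys.stderr)
        return None


def toy_log(commits, encoding, grep):
    """Toy traversal: re-encode every commit, then apply the grep filter."""
    shown, warnings = [], 0
    for oid, message in commits:
        if reencode(message, encoding) is None:
            warnings += 1
        # On failure we fall back to matching against the original text.
        if grep in message:
            shown.append(oid)
    return shown, warnings


# Hypothetical history: only one commit matches "foo".
commits = [("a1", "foo fix"), ("b2", "bar cleanup"), ("c3", "baz tweak")]
shown, warnings = toy_log(commits, "nonsense", "foo")
```

In this model only one commit is shown, but all three produce a warning
on stderr, because the two non-matching commits were still examined.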

And of course the interleaved output you see above looks OK in a pager.
But if you're sending the output of log (or diff-tree, etc) elsewhere,
you're just going to get a stream of messages on stderr. That would be a
bit less egregious if the message mentioned the commit oid, so they
weren't strict duplicates.

> is indeed annoying, as everything that is _shown_ ought to be
> presentable in US-ASCII.  This observation makes us realize an
> obvious approach to improve over the current behaviour without
> losing the warning when it matters, but I think the required code
> change, to first split the commit message into pieces (which roughly
> corresponds to the atoms in the --format= placeholder language) and
> reencode only these pieces that will be shown, may be too involved
> to be worth the effort.
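
The idea of re-encoding only the pieces that will be shown can be
sketched in Python. This is a minimal illustration with a made-up commit
message and a hypothetical helper, not how Git splits or converts
messages: the whole message fails to convert to US-ASCII, while the
subject line alone (all that --oneline would show) converts fine.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Report whether text is representable in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical commit message: ASCII subject, non-ASCII name in the body.
message = "fix typo in README\n\nReported-by: Zoë Example\n"

# Split off the subject, i.e. everything before the first blank line.
subject, _, body = message.partition("\n\n")

whole_ok = can_encode(message, "ascii")    # False: the body contains "ë"
subject_ok = can_encode(subject, "ascii")  # True: the subject is pure ASCII
```

Converting per piece would avoid the warning here, at the cost of having
to split the message before knowing how to interpret its bytes.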

Yeah, I think that would complicate things significantly, with the way
the code is currently structured. It also means parsing commits that are
in arbitrary encodings, which is not possible in the most general sense.
E.g., imagine an encoding which doesn't have ASCII as a subset, like
UTF-16.  Though I suspect such encodings probably do not work for
commits anyway (there is a chicken-and-egg with reading the encoding
header in the first place).
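
A small Python sketch of that chicken-and-egg problem. It uses a
simplified, made-up commit layout and a hypothetical header scanner, not
Git's parser. The scanner looks for an ASCII "encoding " header, which
works when the bytes are ASCII-compatible and finds nothing when the
same text is stored as UTF-16.

```python
def find_encoding_header(raw: bytes):
    """Scan the header block (up to the first blank line) for 'encoding '."""
    headers, _, _ = raw.partition(b"\n\n")
    for line in headers.split(b"\n"):
        if line.startswith(b"encoding "):
            return line[len(b"encoding "):].decode("ascii")
    return None


# Simplified commit text; real commit objects carry more headers.
commit_text = "author A U Thor\nencoding UTF-16\n\nsubject line\n"

ascii_compatible = commit_text.encode("utf-8")
utf16 = commit_text.encode("utf-16-le")  # every ASCII char gains a NUL byte

found_plain = find_encoding_header(ascii_compatible)  # "UTF-16"
found_utf16 = find_encoding_header(utf16)              # None
```

In UTF-16 the header bytes are interleaved with NULs (the text starts
with b"a\x00u\x00"), so an ASCII-based scan never sees the header that
would tell it how to decode the rest.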

> > So while I do hate to have Git just silently ignore errors, probably the
> > original behavior is the least-bad thing, and we should just revert
> > fd680bc558 (logmsg_reencode(): warn when iconv() fails, 2021-08-27). We
> > probably want to salvage the documentation change (minus the "along with
> > a warning") part.
> 
> I am all for making it convenient to squelch, but it would be sad to
> lose the convenient way to notice possible misencoding in recent
> commits.  Or can we have a command line option and pass it through
> the callchain, or would that be too involved?

Do you mean a command-line option to squelch the warnings? I think it
would not be too hard to do it as a config option (which is probably
what you'd want anyway, since historical commits would come up over and
over again).

-Peff


