Re: *Really* noisy encoding warnings post-v2.33.0

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Oct 29, 2021 at 12:47:36PM +0200, Ævar Arnfjörð Bjarmason wrote:

> > The other issue is that it is assuming UTF-8 on one end of the
> > conversion. But we aren't necessarily doing such a conversion; it
> > depends on the commit's on-disk encoding, and the requested output
> > encoding. In particular:
> >
> >   - if both of those match, we do not need to call iconv at all (see the
> >     same_encoding() check in repo_logmsg_reencode()). With the patch
> >     above, the NO_ICONV case would start to die() when both are say
> >     iso8859-1, even though it currently works.
> >
> >   - likewise, even if you have iconv support, it's possible that your
> >     preferred encoding is not compatible with utf8. In which case
> >     iconv_open() may complain, even though the actual conversion we'd
> >     ask it to do would succeed.
> >
> > I.e., I don't think there's a way to just ask iconv "does this encoding
> > name by itself make any sense". You can only ask it about to/from
> > combos.
> 
> Yes, I'm not saying it covers the general problem, but that it covers
> the specific complained-about issue of a completely nonsensical encoding
> like "HTML". We should simply error on that on command startup, whether
> or not we have any commits to visit.

I definitely agree with you on the direction, and I don't mind if we
don't cover every case. What I was trying to point out above though is
that the patch you showed actually _regresses_ some cases, and it's hard
to robustly avoid that.

> So per <87ily7m1mv.fsf@xxxxxxxxxxxxxxxxxxx> why can't we just revert the
> warning(), and then consider a good way forward that covers some/all of
> these cases we've noted?

Right, I agreed with that in the other thread. You may need to convince
Junio. ;)

TBH I am not even sure it is worth spending a lot of brain cells on the
"and then consider..." part. Over all these years, we've had one report,
and it simply misunderstand what "--encoding" was for. I thought it was
something we could fix up easily by checking a return value, but IMHO
doing it right is quite tricky because of iconv()'s limited interface,
and the risk of regression outweighs the potential benefit.

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux