Re: [PATCH virt-viewer] Don't set LC_ALL=C during build as that breaks python apps

"Daniel P. Berrange" <berrange@xxxxxxxxxx> · Fri, 28 Jul 2017 09:53:20 +0100

On Thu, Jul 27, 2017 at 04:00:53PM +0100, Richard W.M. Jones wrote:
> On Tue, Jul 25, 2017 at 01:48:41PM +0100, Daniel P. Berrange wrote:
> > Setting LC_ALL=C breaks python apps doing I/O on UTF-8 source
> > files. In particular this broke glib-mkenums
> > 
> >     Traceback (most recent call last):
> >       File "/usr/bin/glib-mkenums", line 669, in <module>
> >         process_file(fname)
> >       File "/usr/bin/glib-mkenums", line 406, in process_file
> >         line = curfile.readline()
> >       File "/usr/lib64/python3.6/encodings/ascii.py", line 26, in decode
> >         return codecs.ascii_decode(input, self.errors)[0]
> >     UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 849: ordinal not in range(128)
> > 
> > Signed-off-by: Daniel P. Berrange <berrange@xxxxxxxxxx>
> > ---
> > 
> > Pushed to fix rawhide build
> > 
> >  maint.mk | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/maint.mk b/maint.mk
> > index 79104d0..2e70cae 100644
> > --- a/maint.mk
> > +++ b/maint.mk
> > @@ -117,8 +117,8 @@ news-check-lines-spec ?= 1,10
> >  news-check-regexp ?= '^\*.* $(VERSION_REGEXP) \($(today)\)'
> >  
> >  # Prevent programs like 'sort' from considering distinct strings to be equal.
> > -# Doing it here saves us from having to set LC_ALL elsewhere in this file.
> > -export LC_ALL = C
> > +# Doing it here saves us from having to set LC_COLLATE elsewhere in this file.
> > +export LC_COLLATE = C
> 
> I don't know what the answer is, but two observations:
> 
> (1) We had the same problem in libguestfs and this was our fix:
> 
> https://github.com/libguestfs/libguestfs/commit/f861c138550a0c99247a6955aa2c594f380867f4

Hmm, unsetting LC_ALL means the output of the script is potentially affected
by the differing sort order of the user's locale, which is why I kept setting
LC_COLLATE.
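
To make the collation concern concrete, here is a minimal Python sketch (not
part of the patch; the word list and the sort_in_locale helper are made up for
illustration). It sorts the same strings under the C collation rules and, if
that locale happens to be installed, under en_US.UTF-8:

```python
import locale

# Hypothetical sample data, chosen only to show case-sensitive ordering.
words = ["b", "A", "a", "B"]


def sort_in_locale(items, name):
    """Sort items using the LC_COLLATE rules of the named locale."""
    old = locale.setlocale(locale.LC_COLLATE)  # remember the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, name)
        return sorted(items, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, old)  # always restore it


# In the C locale, strings compare by code point, so uppercase sorts first.
c_sorted = sort_in_locale(words, "C")
print("C:", c_sorted)

# en_US.UTF-8 may not exist on every system, so treat it as optional.
try:
    print("en_US.UTF-8:", sort_in_locale(words, "en_US.UTF-8"))
except locale.Error:
    print("en_US.UTF-8 is not installed here")
```

The C result is the same on every machine, which is the property the build
relies on; the second result depends on whatever locale data is installed.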

It seems that a better approach would be to use C.UTF-8, and then fall back
to en_US.UTF-8 on systems which lack it, since en_US is still pretty close
to C in its semantics, while supporting UTF-8 everywhere. E.g. change maint.mk
to be

 export LC_ALL = $(shell LC_ALL=C.utf-8 locale -ck charmap 2>/dev/null | \
                         grep -i UTF-8 1>/dev/null 2>&1 && \
                         echo "C.UTF-8"  || echo "en_US.UTF-8")
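
As a reminder of why the charmap needs to be UTF-8 at all, here is a small
Python sketch of the failure in the traceback. It is an illustration only: the
sample text is invented, and the explicit "ascii" codec stands in for the
default that python3.6 picked under LC_ALL=C (newer Python versions may treat
the C locale differently).

```python
# Hypothetical source line containing a non-ASCII character. "é" encodes to
# the two bytes 0xc3 0xa9 in UTF-8, and 0xc3 is the byte named in the error.
data = "/* caf\u00e9 */\n".encode("utf-8")

# Decoding with the ASCII codec fails on the first byte >= 0x80.
try:
    data.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError as exc:
    ascii_ok = False
    print("ascii failed:", exc)

# Decoding the same bytes as UTF-8 works.
text = data.decode("utf-8")
print("utf-8 ok:", text.strip())
```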

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
