On Thu, Sep 05, 2019 at 07:08:27PM +0200, Fabiano Fidêncio wrote:
> On Wed, Mar 27, 2019 at 10:57 AM Daniel P. Berrangé <berrange@xxxxxxxxxx> wrote:
> >
> > The python3 ascii codec violates POSIX C locale requirements by not being
> > 8-bit clean in its text handling. It raises an error for any byte with
> > top bit set
> >
> > >       return codecs.ascii_decode(input, self.errors)[0]
> > E       UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 419: ordinal not in range(128)
> >
> > To avoid this python bug we must force use of a UTF-8 locale. Ideally we
> > would use the C.UTF-8 locale, however, that is not portable across OS,
> > only existing on certain Linux distros. Instead we use the en_US.UTF-8
> > locale, but only for the character set data.
> >
> > Signed-off-by: Daniel P. Berrangé <berrange@xxxxxxxxxx>
> > ---
> >
> > Pushed as a CI build fix for FreeBSD distros
> >
> >  Makefile | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/Makefile b/Makefile
> > index c63cb6e..9d7f109 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -123,4 +123,4 @@ update-po:
> >         done
> >
> >  check: $(DATA_FILES) $(SCHEMA_FILES)
> > -       $(PYTHON) -m pytest $(PYTEST_LOG_LEVEL)
> > +       LC_ALL= LANG=C LC_CTYPE=en_US.UTF-8 $(PYTHON) -m pytest $(PYTEST_LOG_LEVEL)
> > --
> > 2.20.1
> >
> > _______________________________________________
> > Libosinfo mailing list
> > Libosinfo@xxxxxxxxxx
> > https://www.redhat.com/mailman/listinfo/libosinfo
>
> Daniel,
>
> This commit is the reason for the following breakage (in my personal
> gitlab account):
> https://gitlab.com/fidencio/osinfo-db/-/jobs/288707257
>
> It seems to happen because both debian & fedora (30+) containers do
> not have the required locale.
> I'd like to ask your suggestion on how to proceed here:
> - Shall we explicitly include glibc-langpack-en as part of the base packages?
>   - Its dependencies are: glibc, glibc-common
>   - Its size is: 6.0 M (on Fedora 30);
> - Shall we work around osinfo-db tests in a way that we can make it
>   work without setting the locale?

In theory C.UTF-8 is our desired locale, but that is a non-standard
concept that is only carried as a downstream patch by certain distros.
Upstream glibc has not accepted it. It doesn't exist at all on *BSD.

Thus we picked en_US.UTF-8 as the only option that gives us UTF-8
which is portable across all known operating systems.

If you can't set the locale, the only option is to mandate python 3.7
as the minimum python version, which I think is too strict.

IOW, we should just install the langpack.

FWIW, I'm proposing the exact same en_US.UTF-8 env var for libvirt
python code, so we'll need to deal with the same problem shortly
there too.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
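
For anyone following the thread, the failure in the quoted traceback can be reproduced in isolation. The snippet below is only a minimal sketch, not code from osinfo-db: the helper name `try_decode` and the sample string are made up for illustration. It shows that the byte 0xc3 (the first byte of many UTF-8 encoded accented letters) is rejected by the ascii codec and accepted by the utf-8 codec. Python 3.7 is relevant because, as far as I know, that release added C locale coercion (PEP 538) and a UTF-8 mode (PEP 540), so it no longer falls back to ascii under LANG=C.

```python
# Sample text whose UTF-8 encoding contains the byte 0xc3 named in the traceback.
# The string is an arbitrary example, not data taken from osinfo-db.
data = "Fid\u00eancio".encode("utf-8")  # b'Fid\xc3\xaancio'


def try_decode(raw: bytes, encoding: str) -> str:
    """Hypothetical helper: return the decoded text or a short error description."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"error: {exc.reason}"


# What python3 (before 3.7) effectively does when the locale is plain "C".
ascii_result = try_decode(data, "ascii")

# What it does when LC_CTYPE selects a UTF-8 character set.
utf8_result = try_decode(data, "utf-8")

# ascii() keeps the output printable even on an ascii-only terminal.
print(ascii(ascii_result))
print(ascii(utf8_result))
```

Note that setting LC_CTYPE=en_US.UTF-8 only changes the encoding Python picks if that locale is actually installed, which is why the containers without the langpack still fail.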