Re: Sphinx parallel build error: UnicodeEncodeError: 'latin-1' codec can't encode characters in position 18-20: ordinal not in range(256)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 06.05.21 um 19:48 schrieb Michal Suchánek:
On Thu, May 06, 2021 at 07:04:44PM +0200, Markus Heiser wrote:
Am 06.05.21 um 18:46 schrieb Mauro Carvalho Chehab:
Em Thu, 6 May 2021 17:57:15 +0200
Markus Heiser <markus.heiser@xxxxxxxxxxx> escreveu:

Am 06.05.21 um 12:39 schrieb Michal Suchánek:
When building HTML documentation I get this output:
...
[  412s] UnicodeEncodeError: 'latin-1' codec can't encode characters in position 18-20: ordinal not in range(256)
...

It does not say which input file contains the offending character so I can't tell which file is broken.

Any idea how to debug?

I guess the build host is a very simple container, what does

     echo $LC_ALL
     echo $LANG
It's actually set to en_US just before the build.

prompt?  If it is latin, change it to something using utf-8 (I recommend
'en_US.utf8').

A UnicodeEncodeError can occour everywhere where characters are
encoded from (internal) unicode to the encoding of the stream.

By example:

A print or log statement which streams to stdout needs to encode
from unicode to stdout's encoding.  If there is one unicode symbol
which can not encoded to stream's encoding a UnicodeEncodeError
is raised.

Hi Markus,

It shouldn't matter the builder's locale when building the Kernel
documentation (or any other documents built from other git trees
on other open source projects), as the Kernel's *.rpm document charset
won't change, no matter on what part of the globe it was built.

I vaguely remember about a change we made a couple of years ago
in order to address this issue.

Hi Mauro :)

sure? .. what if the logger wants to log some symbols from the
chines translated parts to stdout and the encoding of stdout is
latin?

[  127s] + cd linux-5.12-next-20210506
[  127s] + export LANG=en_US
[  127s] + LANG=en_US
[  127s] + mkdir -p html
[  127s] + python3 -c 'print("↑ᛏ个")'
[  127s] ↑ᛏ个
[  127s] + echo 'print("↑ᛏ个")'
[  127s] + python3 test.py
[  127s] Traceback (most recent call last):
[  127s]   File "test.py", line 1, in <module>
[  127s]     print("\u2191\u16cf\u4e2a\uf8f9")
[  127s] UnicodeEncodeError: 'latin-1' codec can't encode characters in
position 0-3: ordinal not in range(256)

It certainly does not look like python can print unicode in this
environment. It tells me where the problem is, though.

Can't speak for the image of your container, may you need to install
some utf-8 packages / but in most cases

  export LANG=en_US.UTF-8
  export LC_ALL=en_US.UTF-8

should help.

  -- Markus --


Thanks

Michal

[  127s] + :
[  127s] + locale
[  128s] LANG=en_US
[  128s] LC_CTYPE="en_US"
[  128s] LC_NUMERIC="en_US"
[  128s] LC_TIME="en_US"
[  128s] LC_COLLATE="en_US"
[  128s] LC_MONETARY="en_US"
[  128s] LC_MESSAGES="en_US"
[  128s] LC_PAPER="en_US"
[  128s] LC_NAME="en_US"
[  128s] LC_ADDRESS="en_US"
[  128s] LC_TELEPHONE="en_US"
[  128s] LC_MEASUREMENT="en_US"
[  128s] LC_IDENTIFICATION="en_US"
[  128s] LC_ALL=
[  128s] + echo LC_ALL=
[  128s] LC_ALL=
[  128s] + echo LANG=en_US
[  128s] LANG=en_US




[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux