Re: Reply: May "PostgreSQL server side GB18030 character set support" reconsidered?

Tom Lane <tgl@xxxxxxxxxxxxx> · Mon, 05 Oct 2020 20:58:42 -0400

Tatsuo Ishii <ishii@xxxxxxxxxxxx> writes:
> One idea to avoid the concern could be "shifting" GB18030 code
> points into an "ASCII safe" code range with some calculation, so that
> the backend can handle them without worrying about the concern above.
> This way, we could avoid the table lookup overhead that is necessary
> in conversion between GB18030 and UTF8, and so on.

Hmm ... interesting idea, basically invent our own modified version
of GB18030 (or SJIS?) for backend-internal storage.  But I'm not
sure how to make it work without enlarging the string, which'd defeat
the OP's argument.  It looks to me like the second-byte code space is
already pretty full in both encodings.
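
To put rough numbers on "pretty full", here is a minimal sketch that counts
two-byte code positions. The byte ranges are assumptions taken from the
published layouts of GB18030 (two-byte region) and Shift JIS, and the helper
names are made up for illustration. It compares each encoding's current
two-byte capacity with what would remain if the second byte were restricted
to high-bit-set values (0x80-0xFF), which is one plausible reading of
"ASCII safe".

```python
# Inclusive byte ranges; assumed from the published encoding layouts.
GB18030_LEAD = [(0x81, 0xFE)]
GB18030_TRAIL = [(0x40, 0x7E), (0x80, 0xFE)]
SJIS_LEAD = [(0x81, 0x9F), (0xE0, 0xFC)]
SJIS_TRAIL = [(0x40, 0x7E), (0x80, 0xFC)]

# Hypothetical "ASCII safe" second byte: any value with the high bit set.
HIGH_BIT_TRAIL = [(0x80, 0xFF)]


def count(ranges):
    """Number of byte values covered by a list of inclusive ranges."""
    return sum(hi - lo + 1 for lo, hi in ranges)


def two_byte_capacity(lead, trail):
    """Number of distinct two-byte sequences for the given byte ranges."""
    return count(lead) * count(trail)


if __name__ == "__main__":
    for name, lead, trail in [
        ("GB18030", GB18030_LEAD, GB18030_TRAIL),
        ("SJIS", SJIS_LEAD, SJIS_TRAIL),
    ]:
        used = two_byte_capacity(lead, trail)
        safe = two_byte_capacity(lead, HIGH_BIT_TRAIL)
        print(f"{name}: {used} positions today, {safe} with a high-bit second byte")
```

Under these assumptions the restricted form has fewer positions than the
existing two-byte space in both encodings, so a same-length remapping would
not fit unless some characters grew by a byte.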

			regards, tom lane