Re: Converting MySQL tinyint to PostgreSQL

Alvaro Herrera <alvherre@xxxxxxxxxxxxxx> · Tue, 12 Jul 2005 18:10:48 -0400

On Tue, Jul 12, 2005 at 05:37:32PM -0400, Joe wrote:
> Tom Lane wrote:
> >Because the length specification is in *characters*, which is not by any
> >means the same as *bytes*.
> >
> >We could possibly put enough intelligence into the low-level tuple
> >manipulation routines to count characters in whatever encoding we happen
> >to be using, but it's a lot faster and more robust to insist on a count
> >word for every variable-width field.
> 
> I guess what you're saying is that PostgreSQL stores characters in 
> varying-length encodings.

It _may_ store characters in variable-length encodings.  It can use
fixed-length encodings too, such as latin1 or plain ASCII (actually,
unchecked 8 bits, which means about anything) -- you define that at
initdb time or database creation time, I forget.  It would be painful
for the code to distinguish fixed-length from variable-length at
runtime, an optimization that would allow getting rid of the otherwise
required length word.  So far, nobody has cared enough about it to do
the job.
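
As a rough illustration of the characters-versus-bytes point (plain
Python, not PostgreSQL code; `byte_lengths` is a made-up helper name),
two strings with the same character count can need different amounts
of storage depending on the encoding:

```python
# Illustration only: this is not how PostgreSQL stores data.
# byte_lengths is a hypothetical helper, not part of any library.

def byte_lengths(text, encodings=("latin1", "utf-8")):
    """Return how many bytes `text` occupies in each encoding."""
    return {enc: len(text.encode(enc)) for enc in encodings}

# Both strings are 5 characters long, so both would fit a 5-character
# limit, yet their byte sizes differ once a multibyte encoding is used.
for s in ("hello", "h\u00e9llo"):  # the second one contains e-acute
    print(ascii(s), len(s), byte_lengths(s))
```

In latin1 the two strings occupy the same number of bytes; in UTF-8
the accented one is a byte longer, so the declared character length
alone does not tell you where the field ends.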

> If it stored character data in Unicode (UCS-16) it would always take
> up two-bytes per character.

Really?  We don't support UCS-16, for good reasons (we'd have to rewrite
several parts of the code in order to support zero ('\0') bytes embedded
in strings ... we use regular C strings extensively).
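
A small sketch of the embedded-zero-byte problem, assuming the encoding
in question is a 16-bit one such as UTF-16 (Python is used only to
imitate C string handling; `c_strlen` is a made-up stand-in for C's
strlen()):

```python
# Illustration only: c_strlen is a hypothetical stand-in for C's strlen().

def c_strlen(buf: bytes) -> int:
    """Count bytes up to (not including) the first zero byte."""
    end = buf.find(b"\x00")
    return len(buf) if end == -1 else end

utf8 = "abc".encode("utf-8")       # 3 bytes, no zero bytes
utf16 = "abc".encode("utf-16-le")  # 6 bytes, every second byte is zero

print(len(utf8), c_strlen(utf8))
print(len(utf16), c_strlen(utf16))
```

With the 16-bit encoding, code that treats the buffer as a regular C
string stops at the first zero byte and sees only part of the data.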

However we do support Unicode as UTF-8, but it's been said a couple of
times that characters can be wider than 2 or 3 bytes in some cases.  So,
I don't see how UCS-16 could always use only 2 bytes.
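
To put numbers on that, here is a short Python sketch comparing byte
widths per character.  It assumes "UCS-16" above means UTF-16; the
sample characters and the `widths` helper are chosen for illustration
only:

```python
# Illustration only: widths is a hypothetical helper name.

def widths(ch):
    """Return (UTF-8 bytes, UTF-16 bytes) needed for one character."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))

samples = {
    "A": "A",                # U+0041
    "e-acute": "\u00e9",     # U+00E9
    "euro sign": "\u20ac",   # U+20AC
    "G clef": "\U0001d11e",  # U+1D11E, outside the Basic Multilingual Plane
}

for name, ch in samples.items():
    print(name, widths(ch))
```

Characters above U+FFFF need 4 bytes in UTF-8 and also 4 bytes (a
surrogate pair) in UTF-16, so a flat two bytes per character only
holds if such characters are excluded.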

> Have you considered supporting NCHAR/NVARCHAR, aka NATIONAL character
> data?

There have been noises, but so far nobody has stepped up to the plate
to do the work.

-- 
Alvaro Herrera (<alvherre[a]alvh.no-ip.org>)
"Those who use electric razors are infidels destined to burn in hell while
we drink from rivers of beer, download free vids and mingle with naked
well shaved babes." (http://slashdot.org/comments.pl?sid=44793&cid=4647152)
