Tom Lane wrote:
Because the length specification is in *characters*, which is not by any means the same as *bytes*. We could possibly put enough intelligence into the low-level tuple manipulation routines to count characters in whatever encoding we happen to be using, but it's a lot faster and more robust to insist on a count word for every variable-width field.
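For illustration, here is a minimal standalone C sketch (my own, not PostgreSQL source), assuming UTF-8, of why a character count is not the same as a byte count:

#include <stdio.h>
#include <string.h>

/* Count UTF-8 characters: bytes of the form 10xxxxxx are continuation
 * bytes and do not start a new character. */
static size_t utf8_char_count(const char *s)
{
    size_t n = 0;
    for (; *s; s++)
        if ((*s & 0xC0) != 0x80)
            n++;
    return n;
}

int main(void)
{
    const char *s = "d\xC3\xA9j\xC3\xA0";   /* "déjà" in UTF-8 */

    printf("bytes=%zu chars=%zu\n", strlen(s), utf8_char_count(s));
    /* prints "bytes=6 chars=4": a char(4) value occupies 6 bytes here,
     * so a length in characters cannot locate the end of the field
     * without decoding it -- hence the per-field count word. */
    return 0;
}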
I guess what you're saying is that PostgreSQL stores characters in variable-length encodings. If it stored character data as Unicode (UCS-2) it would always take up two bytes per character. Have you considered supporting NCHAR/NVARCHAR, aka NATIONAL character data? Wouldn't UCS-2 be needed to support multi-locale clusters (as someone was inquiring about recently)?
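For comparison, a minimal sketch (again my own illustration, assuming UCS-2) of the fixed-width case: every character is exactly 2 bytes, so the i-th character always sits at byte offset 2*i and no decoding scan is needed to find it:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    const uint16_t s[] = { 'd', 0x00E9, 'j', 0x00E0 };  /* "déjà" in UCS-2 */
    size_t i = 2;

    /* O(1) access: the offset is a simple multiplication */
    printf("char %zu is U+%04X at byte offset %zu\n",
           i, (unsigned) s[i], i * sizeof(uint16_t));
    return 0;
}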
Joe