On Wed, Jul 13, 2005 at 10:48:56AM +0200, Dawid Kuroczko wrote: > As for the char/varchar type -- I was wondering. Worst case > scenario for UTF-8 (correct me on this) is when 1 character > takes 4 bytes. And biggest problem with char/varchar is that > length indicator takes 4 bytes... How much overhead would > it be to make a length variable, for example: > > (var)char(1)-char(63) -- 1 byte length + string > char(64)-char(16383) -- 2 byte length + string > char(16384)-text -- 4 byte length + string, like now Well, you get another issue, alignment. If you squeeze your string down, the next field, if it is an int or string, will get padded to a multiple of 4 negating most of the gains. Like in C structures, there is padding to optimise access. > This would reduce length of char(5) string from 9 bytes to > 6 bytes, char(2) from 6 bytes to 3 bytes (for multibyte chars > it would be a win also). The only types that won't require padding if they are next field are bool, "char" and cstring. So char(1-4) will actually go from 8 to 4 bytes in most cases. char(5-8) will go from 12 to 8 bytes. > I don't know the internals too well (read: at all), but I guess there > would be a problem of choosing which length of length to use -- > would it be possible to make some sort of on-the-fly mapping > when creating tables -- varchar(224) is text_2bytelength, > text is text_4bytelength, char(1) is text_1bytelength... At the moment there are two basic types: variable length and fixed length as defined by the type id. Fixed length are stored as is. Variable length is a 4 byte length plus the string. The two highest bits are flags. AFAIK it's the typlen value that decides the decoding, the type itself is irrelevent. I suppose somebody could create a new encoding but I don't know how hard that would be... Have a nice day, -- Martijn van Oosterhout <kleptog@xxxxxxxxx> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Attachment:
pgpVDqNWhmbP8.pgp
Description: PGP signature