Re: Tweaking bytea / large object block sizes?

On 13/06/11 09:27, Merlin Moncure wrote:

> want to use the binary protocol mode (especially for postgres versions
> that don't support hex mode)

Allowing myself to get a wee bit sidetracked:

I've been wondering lately why hex was chosen as the new input/output
format when the bytea_output change went in. Base64 encoding is
trivial to implement, already supported by standard libraries for many
languages and add-ons for the rest, fast to encode/decode, and much more
compact than hex (4 characters per 3 bytes, versus 2 characters per
byte), so it seems like a more attractive option. PostgreSQL already
supports base64 in explicit encode() and decode() calls.
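
To put rough numbers on the size difference, here is a minimal Python
sketch using only the standard library. The helper name encoded_sizes is
made up for illustration, and the sketch ignores the "\x" prefix that
PostgreSQL's hex output adds in front of the digits.

```python
import base64
import binascii
import os


def encoded_sizes(payload: bytes) -> dict:
    """Return the raw, hex-encoded and base64-encoded lengths of payload."""
    hex_text = binascii.hexlify(payload)   # 2 output characters per byte
    b64_text = base64.b64encode(payload)   # 4 output characters per 3 bytes
    return {"raw": len(payload), "hex": len(hex_text), "base64": len(b64_text)}


if __name__ == "__main__":
    # Hypothetical 3000-byte value standing in for a bytea column.
    print(encoded_sizes(os.urandom(3000)))
```

For a 3000-byte value this gives 6000 characters of hex against 4000
characters of base64, so hex is half again as large.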

Was concern about input format ambiguity a motivator for avoiding
base64? Checking the archives:

http://archives.postgresql.org/pgsql-hackers/2009-05/msg00238.php
http://archives.postgresql.org/pgsql-hackers/2009-05/msg00192.php

... it was considered but knocked back for two reasons: it's enough more
complex to encode that it could matter on big dumps, and
standards-compliant base64 appears to require newlines, which was viewed
as ugly and problematic. Concerns about reliably detecting the input
format were also raised, but as the same solution used for hex input
would apply to base64 input too, it doesn't look like that was a big
factor.
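
The newline issue is easy to see from Python, whose standard library
offers both flavours: base64.b64encode() emits one unbroken string, while
base64.encodebytes() follows the MIME convention of wrapping the output
at 76 characters. This is only a sketch with a made-up payload, not
anything PostgreSQL itself does.

```python
import base64

# Hypothetical 1024-byte payload standing in for a bytea value.
payload = bytes(range(256)) * 4

unwrapped = base64.b64encode(payload)    # no line breaks at all
wrapped = base64.encodebytes(payload)    # MIME-style, newline-terminated lines

# The unwrapped form stays on one line; the wrapped form never exceeds 76
# characters per line.
assert b"\n" not in unwrapped
assert all(len(line) <= 76 for line in wrapped.splitlines())

# Both forms decode back to the same bytes.
assert base64.b64decode(unwrapped) == payload
assert base64.decodebytes(wrapped) == payload

print(len(unwrapped), len(wrapped))
```

The wrapped form is slightly longer because of the added line breaks,
and any consumer has to be prepared to skip them when decoding.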

Personally, even with the newline 'ick factor' I think it'd be pretty
nice to have as an option for dumps and COPY.

Ascii85 (base85) would be another alternative. It's used in PostScript
and PDF, but isn't anywhere near as widespread as base64. It's still
trivial to implement and is about 6% more space-efficient than base64
(5 characters per 4 bytes, versus 4 per 3).
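
The space difference between the two can be checked with Python's
base64 module, which provides a85encode() alongside b64encode(). This
is a rough sketch with a made-up payload; the unwrapped ratio works out
to 5/4 versus 4/3, and line wrapping in base64 output would shift the
figure a little further in Ascii85's favour.

```python
import base64

# Hypothetical 12000-byte payload; the length is divisible by both 3 and 4,
# so neither encoding needs padding. It contains no run of four zero bytes,
# so Ascii85's "z" shortcut for all-zero groups never applies.
payload = bytes(range(250)) * 48

b64_len = len(base64.b64encode(payload))   # 4 characters per 3 bytes
a85_len = len(base64.a85encode(payload))   # 5 characters per 4 bytes

saving = 1 - a85_len / b64_len
print(f"base64: {b64_len}, Ascii85: {a85_len}, saving: {saving:.2%}")
```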

After a bit of digging, though, I can't help wondering if a binary dump
format that's machine-representation independent, fast and compact isn't
more practical. Tools like Thrift (http://thrift.apache.org), Protocol
Buffers, etc. might make it less painful. Maybe an interesting GSoC
project? Supporting binary COPY with a machine-independent format would
be a natural extension of that, too.
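
As a very rough idea of what "machine-representation independent" means
in practice, the Python sketch below frames each value with a 4-byte
length in explicit network (big-endian) byte order, so the bytes on disk
do not depend on the host's endianness. The pack_field and unpack_field
helpers and the framing itself are hypothetical; this is not the layout
used by PostgreSQL's binary COPY, Thrift or Protocol Buffers.

```python
import struct


def pack_field(value: bytes) -> bytes:
    """Prefix value with its length as a 4-byte big-endian unsigned int."""
    return struct.pack("!I", len(value)) + value


def unpack_field(buf: bytes, offset: int = 0):
    """Read one length-prefixed field; return (value, offset of next field)."""
    (length,) = struct.unpack_from("!I", buf, offset)
    start = offset + 4
    return buf[start:start + length], start + length


if __name__ == "__main__":
    # Two made-up field values written back to back, then read in order.
    record = pack_field(b"\x00\x01\x02") + pack_field(b"hello")
    first, pos = unpack_field(record)
    second, pos = unpack_field(record, pos)
    print(first, second, pos == len(record))
```

The values are stored as raw bytes with no text encoding at all, which
is where the speed and size advantage over hex or base64 would come
from.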

--
Craig Ringer
