On Fri, Mar 9, 2012 at 10:19 AM, Andy Colson <andy@xxxxxxxxxxxxxxx> wrote:
> On 3/9/2012 9:47 AM, Merlin Moncure wrote:
>>
>> On Thu, Mar 8, 2012 at 2:01 PM, Andy Colson <andy@xxxxxxxxxxxxxxx> wrote:
>>>
>>> I know toast compresses, but I believe it's only one row. Page level
>>> would compress better because there is more data, and it would also
>>> decrease the amount of IO, so it might speed up disk access.
>>
>> er, but when data is toasted it's spanning pages. page level
>> compression is a super complicated problem.
>>
>> something that is maybe more attainable on the compression side of
>> things is a userland api for compression -- like pgcrypto is for
>> encryption. even if it didn't make it into core, it could live on
>> reasonably as a pgfoundry project.
>>
>> merlin
>
> Agreed, it's probably too difficult for a GSoC project. But a userland
> api would still be row level, which, in my opinion, is useless. Consider
> rows from my apache log that I'm dumping to the database:

It's useless for what you're trying to do, but it would be useful to
people trying to compress large datums (data, I know) before storage
using algorithms that postgres can't support, like lzo.

> date, url, status
> 2012-3-9 10:15:00, '/index.php?id=4', 202
> 2012-3-9 10:15:01, '/index.php?id=5', 202
> 2012-3-9 10:15:02, '/index.php?id=6', 202
>
> That won't compress at all on a row level. But it'll compress 99% on a
> "larger" (page/multirow/whatever) level.

sure, but you can only get those rates by giving up the segmented view
of the data that postgres requires. your tuples are very small, and I
only see compression happening on the userland side by employing tricks
specific to your dataset (like employing "char" to map the status, a
url mapping table, etc).

merlin
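
A minimal sketch of the dataset-specific tricks mentioned above. The table
and column names are made up for illustration, not from the thread: a url
lookup table plus the 1-byte "char" type cut each log row down to a few
bytes of user data before any compression is even in the picture.

CREATE TABLE urls (
    url_id serial PRIMARY KEY,
    url    text NOT NULL UNIQUE
);

CREATE TABLE statuses (
    code   "char"  PRIMARY KEY,      -- 1-byte storage
    status integer NOT NULL UNIQUE   -- e.g. 202
);

CREATE TABLE access_log (
    hit_at timestamp NOT NULL,
    url_id integer   NOT NULL REFERENCES urls (url_id),
    code   "char"    NOT NULL REFERENCES statuses (code)
);

INSERT INTO statuses VALUES ('a', 202);
INSERT INTO urls (url) VALUES ('/index.php?id=4');

-- Each hit now carries a timestamp, a 4-byte url id, and a 1-byte status
-- code instead of repeating the full url text and an integer status in
-- every row.
INSERT INTO access_log
SELECT '2012-03-09 10:15:00', url_id, 'a'
FROM urls WHERE url = '/index.php?id=4';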
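
And a sketch of what a userland compression hook of the kind discussed above
could look like. Nothing here ships with postgres: the function names are
hypothetical, plpythonu is assumed to be installed, and zlib stands in for
an algorithm like lzo (which would need an extra python module or a
C-language extension).

CREATE EXTENSION IF NOT EXISTS plpythonu;

-- "Userland" compression: the caller decides which datums get compressed
-- and with what algorithm, instead of relying on TOAST/pglz.
CREATE OR REPLACE FUNCTION app_compress(data bytea) RETURNS bytea AS $$
    import zlib
    return zlib.compress(data, 9)
$$ LANGUAGE plpythonu;

CREATE OR REPLACE FUNCTION app_decompress(data bytea) RETURNS bytea AS $$
    import zlib
    return zlib.decompress(data)
$$ LANGUAGE plpythonu;

-- Usage: compress large text explicitly before storage, decompress on read.
-- INSERT INTO docs (body_z) VALUES (app_compress(convert_to('...big text...', 'UTF8')));
-- SELECT convert_from(app_decompress(body_z), 'UTF8') FROM docs;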