On 07/24/2012 03:21 PM, McGehee, Robert wrote:
Hi,
I've created two tables labeled "Big" and "Small" that both store the same 10 million rows of data using 493MB and 487MB of disk space respectively. The difference is that the "Big" table uses data types that take up more space (integer rather than smallint, float rather than real, etc). The "Big" table should need about 27 bytes/row versus 16 bytes/row for the "Small" table, indicating to me that the "Big" table should be 70% bigger in actual disk size. In reality, it's only 1% bigger or 6MB (after clustering, vacuuming and analyzing). Why is this? Shouldn't the "Small" table be about 110MB smaller (11 bytes for 10 million rows)? I'm estimating table size with \d+
Thanks, Robert
Table "Big"
Column | Type | Bytes
----------+------------------+-----------
rmid | integer | 4
date | date | 4
rmfactor | text | 7 (about 3 characters/cell)
id | integer | 4
value | double precision | 8
---------------------------------
Total Bytes/Row 27
Rows 10M
Actual Size 493MB
Table "Small"
Column | Type | Bytes
--------+----------+-----------
rmid | smallint | 2
date | date | 4
rmfid | smallint | 2 (rmfid is a smallint index into the rmfactor table)
id | integer | 4
value | real | 4
---------------------------------
Total Bytes/Row 16
Rows 10M
Actual Size 487MB
See here for the gory details:
http://www.postgresql.org/docs/9.1/interactive/storage-page-layout.html
See in particular:
Table 55-4. HeapTupleHeaderData Layout
From the text:
""All table rows are structured in the same way. There is a fixed-size
header (occupying 23 bytes on most machines.." which breaks you
assumption of the Big/Small row size comparison.
--
Adrian Klaver
adrian.klaver@xxxxxxxxx
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general