Re: Smaller data types use same disk space


On 07/24/2012 03:21 PM, McGehee, Robert wrote:
Hi,
I've created two tables, "Big" and "Small", that both store the same 10 million rows of data, using 493MB and 487MB of disk space respectively. The difference is that the "Big" table uses data types that take up more space (integer rather than smallint, float rather than real, etc.). The "Big" table should need about 27 bytes/row versus 16 bytes/row for the "Small" table, which suggests to me that the "Big" table should be about 70% bigger on disk. In reality, it's only 1% bigger, or 6MB (after clustering, vacuuming and analyzing). Why is this? Shouldn't the "Small" table be about 110MB smaller (11 bytes for each of 10 million rows)? I'm estimating table size with \d+.

Thanks, Robert

          Table "Big"
   Column  |       Type       | Bytes
----------+------------------+-----------
  rmid     | integer          | 4
  date     | date             | 4
  rmfactor | text             | 7 (about 3 characters/cell)
  id       | integer          | 4
  value    | double precision | 8
---------------------------------
  Total Bytes/Row               27
  Rows                          10M
  Actual Size                   493MB


     Table "Small"
  Column |   Type   | Bytes
--------+----------+-----------
  rmid   | smallint | 2
  date   | date     | 4
  rmfid  | smallint | 2 (rmfid is a smallint index into the rmfactor table)
  id     | integer  | 4
  value  | real     | 4
---------------------------------
  Total Bytes/Row     16
  Rows                10M
  Actual Size         487MB
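
The arithmetic behind the expected difference can be checked with a short sketch. It uses only the per-column byte counts and sizes quoted in the two tables above; the variable names are illustrative, and it deliberately ignores any per-row or per-page overhead.

```python
# Naive estimate: sum of column widths times row count, no overhead.
ROWS = 10_000_000

big_row = 4 + 4 + 7 + 4 + 8     # rmid, date, rmfactor, id, value
small_row = 2 + 4 + 2 + 4 + 4   # rmid, date, rmfid, id, value

# Expected saving if only column widths mattered (decimal MB).
expected_diff_mb = (big_row - small_row) * ROWS / 1e6
# Sizes actually reported for the two tables.
observed_diff_mb = 493 - 487

print(big_row, small_row)                   # bytes per row
print(round(big_row / small_row, 2))        # size ratio, Big to Small
print(expected_diff_mb, observed_diff_mb)   # MB expected vs. MB observed
```

The gap between the expected and the observed difference is what the rest of the thread is about.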


More questions than answers:

What version of PostgreSQL?

How are you determining the space used by a table?

Why are you assuming 7 bytes for a 3-character value? (Character values up to 126 bytes long have only 1 byte of overhead.)

What is the fill-factor on the tables? (Should default to 100% but don't know how you are configured.)

Do the tables have OIDs or not?

Other considerations are that rows don't split across pages, so there is a bit of waste per page. Also, there could be compression considerations, though I'm not sure that small rows like this will be compressed.
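
To put rough numbers on the per-row and per-page overhead, here is a minimal sketch of a heap-size model. Everything in it is an assumption rather than something stated in this thread: a 64-bit build, the default 8 kB page, a 24-byte page header, a 4-byte line pointer and a 24-byte row header per row, no OIDs, no NULLs, fillfactor 100, and rmfactor always holding exactly 3 single-byte characters (stored as 3 bytes plus a 1-byte length header). The function names are made up for illustration.

```python
# Rough heap-size model. All constants are ASSUMPTIONS about a typical
# 64-bit PostgreSQL build with default settings, not facts from the thread.
PAGE_SIZE = 8192     # default block size in bytes
PAGE_HEADER = 24     # fixed header at the start of each page
LINE_POINTER = 4     # per-row item pointer stored in the page
TUPLE_HEADER = 24    # 23-byte row header padded to an 8-byte boundary
MAXALIGN = 8         # whole rows are padded to this boundary


def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def row_bytes(columns):
    """columns: (size, alignment) pairs in table order; returns bytes per row."""
    offset = 0
    for size, alignment in columns:
        offset = align_up(offset, alignment) + size
    return align_up(TUPLE_HEADER + offset, MAXALIGN) + LINE_POINTER


def table_mib(columns, rows):
    """Estimated heap size in MiB, with rows never split across pages."""
    per_page = (PAGE_SIZE - PAGE_HEADER) // row_bytes(columns)
    pages = -(-rows // per_page)  # ceiling division
    return pages * PAGE_SIZE / 2**20


# (size, alignment) per column, in the order shown in the tables above.
big = [(4, 4), (4, 4), (4, 1), (4, 4), (8, 8)]    # integer, date, text, integer, double precision
small = [(2, 2), (4, 4), (2, 2), (4, 4), (4, 4)]  # smallint, date, smallint, integer, real

for name, cols in (("Big", big), ("Small", small)):
    print(name, row_bytes(cols), "bytes/row,", round(table_mib(cols, 10_000_000)), "MiB")
```

Under these assumptions both layouts come out at 52 bytes per row and roughly 498 MiB, because alignment padding and the fixed row header absorb the savings from the narrower types. That is in the same range as the reported 493MB and 487MB, but it is only an estimate; the real answer depends on the questions above.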

Cheers,
Steve


