Search Postgresql Archives

strange table disk sizes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




Hello,

I have two large tables in a database, one containing original data and the other one derived from the first table. The first table contains several columns and indexes, while the second table has less columns and only one index. Both tables have the same number of rows. Nevertheless, the second table is much larger in disk size than the first one. How can this be explained?

The tables are defined below. The first table 'measurement' contains information about measurements by different devices on different times. The second table 'stats_count' is created in order to be able to quickly query the number of measurements for a particular device in an arbitrary time interval. To accomplish this the second table contains for each measurement a cumulative count of the number of measurements up until that time. The table is kept synchronised with the first table with triggers.

The first table contains 4 integer columns, one enum column and one time column, which would normally take 28 bytes for each row. The table contains about 50 million rows. This means about 1.5 gigabyte of data. The actual disk size of this table (without indexes) is 3 gigabyte, which is propably explainable because of some overhead. The disk size with indexes is about 9 gigabyte.

The second table on the other hand only contains 3 integer columns and one time column. Nevertheless the size on disk of this table is about 14 gigabyte without indexes. There is one index on this table which uses an extra 4 gigabyte. Can anyone explain this?

One possible reason is that the primary key on table measurement is an integer, while the primary key on table stats_count is a combination of two integers and a time. Logically it requires more disk space to store the second key, however the same two integers and time also form an index in the first table, so should take the same amount of disk space on the first table. Or does a primary key take more disk space than another index or unique constraint? Would it therefore be more efficient (=less disk space) to use an extra id column in the table stats_count as primary key and the combination (deviceid, type, time) as a regular index? And if so, will a query be still as fast (I can imagine that if an index uses less disk space, it will not be as fast).

Another possible reason could be the foreign key that is present in the table stats_count that references the combination (deviceid, type, time) in table measurements. Does this foreign key take that much disk space?

Any help would be greatly appreciated,
Kind regards,
Rik


CREATE TABLE measurement
(
  id serial NOT NULL,
  deviceid integer NOT NULL,
  "time" timestamp without time zone NOT NULL,
  javaobjectid integer NOT NULL,
  javaobjectsubid integer NOT NULL,
  "type" measurementtype NOT NULL,
  CONSTRAINT measurement_pkey PRIMARY KEY (id),
  CONSTRAINT measurement_deviceid_fkey FOREIGN KEY (deviceid)
      REFERENCES device (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT measurement_javaobject_unique UNIQUE (javaobjectid, javaobjectsubid),
  CONSTRAINT measurement_unique UNIQUE (type, deviceid, "time")
)

CREATE INDEX measurement_javaobjectsubid_index
  ON measurement
  USING btree
  (javaobjectsubid);

CREATE TABLE stats_count
(
  deviceid integer NOT NULL,
  "time" timestamp without time zone NOT NULL,
  "type" measurementtype NOT NULL,
  cum_count integer,
  CONSTRAINT stats_count_pkey PRIMARY KEY (type, deviceid, "time"),
  CONSTRAINT stats_count_type_fkey FOREIGN KEY ("type", deviceid, "time")
      REFERENCES measurement ("type", deviceid, "time") MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION
)

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux