On 30/05/11 20:53, Nick Raj wrote: > Hi, > > Cube code provided by postgres contrib folder. It uses the NDBOX structure. > On creating index, it's size increase at a high rate. Here's what I get on 8.4 with a cleaned up test case. It uses the original poster's data ( output_vehicle.sql ) and just automates adding rows in powers of 2 and checking the table size. (Quoted because it is the only way to stop Thunderbird wrapping text in plain text editing mode - argh!) > nrows | tablesize | totalsize | indexsize | bpr_row | bpr_total | bpr_index > -------+-----------+-----------+-----------+---------+-----------+----------- > 0 | 0 | 8192 | 8192 | | | > 2 | 8192 | 16384 | 8192 | 4096 | 8192 | 4096 > 4 | 8192 | 16384 | 8192 | 2048 | 4096 | 2048 > 8 | 8192 | 16384 | 8192 | 1024 | 2048 | 1024 > 16 | 8192 | 16384 | 8192 | 512 | 1024 | 512 > 32 | 8192 | 16384 | 8192 | 256 | 512 | 256 > 64 | 8192 | 16384 | 8192 | 128 | 256 | 128 > 128 | 16384 | 65536 | 49152 | 128 | 512 | 384 > 256 | 24576 | 122880 | 98304 | 96 | 480 | 384 > 512 | 49152 | 229376 | 180224 | 96 | 448 | 352 > 1024 | 90112 | 327680 | 237568 | 88 | 320 | 232 > 2048 | 180224 | 1376256 | 1196032 | 88 | 672 | 584 > 4096 | 352256 | 2228224 | 1875968 | 86 | 544 | 458 > 8192 | 696320 | 3751936 | 3055616 | 85 | 458 | 373 > 16384 | 1384448 | 13254656 | 11870208 | 84 | 809 | 724 > 19875 | 1679360 | 15466496 | 13787136 | 84 | 778 | 693 > (16 rows) As expected, the bytes cost per table row sans index (bpr_row) tends down slowly toward a stable value. The index size per row (bpr_index) is all over the place, but seems consistently pretty huge. At 19875 rows the index is 90% of the total size, or over 8 times the size of the table its self. While I realize that a general-purpose index for multi-dimensional structures like cubes may not be hugely efficient, is this kind of huge storage consumption expected? The memory hit will be so bad that it's likely to be faster to use full table scans. Run the attached test as: psql -f test.sql -q regress where "regress" is the name of the database to do the work in. The original poster's "output_vehicle.sql" must be in the same directory. -- Craig Ringer
-- You must inport the `cube' contrib module to use this code -- DROP TABLE IF EXISTS cubtest; -- CREATE TABLE cubtest (c cube); -- COMMENT ON TABLE cubtest IS 'Holds raw cube data import; lacks indexes'; -- \i output_vehicle.sql CREATE LANGUAGE 'plpgsql'; DROP TABLE IF EXISTS cubtest2, sizes CASCADE; CREATE TABLE cubtest2 (c cube); CREATE INDEX t on cubtest2 using gist(c); COMMENT ON TABLE cubtest2 IS 'Indexed table that is progressively added to during testing'; CREATE TABLE sizes (nrows INTEGER, tablesize INTEGER, totalsize INTEGER, indexsize INTEGER, bpr_row float, bpr_total float, bpr_index float); CREATE OR REPLACE FUNCTION get_sizes(OUT nrows INTEGER, OUT tablesize_rows INTEGER, OUT tablesize_total INTEGER, OUT tablesize_index INTEGER, OUT bytes_per_row float, OUT bytes_per_row_total float, OUT bytes_per_row_index float) AS $$ BEGIN tablesize_rows = pg_relation_size('cubtest2'); tablesize_total = pg_total_relation_size('cubtest2'); tablesize_index = tablesize_total - tablesize_rows; nrows = (select count(*) from cubtest2); IF nrows <> 0 THEN bytes_per_row = tablesize_rows/ nrows; bytes_per_row_total = tablesize_total / nrows; bytes_per_row_index = tablesize_index / nrows; END IF; END; $$ LANGUAGE 'plpgsql'; CREATE OR REPLACE FUNCTION size_test() RETURNS SETOF sizes AS $$ DECLARE cubtest_rows INTEGER; inserted_rows INTEGER; o INTEGER; BEGIN INSERT INTO sizes SELECT * FROM get_sizes(); cubtest_rows = (SELECT count(*) FROM cubtest); INSERT INTO cubtest2 SELECT * FROM cubtest OFFSET 0 LIMIT 1; cubtest_rows = (SELECT count(*) FROM cubtest); inserted_rows = 1; WHILE inserted_rows < cubtest_rows LOOP INSERT INTO cubtest2 SELECT * FROM cubtest OFFSET inserted_rows LIMIT inserted_rows; INSERT INTO sizes SELECT * FROM get_sizes(); inserted_rows = inserted_rows * 2; END LOOP; RETURN QUERY SELECT * FROM sizes; END; $$ LANGUAGE 'plpgsql'; SELECT * from size_test();
-- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general