On 2011-01-16, Randall Smith <randall@xxxxxx> wrote:
> Before reading: this is solved. It was an error on my part.
>
> On Sun, 2011-01-16 at 03:46 +0000, Jasen Betts wrote:
>> In plpgsql, IF is an implicit SELECT.
>> <http://www.postgresql.org/docs/8.4/interactive/plpgsql-expressions.html>
>>
>>   IF EXISTS (SELECT 1 FROM t1 WHERE
>>              volume_id = NEW.volume_id AND name = NEW.name) THEN
>>     RAISE EXCEPTION '% already exists on volume', NEW.name;
>>   END IF;
>
> Thanks. Certainly more concise.
>
>> > 0.053 ms/record / 1,000 ms/sec * 10,000 records = .53 seconds
>>
>> huh?
>>
>> > According to that stat, this lookup should be adding about 0.5 seconds
>> > to 10,000 records,
>>
>> why? what are you doing to 10,000 records?
>
> Inserting them. Sorry, that was a critical omission on my part. The
> trigger check is for inserts and I'm testing its performance by
> inserting 10,000 records.

That'll have n*log(n) complexity (or worse). You can't simply multiply
by 10,000, because each row added to the index slows the average index
lookup a little.

> Turns out my EXPLAIN ANALYZE times were so low because I was querying
> for a volume_id that had only a few rows. When I query for a volume_id
> with 10,000 rows, it changes to 7 ms, which matches the performance I'm
> seeing. That's acceptable to me because that's probably at the upper
> end of what I'll see. 7 ms to check 10,000 text fields is actually
> impressive to me.

>> how much disk (in bytes, and dollars) are you hoping to save by not
>> using the index?

> I need to be able to fit my indexes in RAM.

Why? Indices that don't fit in RAM are still useful. Doing 20 or so
record retrievals to confirm the absence of a record is likely to be
slower than doing 5 reads and a write or two to check and update a
disk-based index. Postgres isn't going to blindly load all the indices
into RAM.

> I need to be able to fit my indexes in RAM. This table will have a few
> billion records. These text fields can be up to 1k each.

This query will create about 300,000 sample records with roughly 450
bytes of text in each name. Repeat it with different ranges in the first
generate_series (10001,20000, then 20001,30000, and so on) until you
reach a couple of billion records, then create your indices and do your
testing.

  insert into t1
  select a::integer as volume_id,
         md5((a*b)::text)    || md5((a*b+1)::text)  || md5((a*b+3)::text)  ||
         md5((a*b+4)::text)  || md5((a*b+5)::text)  || md5((a*b+6)::text)  ||
         md5((a*b+7)::text)  || md5((a*b+8)::text)  || md5((a*b+9)::text)  ||
         md5((a*b+11)::text) || md5((a*b+12)::text) || md5((a*b+13)::text) ||
         md5((a*b+14)::text) || md5((a*b+15)::text) as name
  from (select generate_series(1,10000) as a,
               generate_series(1,29) as b) as foo;
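
Once the sample data is loaded, the index build and a timing check could
look something like the sketch below. The index name and the probe values
are only illustrative; t1 is assumed to have the (volume_id, name)
columns used in the trigger check.

  create index t1_volume_name_idx on t1 (volume_id, name);
  analyze t1;

  -- time the same lookup the trigger performs, against a busy volume_id;
  -- a miss still has to walk the index, just like the uniqueness check does
  explain analyze
  select 1
  from t1
  where volume_id = 5000
    and name = 'no-such-name';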
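
On the question of whether the index fits in RAM, a quick size check
after loading (using the same made-up index name as above) would be
something like:

  select pg_size_pretty(pg_relation_size('t1'))                 as table_size,
         pg_size_pretty(pg_relation_size('t1_volume_name_idx')) as index_size;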
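
And for completeness, the IF EXISTS check from earlier in the thread
would sit inside a trigger function roughly like this; it is only a
sketch, and the function and trigger names are made up:

  create or replace function t1_no_dup_name() returns trigger as $$
  begin
    -- IF evaluates its condition as an implicit SELECT, so EXISTS(...) works here
    if exists (select 1 from t1
               where volume_id = NEW.volume_id and name = NEW.name) then
      raise exception '% already exists on volume', NEW.name;
    end if;
    return NEW;
  end;
  $$ language plpgsql;

  create trigger t1_no_dup_name_trg
    before insert on t1
    for each row execute procedure t1_no_dup_name();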