I've configured 2 table like this
CREATE TABLE "public"."User_Statement_Pivot" (
"Email" varchar(50),
"UserId" varchar(50),
"ShortId" varchar(50),
"LastDirectJobMailSentDateTime" int8,
"What" varchar(4096),
"Where" varchar(4096)
)
WITH (OIDS=FALSE)
;
ALTER TABLE "public"."User_Statement_Pivot" OWNER TO "postgres";
CREATE INDEX "IX_btree_usp_userIdShortIdEmailLastDJMSDT" ON "public"."User_Statement_Pivot" USING btree ("UserId", "ShortId", "Email", "LastDirectJobMailSentDateTime");
CREATE INDEX "ix_fulltext_usp_what" ON "public"."User_Statement_Pivot" ("to_tsvector('italian'::regconfig, ""What""::text)", "to_tsvector('italian'::regconfig, ""What""::text)");
CREATE INDEX "ix_fulltext_usp_what_en" ON "public"."User_Statement_Pivot" ("to_tsvector('english'::regconfig, ""What""::text)", "to_tsvector('english'::regconfig, ""What""::text)");
CREATE INDEX "ix_fulltext_usp_where" ON "public"."User_Statement_Pivot" ("to_tsvector('italian'::regconfig, ""Where""::text)", "to_tsvector('italian'::regconfig, ""Where""::text)");
CREATE INDEX "ix_usp_what" ON "public"."User_Statement_Pivot" USING btree ("What");
CREATE TABLE "public"."User_Statement_Pivot_2" (
"Email" varchar(50),
"UserId" varchar(50),
"ShortId" varchar(50),
"LastDirectJobMailSentDateTime" int8,
"Where" varchar(4096),
"tsv" tsvector
)
WITH (OIDS=FALSE)
;
ALTER TABLE "public"."User_Statement_Pivot_2" OWNER TO "postgres";
CREATE INDEX "IX_btree_usp2_userIdShortIdEmailLastDJMSDT" ON "public"."User_Statement_Pivot_2" USING btree ("UserId", "ShortId", "Email", "LastDirectJobMailSentDateTime");
CREATE INDEX "textsearch_tsv" ON "public"."User_Statement_Pivot_2" ("tsv");
CREATE TRIGGER "tsvectorupdate" BEFORE INSERT OR UPDATE ON "public"."User_Statement_Pivot_2"
FOR EACH ROW
EXECUTE PROCEDURE "tsvector_update_trigger"('tsv', 'pg_catalog.italian', 'What');
Column "What" (table User_Statement_Pivot is just a single word or max 2 words separeted by space " " (ex: programmatore .NET), and tsv (table User_Statement_Pivot_2) is populate by materializing column with a ts_vector of "What".
Now if i perform those 2 queries
SELECT * FROM "User_Statement_Pivot_2"
where tsv @@ to_tsquery('italian','programmatore|analista')
SELECT * FROM "User_Statement_Pivot"
where to_tsvector('italian', tsv) @@ to_tsquery('italian','programmatore|analista')
Records on Tables (are same) like 8 milion.
Execution time of 1st query is 2 seconds (result set like 13.027)
Execution time of 2st query is 3 seconds (result set like 13.027) same records
Those are query analize
Bitmap Heap Scan on "User_Statement_Pivot" (cost=1025.27..109801.47 rows=76463 width=88) (actual time=3.186..12.608 rows=13027 loops=1)
Recheck Cond: (to_tsvector('italian'::regconfig, ("What")::text) @@ '''programm'' | ''anal'''::tsquery)
-> Bitmap Index Scan on ix_fulltext_usp_what (cost=0.00..1006.16 rows=76463 width=0) (actual time=2.315..2.315 rows=13027 loops=1)
Index Cond: (to_tsvector('italian'::regconfig, ("What")::text) @@ '''programm'' | ''anal'''::tsquery)
Total runtime: 12.972 ms
Bitmap Heap Scan on "User_Statement_Pivot_2" (cost=205.46..43876.92 rows=15068 width=102) (actual time=3.135..18.141 rows=13027 loops=1)
Recheck Cond: (tsv @@ '''programm'' | ''anal'''::tsquery)
-> Bitmap Index Scan on textsearch_tsv (cost=0.00..201.69 rows=15068 width=0) (actual time=2.254..2.254 rows=13027 loops=1)
Index Cond: (tsv @@ '''programm'' | ''anal'''::tsquery)
Total runtime: 18.502 ms
Configuration
PostgreSQL 9.0.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
If i increase words number in to_tsquery in OR condition those 2 queries are more different (exponentially). I don't understand why a materialized column is more slow than a calculeted one.
Luigi Saggese
Analyst Developer
Work: +39 328 75 16 236
Email: luigisag@xxxxxxxxx
IM: luigisaggese (Skype)
http://it.linkedin.com/in/luigisaggese
Email: luigisag@xxxxxxxxx
IM: luigisaggese (Skype)
http://it.linkedin.com/in/luigisaggese