2013/3/13 Ian Lawrence Barwick <barwick@xxxxxxxxx>: > 2013/3/12 Gauthier, Dave <dave.gauthier@xxxxxxxxx>: >> Hi: >> >> v9.0.1 on linux. >> >> I have a table with a column that is a csv. Users will select records based >> upon the existence of an element of the csv. There is an index on that >> column but I'm thinking that it won't be of much use in this situation. Is >> there a way to facilitate these queries? >> >> Example: >> >> create table foo (col0 text, col1 text); >> >> create index foo_col1 on foo (col1); >> >> insert into foo (col0,col1) values >> ('moe','aa,bbb,c'),('larry','xxxxx,bbb,yyy'),('curly','m,nnnn,oo'); >> >> now... >> >> select col0 from foo where <the csv element 'bbb' exists as a csv element of >> col1> >> >> >> Some attempts, which get the right answers, but which probably won't be very >> efficient... >> >> select col0 from foo where string_to_array('bbb','') <@ >> string_to_array(col1); >> >> select col0 from foo where ','||col1||',' like '%,bbb,%'; >> >> select col0 from foo where ((col1 like 'bbb,%') or (col1 like '%,bbb,%') or >> (col1 like '%,bbb')); >> >> Long shot, but I thought I'd ask anyway. > > A GIN index might do the trick: > > CREATE INDEX ix_col1_ix ON foo USING GIN(string_to_array(col1,',')); > > (This is assuming the CSV values can be cleanly converted to > an array using "string_to_array()"). > > You could then query it with: > SELECT col0 FROM foo WHERE string_to_array(col1,',') @> '{bbb}'::text[]; Just out of interest, I populated the table with around 1,000,000 rows of randomly generated data (three items of random upper case characters in col1), results with and without index below (using an untuned 9.2 installation on a laptop with a slow hard drive). Note that adding the index doubled the total table size, which might be something to watch out for if the table is very big and you have a lot of unique values in the "CSV" column. Regards Ian Barwick testdb=# SELECT * from foo where string_to_array(col1,',') @> '{PKRY}'::text[]; col0 | col1 --------+------------------- ARWC | JIJ,MBDVU,PKRY FUNWOA | JKEK,PKRY,MQFUQTJ PJGTD | KSO,HSTB,PKRY (3 rows) Time: 1325.536 ms testdb=# explain SELECT * from foo where string_to_array(col1,',') @> '{PKRY}'::text[]; QUERY PLAN ------------------------------------------------------------------ Seq Scan on foo (cost=0.00..28400.42 rows=5021 width=76) Filter: (string_to_array(col1, ','::text) @> '{PKRY}'::text[]) (2 rows) testdb=# CREATE INDEX ix_col1_ix ON foo using gin(string_to_array(col1,',')); CREATE INDEX Time: 170533.158 ms testdb=# ANALYZE foo; ANALYZE Time: 1431.665 ms testdb=# SELECT * from foo where string_to_array(col1,',') @> '{PKRY}'::text[]; col0 | col1 --------+------------------- ARWC | JIJ,MBDVU,PKRY FUNWOA | JKEK,PKRY,MQFUQTJ PJGTD | KSO,HSTB,PKRY (3 rows) Time: 0.906 ms testdb=# explain SELECT * from foo where string_to_array(col1,',') @> '{PKRY}'::text[]; QUERY PLAN ---------------------------------------------------------------------------- Bitmap Heap Scan on foo (cost=20.79..389.58 rows=101 width=24) Recheck Cond: (string_to_array(col1, ','::text) @> '{PKRY}'::text[]) -> Bitmap Index Scan on ix_col1_ix (cost=0.00..20.76 rows=101 width=0) Index Cond: (string_to_array(col1, ','::text) @> '{PKRY}'::text[]) (4 rows) Time: 0.377 ms -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general