Other option is use an array of int2 instead of bit(256). It can be indexed. 2011/9/14, Radosław Smogura <rsmogura@xxxxxxxxxxxxxxx>: > On Wed, 14 Sep 2011 12:00:35 +0200, Antonio Vieiro wrote: >> Hi all, >> >> One of my entities 'E' may be 'tagged' with an arbitrary set of 256 >> tags 'T'. >> >> A first approach could be to add a M:N relationship between 'E' and >> 'T'. >> >> A second way to do this could be to add a BIT(256) datatype to 'E', >> setting bits to '1' if the entity is tagged with each one of the 256 >> tags (i.e. using a 'bitmask' on the set of tags). >> >> Since querying entities 'E' with a certain set of tags 'T' must be >> very fast I was wondering if the second approach would be faster. >> What >> do you think? >> >> Thanks for any hints, >> Antonio > > I assume each entity may have one or more different tags. > > Actually performing test like > ... where (t.bits & :mymask) = :mymask > should be quite fast and faster then creating additional relations, but > only if it's highly probable that your query will almost always scan > whole table. > > The advantage of indexes is that the index is used 1st and tail > (slower) parts of query will always get "subset" of table. In bitset, > You will probably scan whole table. > > So I think, you should do some performance test for large number of > data, and compare both ways. I think bitset will be fast for really > small data, but M:N relations may be faster for really large data sets. > > You need to measure size of your database too, in M:N case with 256 > tags it may be quite large. > > -- > Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general > -- ------------ pasman -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general