Search Postgresql Archives

Re: What's the best way in Postgres9 to store a bunch of arbitrary boolean values for a row?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Do not fall too easilly into hstore :-)

while it sometimes good and extremely easy to setup, it has some drawbacks
- querying and searching has some limitations (keys/values not easily indexable, horrible syntax)
- storage not optimised (one hstore field = several dozens of boolean columns)
- only text values, no data type validation

I'd recommend option (4) - normalize!

FR


2011/12/5 Mike Christensen <mike@xxxxxxxxxxxxx>
>> I have a database full of recipes, one recipe per row.  I need to
>> store a bunch of arbitrary "flags" for each recipe to mark various
>> properties such as Gluton-Free, No meat, No Red Meat, No Pork, No
>> Animals, Quick, Easy, Low Fat, Low Sugar, Low Calorie, Low Sodium and
>> Low Carb.  Users need to be able to search for recipes that contain
>> one or more of those flags by checking checkboxes in the UI.
>>
>> I'm searching for the best way to store these properties in the
>> Recipes table.
>
> I'd use hstore to store them as tags. You can then use hstore's GiST index
> support to get quick lookups.
>>
>> 1. Have a separate column for each property and create an index on
>> each of those columns.  I may have upwards of about 20 of these
>> properties, so I'm wondering if there's any drawbacks with creating a
>> whole bunch of BOOL columns on a single table.
>
> It'll get frustrating as you start adding new categories, and will drive you
> insane as soon as you want to let the user define their own categories -
> which you will land up wanting to do in your problem space. I'd avoid it.
>>
>> 2. Use a bitmask for all properties and store the whole thing in one
>> numeric column that contains the appropriate number of bits.  Create a
>> separate index on each bit so searches will be fast.
>
> Same as above, it'll get annoying to manage when you want user tagging.
>>
>> 3. Create an ENUM with a value for each tag, then create a column that
>> has an ARRAY of that ENUM type.  I believe an ANY clause on an array
>> column can use an INDEX, but have never done this.
>
> Same again.
>>
>> 4. Create a separate table that has a one-to-many mapping of recipes
>> to tags.  Each tag would be a row in this table.  The table would
>> contain a link to the recipe, and an ENUM value for which tag is "on"
>> for that recipe.  When querying, I'd have to do a nested SELECT to
>> filter out recipes that didn't contain at least one of these tags.  I
>> think this is the more "normal" way of doing this, but it does make
>> certain queries more complicated - If I want to query for 100 recipes
>> and also display all their tags, I'd have to use an INNER JOIN and
>> consolidate the rows, or use a nested SELECT and aggregate on the fly.
>
> That'll get slow. It'll work and is IMO better than all the other options
> you suggested, but I'd probably favour hstore over it.

The hstore module sounds fantastic!

I'm curious as to how these columns are serialized back through the
driver, such as Npgsql.  Do I get the values as strings, such as a
comma delimited key/value pair list?  Or would I need to do some
custom logic to deserialize them?

Right now, I'm using Npgsql as a driver, and NHibernate/Castle
ActiveRecord as an ORM.

Mike

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux