Le 5 décembre 2011 10:04, Albe Laurenz <laurenz.albe@xxxxxxxxxx> a écrit : > Mike Christensen wrote: >> I have a database full of recipes, one recipe per row. I need to >> store a bunch of arbitrary "flags" for each recipe to mark various >> properties such as Gluton-Free, No meat, No Red Meat, No Pork, No >> Animals, Quick, Easy, Low Fat, Low Sugar, Low Calorie, Low Sodium and >> Low Carb. Users need to be able to search for recipes that contain >> one or more of those flags by checking checkboxes in the UI. >> >> I'm searching for the best way to store these properties in the >> Recipes table. My ideas so far: >> >> 1. Have a separate column for each property and create an index on >> each of those columns. I may have upwards of about 20 of these >> properties, so I'm wondering if there's any drawbacks with creating a >> whole bunch of BOOL columns on a single table. >> 2. Use a bitmask for all properties and store the whole thing in one >> numeric column that contains the appropriate number of bits. Create a >> separate index on each bit so searches will be fast. >> 3. Create an ENUM with a value for each tag, then create a column that >> has an ARRAY of that ENUM type. I believe an ANY clause on an array >> column can use an INDEX, but have never done this. >> 4. Create a separate table that has a one-to-many mapping of recipes >> to tags. Each tag would be a row in this table. The table would >> contain a link to the recipe, and an ENUM value for which tag is "on" >> for that recipe. When querying, I'd have to do a nested SELECT to >> filter out recipes that didn't contain at least one of these tags. I >> think this is the more "normal" way of doing this, but it does make >> certain queries more complicated - If I want to query for 100 recipes >> and also display all their tags, I'd have to use an INNER JOIN and >> consolidate the rows, or use a nested SELECT and aggregate on the fly. >> >> Write performance is not too big of an issue here since recipes are >> added by a backend process, and search speed is critical (there might >> be a few hundred thousand recipes eventually). I doubt I will add new >> tags all that often, but I want it to be at least possible to do >> without major headaches. > > I would use a boolean column per property and a partial index on the > ones > where the property is selective, i.e. only a small percentage of all > recipes > match the property. I would like to recommend to have a look at Bloom Filtering: http://www.sai.msu.su/~megera/wiki/bloom -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general