On Sat, Oct 11, 2008 at 1:10 PM, Matthew Wilson <matt@xxxxxxxxxx> wrote: > I need to track employees and their preferred locations, shifts, and > stations. > > For example, I need to track that Alice prefers to work the morning > shift at the west-side location, and she likes to work the cash-register > station. > > Also, I need to track that Bob likes the west-side and north-side > locations, likes the night shift, and likes the dishwasher station. Note > the one-to-many relationship between Bob and his preferred locations. I > need to support that possibility. > > So, I see two ways to make my tables, and I'd like some advice. > > FIRST METHOD: > > create table preferred_location ( > employee_id int references employee (id), > location_id int references location (id)); > > create table preferred_shift ( > employee_id int references employee (id), > shift int references shift (id)); > > create table preferred_station ( > employee_id int references employee (id), > station_id int references station (id)); > > SECOND METHOD: > > create table preferences ( > > employee_id int references employee (id), > other_table_name text, / > other_table_id int)); > > In the second method, I'd store tuples like this in the preferences > table: > > (<Alice's ID>, 'location', <west-side location ID>), > (<Alice's ID>, 'shift', <morning shift ID>) > (<Alice's ID>, 'station', <cash register station ID>) > > The nice thing about the second approach is I can extend this to store > all sorts of preferences as I dream them up. But on the downside, I > don't have any FK constraints. > > I suspect this is a pretty common dilemma. Any commentary from the > experts on this list is welcome. > > Thanks in advance! > > Matt I'm certainly not an expert, but hopefully my commentary will still be somewhat helpful. Your "method 2" is something called an Entity-Attribute-Value table design[1]. There was a discussion on this list a couple weeks ago about the merits and drawbacks of designing your tables this way. Honestly, it probably depends on what your ultimate needs are. As the "Downsides" section of the Wiki link [1] shows, most of the problems with EAV really start to emerge when the tables get huge and you're dealing with hundreds of thousands to millions of entities, each with potentially hundreds of attribute-value pairs. If you're intending to roll out your application for every Starbucks on your continent, that might start to be a problem. (From my experience, implementations like this over large data sets suffer a big performance hit and carry a lot of data integrity baggage.) If you're talking about something for use in your chain of three internet cafes around one town, and you aren't going to have more than a dozen Attributes per Entity, it probably doesn't matter, because the complications will be more manageable without screwing something up. That said, by going the EAV/"Method-2" route, you're gaining flexibility, but at the cost of increased complication, and ultimately repurposing a relational database to do something that isn't very database-like, that's really more like a spreadsheet. (So why not just use a spreadsheet?) You have little room for recording additional information, like ordering preferences, or indicating that (say) a station preference depends on a location preference, or that a shift time depends on day of the week, etc -- so you're probably not getting as much flexibility as you think. Sure, you could add an "Extra_Data" column, so you have rows: Marie-Location-West-1, Marie-Location-East-2, Marie-Shift-Evening-Tuesday, Marie-Station-Register-West, Marie-Shift-Morning-Sunday, etc. But you can see the data integrity nightmare already, when you somehow manage to record "Marie-Shift-Register-1". Not to mention that you'll have to do format conversions for that "Extra_Data" field, and incorporate logic somewhere else in your program that deciphers whatever's in the generic data field to come up with ordering preferences for locations, station preferences by shift times, or whatever else you want to store. Essentially, in my humble opinion, you're putting off the problem of thinking about the nature and structure of your data, and most importantly, what you're going to use that data for, when ultimately those are the decisions that should be guiding how you design and use the database. Particularly given that this sounds like a management-efficiency project rather than one that your business (I'm assuming it's your business) hinges upon, I am imagining that you have time to be sure about exactly what you want to do with the data. If you expect that your business needs will change dramatically over the lifetime of the product, or you don't have time to make these decisions now, then maybe the flexibility outweighs the drawbacks. Good luck! [1] See e.g. http://en.wikipedia.org/wiki/Entity-Attribute-Value_model -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general