Search Postgresql Archives

Re: Research and EAV models

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Peter,

I agree 100% with you. EAV can be a good "middle of the road" appoach as you suggest.

Peter Hunsberger wrote:
My take on this, for the research world, is to not go pure EAV, but
rather normalize by some more generic concepts within the domain.  Eg.
"measurement", or "evaluation", etc. You might ultimately end up with
a sort of EAV model, but the "V" portion is strongly typed within the
database and you're not trying to cast a string into 20 conventional
data types. This still requires rigorous metadata management on the EA
side of the EAV model, but you can tackle that in many ways.

SQL isn't the be-all and end-all of data storage.  It does relational
stuff well, and other stuff poorly.

You can build variations on EAV that are closer to a regular
relational schema.  These don't necessarily work well or poorly but
often, at least in the research world, the middle ground is good
enough.  You are after all, talking about people who spit out MySQL
databases at the drop of a hat....

I use a very similar approach in managing meta-data, normalize the data that can be normalized and use EAV for the rest. Potentially eliminating as much as possible text search, however in some scenarios it might be necessary but an additional where on some normalized columns can help a lot with performance.

One of my application meta-data frameworks uses only two tables to store all meta-data about an application and have basically the following structure:

CREATE TABLE controls (
ctrl_no SERIAL PRIMARY KEY NOT NULL,
app_id varchar(30) NOT NULL,
ctrl_type varchar(30) NOT NULL,
ctrl_id varchar(30) NOT NULL,
ctrl_property text, -- This can be also hstore to add some intelligence
CONSTRAINT controls_unique UNIQUE (app_id, ctrl_type, ctrl_id));

CREATE TABLE members (
ctrlmember_no SERIAL PRIMARY KEY NOT NULL,
ctrl_no INTEGER NOT NULL REFERENCES controls(ctrl_no),
member_no INTEGER NOT NULL REFERENCES controls(ctrl_no),
member_type varchar(30) NOT NULL,
member_property text, -- This can be a hstore to add more intelligence
CONSTRAINT member_unique UNIQUE (ctrl_no, member_no));

ctrl_property is used to store meta-data based on ctrl_type.

member_property stores meta-data based on member_type and/or overriding default ctrl_property values based on the parent ctrl_no it is associated with.

Without this approach I would have to create more than 100 tables if I want to fully normalize this meta-data.

Many people have indicated that I am actually duplicating many of the catalog features of a relational database, and I agree. However, it allows me to port this meta-data onto any user specified RDBMS without having to worry about database specifics.

With two [recursive] queries on the above two tables I can answer most of the fundamental questions regarding the meta-data:

1. Show me the members (parent -> child) of a specific feature.
2. Show me the owners (child -> parent) of a specific feature.

Extending the above, it allows for an easy development plan for writing a generic application framework that not only manages the meta-data, but also allows the same framework to run/create a user interface on the fly by a couple of nuances on the above two queries and using the EAV (%_property) columns to supply all the default properties and behaviour of the application.

Changing the behavior of an application becomes primarily a database management issue, a lot less application upgrade management in a distributed environment.

To come back to the original message. Yes there are a place for EAV, not only in Research but even in Business data. I have a Environmental software scenario, and EAV on the business data provide me the edge against my competitors. Lot less time needed to implement new features compared to doing the normal functional decomposition and system development life cycle.

In conclusion, I include a extract from an article by Dan Appleton (Datamation, 1983) that my approach is based on:

“The nature of end-user software development and maintenance will change radically over the next five years simply because 500 000 programmers will not be able to rewrite $400 billion of existing software (which is hostage to a seven- to 10-year life cycle). They'll be further burdened by those new applications in the known backlog, as well as by those applications in the hidden backlog. To solve the problem, dp (data processing) shops must improve productivity in generating end-user software and provide end-users with the means of generating their own software without creating anarchy... The answer to this is a data-driven (meta-data) prototyping approach, and companies that do not move smoothly in this direction will either drown in their own information pollution or loose millions on systems that are late, cost too much, and atrophy too quickly.”

Regards,

Johan Nel
Pretoria, South Africa.

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux