Re: Inheritance vs. LIKE - need advice

William Bug <wb27@xxxxxxxxxx> · Wed, 10 Aug 2005 23:27:53 -0400

Once again, many many thanks Jeff for taking the time to think  
through these issues and provide your well-informed comments & opinions!

On Aug 10, 2005, at 4:09 PM, Jeff Davis wrote:

William Bug wrote:

As you say, both LIKE & INHERIT are a bit anemic and unnecessary,   
since
what they provide can be implemented via VIEWs and RULEs/  
TRIGGERs.  I'd

I'd like to point out that INHERITS is unique,

a point well taken.  I can remember how excited I was when Oracle  
first started providing object extensions (which provides richer  
capabilities than INHERITs currently does).  Of course, that was a  
good decade after PostgreSQL/post-Ingres had added INHERITs (http:// 
en.wikipedia.org/wiki/PostgreSQL).

although I'm not sure all
of the exact differences. The main difference that I see is that  
SELECT
reads from multiple tables, and to make it only read one you do SELECT
... ONLY.

LIKE is purely syntactic sugar. Use it whenever it saves time and/or
reduces confusion.

VIEWs, at least).  Being able to create  MATERIALIZED VIEWs would
probably help to remove any realtime  performance issues  
introduced by
requiring additional JOINs to  harvest this view of the data.  I  
can do
this myself via RULEs/ TRIGGERs or use the system being developed  
as an
extension to Pg -  The matview Project
(http://gborg.postgresql.org/project/matview/ projdisplay.php).

Agreed. Materialized views are a great way to get whatever performance
benefits you need from physical representation without imposing on the
logical layout.

For some reason a good document on the subject is on a server  
that's not
responding right now (hopefully temporarily!). Here's the google  
cache:
<http://66.102.7.104/search? 
q=cache:jaaXngt0hioJ:jonathangardner.net/PostgreSQL/ 
materialized_views/matviews.html+materialized+view+postgresql&hl=en>

Many thanks for this link.  I had grown quite addicted to  
Materialized Views in Oracle, especially when working on OLAP  
applications.  Though I've been a very happy convert to PostgreSQL  
(for about 2 years ago), I've really missed having MATERIALIZED VIEWs  
integrated into the core DDL SQL of the RDBMS.

I actually think the INHERITs & LIKE features of PostgreSQL - in   
their
simplicity - potentially offer a better means to providing  Object
properties to SQL relations, than the more complex, fully  realized
Object-Relational systems, which are inherently better  suited to the
task of providing a seamless persistence mechanism for  OOP code
objects.  I do hope its not true Pg's INHERIT & LIKE  features are
vestigial and will continue to be developed into the  future.  in the

I think I overstated what I meant in my previous email... it should be
more like "it's in a stasis" rather than "on life support". When PG
solves some of the table partitioning issues in future releases,  
you can
bet that those features will help complete the inheritance model.  
After
a while, INHERITS will also be merely a syntax for capabilities  
that are
available otherwise.

meantime, in light of what you say regarding the lack  of active
development and maintenance they are receiving, I'll  probably lay  
off
using them much for now.  As you say, heavy use of  INHERITs given  
the
current support given to this feature is more a  liability than a
convenience at this point.

Not too many people use INHERITS. I think it's fairly independent  
in the
code and probably not too many bugs appear, but if a bug creeps in,  
the
limited testing might not catch it. Maybe a brief scan of the -bugs  
list
might indicate whether inheritance is a source of problems, or benign
syntax that primarily depends on other database features which are
well-tested.

Both are excellent points.  I will certainly check the bug lists  
before getting too committed to using INHERITs, though, as you say,  
hopefully the implementation relies on other components in the system  
getting heavy use (and testing).

I wasn't clear about this in that initial post, but my references to
OOP techniques were just by way of analogy.  I wasn't really  
asking  the
question with a thought toward creating a model to mirror my OOP   
models
- to be simply a persistence mechanism.  There are many  wonderful

It was mainly just a warning that sometimes application algorithms  
tend
to mix with the data model.

I'm a VERY STRONG believer in keeping application requirements out of  
the logical data model - probably too much so sometimes.  In general,  
it has rarely served me wrong, when I've needed to go in and write a  
wholly separate application to the same underlying data.  I really  
appreciate your placing INHERITs in this context.  It would have  
completely slipped by me, though it should have been obvious.

So - the real point I was trying to make is:
    1) INHERIT & OOP inheritance: each provide a relatively   
efficient
means to model real-world objects with inherent parent- child,
hierarchical relations to one another;

True, many real-world entities have the famous "isa" relationship, and
that's valueable to take into account in the data model.

    2) LIKE & Interfaces(Java)/Mixins(Ruby): each provide a means to
share a set of pre-defined attributes/methods amongst a set of class
that otherwise have no inherent hierarchical relation to one another.

You are certainly correct - Interfaces (Java) are essentially a
compile-time enforcement mechanism.  Changes you make to an Interface
after it has been used to generate bytecodes (i.e., compile) for  
a  Java
class will not have any effect on the class definition until you
compile it again.  Interfaces provide a fairly simple - but  
effective  -
means to an end - for Java to provide for "inheriting" aspects from
more than one existing code entity without supporting multiple
inheritance (which Pg actually does support).  Mixins in Ruby, on the
other hand, don't really get added to a class until runtime (though
they are used at compile time to resolve function & variable calls).
If you add to a Mixin, you could actually use that new feature next
time you use a class whose definition file included that Mixin.  This
is largely due to the fact that Ruby is interpreted at runtime and  
is  a
very loosely typed language.

Definitely a tangent, but I think most people would consider Ruby
strongly typed. Consider:

$ ruby -e 'puts 1+"1"'
-e:1:in `+': String can't be coerced into Fixnum (TypeError)
        from -e:1
$ perl -e 'print 1+"1","\n";'
2

However, ruby is, like python, late-binding. That means a variable can
take on a value of any type, but it gets the type with the value. I
break it down kind of like this: Ruby and Python are strongly typed  
and
late binding; perl and PHP are weakly typed and late binding; C is
weakly typed and early binding; Java is strongly typed and early  
binding
(that isn't entirely true... dynamic binding is sort of like late
binding). That's not official or anything, just my opinion of the
languages that I use.

I really like your breakdown here.  You are absolutely right.  I most  
definitely meant "late binding", not "loosely typed".  As a very  
heavy user of Ruby - having received many a compiler error just like  
the one you site - I should have remembered that distinction.  In  
fact, it's Ruby's class typing that has made it such a powerful tool  
for the work I do.  It has also enabled me to take scripts I quickly  
work up in Ruby and port them straight to Java (where the bulk of my  
production code lives) with relative ease.

I don't think I was very clear about this, but I was asking these
questions about Pg's schema reusability mechanisms solely from the
vantage of how it might save me time in writing and maintaining  
the  DDL
SQL for my core relational model.  I was also interested in how  the
INHERITANCE mechanism might make certain SQL operations easier to
implement in the case of where you have real-world objects you are
modeling which possess an inherent hierarchical relation to one
another.  Finally, I wanted to know whether my thinking regarding   
when
to use LIKE over INHERIT - and visa versa - was correct given  their
current implementation in Pg.

I think you have the right idea about LIKE and INHERITS. I thought  
your
comment about how you would be unlikely to want to query "curatable
entities" was an insightful way of looking at it.

I think overall, the most important thing is flexibility. I always  
think
to myself: how much work will it take if I wanted to change, or add
applications accessing the same data set?

If your applications have "SELECT ... ONLY" in them, what effect will
that have on your ability to change the physical layout (I actually
don't know the answer to that. If you create a view on some relations,
whether they're parent relations or not, and you "SELECT ... ONLY" on
the view, is the ONLY ignored? Can views resemble a part of an
inheritance hierarchy?)?

All excellent, subtle questions that will greatly effect the  
implementation flexibility INHERITs provides.

I would recommend that if you use INHERITS, always have a view in
between so that your application aren't tied to the physical layout in
any way. Because a view is not treated like a physical table in
inheritance (you can't INHERIT a view), it's too closely tied to the
physical layout.

I'm not certain I understand what you mean here?  Are you  
recommending all application layer interaction with tables using  
INHERIT should be done via a VIEW intermediary?  If so, wouldn't the  
VIEW (built from a "SELECT ... ONLY...") then be as dependent on the  
fixed structure determined by the INHERITs relationship, as much as  
the application code would be?

When I say "LIKE" offers some efficiencies, I mean just what I say
above.  For instance, with the example I gave previously, if I have a
set of fields - curation_info (creation_curator TEXT, creation_date
TIMESTAMP, mod_curator TEXT, mod_date TIMESTAMP) - I want represented
in all 100 tables in my model, I save a great deal of SQL writing by
creating a curation_info table by including it in these tables via
LIKE.  The fact the LIKE included table is in no way linked to the
tables that include it after execution of the initial CREATE TABLE
statement makes this a bit less useful than it might otherwise  
be.   If
a change is made to the fields in the underlying 'curation_info'   
table,
you are correct, I can always use ALTER TABLE to update the  100  
tables
that included the fields from 'curation_info' later.   Since I  
can't use
LIKE in a ALTER TABLE statement, however, I'd have  to specify each
field to be changed in the ALTER statement for each  of the 100  
tables.
This forces me to write a lot more SQL than I'd  like to write.  Of
course, to ask this of the "LIKE" function is to  ask for more  
than you
get from a Java Interface and maybe it would  add unwanted
inefficiencies to the query planner/execution framework.

LIKE actually fills in the fields at table creation time exactly as if
you wrote the field names out yourself, so I don't follow the
planner/execution comment. But I see what you're saying about adding
attributes if you have a lot of relations involved. Originally I was
unsure whether you meant "performance efficiency", but it's  
apparent you
mean "efficiency of DBA's time".

Absolutely, yes.  Since I need to wear several hats - DBA, database  
programmer, Java GUI App architect/implementer - anything I can do to  
maximize my efficiency in any one role helps to ensure I have at  
least a little of my weekend time to myself.  :-)

Thanks very much for your insight on how the recent implementation of
TABLESPACEs might lay the groundwork for adding new efficiencies  
to  the
INDEXing and inherited CONSTRAINT implementation for INHERITed   
tables.
This would be a big plus.

Actually, I was talking about table partitioning, which is different
than a tablespace. A table space is a named storage "area" that you  
can
assign whole tables to (already implemented in 8.0). If you  
partition a
table, you break a table into pieces and those pieces can go to a
different place in physical storage. Partitioning is closely  
related to
INHERITS, because if they can implement constraints or keys across  
parts
of a table in different locations, they can do it for INHERITS also.

Sorry - you are right, of course.  I'm used to the Oracle  
implementation of TABLESPACEs, where these two separate issues are  
somewhat convolved together.

Regards,
    Jeff Davis

---------------------------(end of  
broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend