Search Postgresql Archives

Re: [SQL] (Ab)Using schemas and inheritance

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jorge Godoy wrote:
Em Quarta 24 Maio 2006 11:48, Alban Hertroys escreveu:

Jorge Godoy wrote:

Some things are really important here:

	- performance for operations on an individual company --- it has to be
	  as fast as possible because this might be used by my client's clients.

I suppose inherited tables would be faster. Both at read and write
operations.

However, I think that partitioning/clustering the table over an index on
company_id would help a lot. Also, considering you need to keep a lot of
old data around, you may be able to split your data in archived and
recent data, keeping your indices small (partial indices maybe).


But this could be easily done with two commands (besides creating indices again):

ALTER TABLE x RENAME TO x_year;
CREATE TABLE x () INHERITS FROM base.x;

No need to change the rest...

CREATE TABLE sample ();
ALTER TABLE sample RENAME TO sample_2004;
CREATE TABLE sample () INHERITS FROM sample_2004;
ALTER TABLE sample RENAME TO sample_2005;
CREATE TABLE sample () INHERITS FROM sample_2005;

You really mean that yearly doubling inheritance? I've been told (admittedly by people I don't give much credit) that inheriting tables works by joining the underlying tables, which could seriously affect performance after a few years of doing this.

Did you try this? It seems to be important.

Considering this case with views, you could:

CREATE INDEX sample_2004_idx ON sample (object_id) WHERE date BETWEEN '2004-01-01'::date AND '2004-12-31'::date;

CREATE INDEX sample_2005_idx ON sample (object_id) WHERE date BETWEEN '2005-01-01'::date AND '2005-12-31'::date;

etc.

You could extend that to also take company_id into account and put the indices on the base tables instead. I figure that takes less maintenance.

Mind you, to make use of those indices, you need date somewhere in your where-clauses.

In your inheritence case, if you have tables covering this year only (that of course inherit from the complete data set somehow), you could use the ONLY statement in your FROM clauses. You wouldn't need to bother with date.

	- safety: if it becomes available externally, then one client could
never see other client's data.  This is a main concern.  If used only
internally it is desirable that only authorized employees view each
client since all of them doesn't need to access all clients even if
working directly with accounting.

Safety would be about equal to your inheritence solution. You can GRANT
rights on the views, that can be different from the access rights on the
base tables. The views must evidently have access to your base tables.

Of course. This way, to keep the view working, I'd have to use SELECT INTO when archiving data, right? IIRC, views would still refer to the old table if I rename them (this is a problem on my solution as well if I can't use a common view and make it respect the search_path, as I described on the other message).

By archiving you mean moving yearly data to disk via a temporary table or some such? In that case, yes, probably.

For yearly archiving, dumping one of those inherited tables (containing only data for a specific year, after all) could work. Never tried that before.

	- easy to code on application side: other systems will be plugged to
this database.  The more we can do to avoid other people mistakes, the
better.

Views might be a bit easier there, as you can add derived data to your
results (like first name, infix and surname concatenated, dates in a
format specific for a companys' locale, etc.).

I'll have the full data on each schema, so this is also possible.

Okay, no client data processing required, then. Nice.

	- easy to maintain database: if it is too painful, hacks will come and
this is not a good plan before starting the project...

This was my main reason to suggest views. For select you could probably
get away with SELECT * FROM table WHERE company_id = x; (Though SELECT *
is being frownde upon).

Instead of doing SELECT * FROM schema.table;  (no filter here)...

Indeed.

Hmm... thinking about this, I realize that in your case selecting all data in a year could be faster. The planner would certainly choose a sequential scan plan, while with all data in a single table (with views over them) may trigger an index scan, which would probably be slower.

Depends on whether you're likely to do that, of course.

The update/insert/delete parts are also all rather similar. You may be
able to generate REPLACE VIEW statements if the base table layout
changes. You'd probably need to DROP and re-CREATE them, but being views
your data is not at risc. Nothing physically changes in your tables.
Hmm... Almost forgot about the RULEs that make them updatable... Those'd
need updating too, of course...

Is it at risk with inherited tables?

Well, the data is inside those tables, where it's not with views. If you accidentaly DROP a column in a table, the data is gone. With a view you just don't see it anymore, but it's still there.

With inherited tables you would need a way to dump and restore their
contents, or do a whole lot of ALTER TABLE statements. You'd be
physically altering the tables that contain your data; there's more than
zero risc involved. You can't rollback DDL statements...

Why? If I add a column to the base table then this new column appears on all inherited tables. And as I can't rollback DDL statements, I won't be risking breaking things when I change the view ;-)

Oh right, I forgot about that. But what happens if you want to change a columns data type for example? I'd add a new column with the new type, update it with the data in the column I want to change, drop the original column and rename it. What happens if you do that when tables inherit the one you're working on?

Also note that foreign keys to an inherited table don't work properly. You get foreign key violations, because the constraint is looking at the wrong table, IIRC.

OTOH, with inherited tables you can have company-tables that differ from
the base tables for a specific company; with views that'd be... difficult.

I'd have to have an exclusive view and use auxiliar tables if I wanted that with views... For now, from what we've discussed with the client, this won't be the case. But it is something that we should think about for future expansions... If there aren't too many exceptions, then we can deal with

Sure, they always say that ;)

that somewhat cleanly on both sides with the approach of the auxiliar table and exclusive view, but if there are too many exceptions adapting the table might be better (both are "hard" if there are too many exceptions...). I haven't gotten into this planning level yet.

You may be able to achieve some customization by inheriting from the companies inherited tables again, adding a few columns. It's still a pain, of course, as the client code will need to be customized too to take advantage of the different table definitions.

Am I right or completely wrong? :-)

Have you considered downtime if a schema change is necessary? What would
cause the most; tables or views?

What do you mean by a schema change? If we go this route, then this will have to be a core part of the database and application design. Changes here will be critical, so there shouldn't be changes or they should be done incrementally. Bigger changes will require a maintenance routine that will probably take the system offline for some time (from minutes to hours).

We're studying what to do to be able to plan things like that.

Well, I know customers... They always want something to be different in the end.

I'm pretty sure VIEWs will be faster if you need to DROP/CREATE the
tables, as you'll have to move your data around not to lose it... Of
course, you'd need to do the same for your base tables, that contain
_all_ your data in the updatable view case...

As I said, I was thinking along the lines of renaming tables to archive them when needed (weekly, monthly, quarterly, yearly) and inheriting from "base" schema again. This looks like a very fast operation...

Do you believe it will be problematic?

What I believe it to be isn't really significant, neither of us has much experience using table inheritence or updatable views for things like these, it seems.

One thing that comes to mind now is the timetravel contrib package. That's another thing I plan to look into some time, but it may suit your needs.

--
Alban Hertroys
alban@xxxxxxxxxxxxxxxxx

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
   7500 AK Enschede

// Integrate Your World //


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux