Search Postgresql Archives

Re: performace review

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Title: Re: [GENERAL] performace review
Denormalization should reduce the number of joins and reduce the overall number of tables, yes?  And the idea is to fully normalize and then back off because of physical limitations in the database you're using *with full knowledge and understanding that you're sacrificing the relational model to do it*.  They obviously did not do that.  They just built a denormalized database.
 
Look at one of the views:
 
CREATE OR REPLACE VIEW kernel_view_027 AS
(( SELECT 0 AS object_rid, act.object_oid, act.object_idx, act0."p$$assigned_to__rid" AS "p$$object_parent__rid", act0."p$$assigned_to__oid" AS "p$$object_parent__oid", act.created_at, act.modified_at, act.created_by, act.modified_by, act.object__class, act.access_level_browse, act.access_level_update, act.access_level_delete, act."owner", act.name, act.description, act.percent_complete, act.priority, act.due_by, act.scheduled_start, act.scheduled_end, act.actual_start, act.actual_end, act.misc1, act.misc2, act.misc3, act.activity_state, act.activity_number, act.severity, act.assigned_to, act."p$$assigned_to__rid", act."p$$assigned_to__oid", ( SELECT acc.full_name
           FROM kernel_account acc
          WHERE acc.object_rid::text = act."p$$assigned_to__rid"::text AND acc.object_oid::text = act."p$$assigned_to__oid"::text AND acc.object_idx = 0) AS "p$$assigned_to_title"
   FROM kernel_activity act
   LEFT JOIN kernel_activity act0 ON act0.object_idx = 0 AND act.object_rid::text = act0.object_rid::text AND act.object_oid::text = act0.object_oid::text
UNION
 SELECT 0 AS object_rid, act.object_oid, act.object_idx, act0."p$$sender_parent__rid" AS "p$$object_parent__rid", act0."p$$sender_parent__oid" AS "p$$object_parent__oid", act.created_at, act.modified_at, act.created_by, act.modified_by, act.object__class, act.access_level_browse, act.access_level_update, act.access_level_delete, act."owner", act.name, act.description, act.percent_complete, act.priority, act.due_by, act.scheduled_start, act.scheduled_end, act.actual_start, act.actual_end, act.misc1, act.misc2, act.misc3, act.activity_state, act.activity_number, act.severity, act.assigned_to, act."p$$assigned_to__rid", act."p$$assigned_to__oid", ( SELECT acc.full_name
           FROM kernel_account acc
          WHERE acc.object_rid::text = act."p$$assigned_to__rid"::text AND acc.object_oid::text = act."p$$assigned_to__oid"::text AND acc.object_idx = 0) AS "p$$assigned_to_title"
   FROM kernel_activity act
   LEFT JOIN kernel_activity act0 ON act0.object_idx = 0 AND act.object_rid::text = act0.object_rid::text AND act.object_oid::text = act0.object_oid::text)
UNION
 SELECT 0 AS object_rid, act.object_oid, act.object_idx, p0."p$$party__rid" AS "p$$object_parent__rid", p0."p$$party__oid" AS "p$$object_parent__oid", act.created_at, act.modified_at, act.created_by, act.modified_by, act.object__class, act.access_level_browse, act.access_level_update, act.access_level_delete, act."owner", act.name, act.description, act.percent_complete, act.priority, act.due_by, act.scheduled_start, act.scheduled_end, act.actual_start, act.actual_end, act.misc1, act.misc2, act.misc3, act.activity_state, act.activity_number, act.severity, act.assigned_to, act."p$$assigned_to__rid", act."p$$assigned_to__oid", ( SELECT acc.full_name
           FROM kernel_account acc
          WHERE acc.object_rid::text = act."p$$assigned_to__rid"::text AND acc.object_oid::text = act."p$$assigned_to__oid"::text AND acc.object_idx = 0) AS "p$$assigned_to_title"
   FROM kernel_activity act
   JOIN kernel_activityparty p0 ON p0.object_idx = 0 AND p0."p$$object_parent__rid"::text = act.object_rid::text AND p0."p$$object_parent__oid"::text = act.object_oid::text)
UNION
 SELECT 0 AS object_rid, act.object_oid, act.object_idx, p0."p$$party_parent__rid" AS "p$$object_parent__rid", p0."p$$party_parent__oid" AS "p$$object_parent__oid", act.created_at, act.modified_at, act.created_by, act.modified_by, act.object__class, act.access_level_browse, act.access_level_update, act.access_level_delete, act."owner", act.name, act.description, act.percent_complete, act.priority, act.due_by, act.scheduled_start, act.scheduled_end, act.actual_start, act.actual_end, act.misc1, act.misc2, act.misc3, act.activity_state, act.activity_number, act.severity, act.assigned_to, act."p$$assigned_to__rid", act."p$$assigned_to__oid", ( SELECT acc.full_name
           FROM kernel_account acc
          WHERE acc.object_rid::text = act."p$$assigned_to__rid"::text AND acc.object_oid::text = act."p$$assigned_to__oid"::text AND acc.object_idx = 0) AS "p$$assigned_to_title"
   FROM kernel_activity act
   JOIN kernel_activityparty p0 ON p0.object_idx = 0 AND p0."p$$object_parent__rid"::text = act.object_rid::text AND p0."p$$object_parent__oid"::text = act.object_oid::text;
That's *horrible*.  There's typecasting on the join conditions to convert things to text!  If there's a join on it, not only should they already be the same type, there ought to be a foreign key constraint on it (even if it is a self-referencing table).  The silly thing UNIONs the exact same query four times in order to create a heirarchy!  And it has the gloriously descriptive name 'kernel_view_027' (all 19 views share this naming convention).
 
While creating the tables, I got three errors about an unknown datatype.  Yeah, that's great design.
 
110 of the 118 tables have these fields:
  object_rid varchar(200) NOT NULL,
  "p$$object_parent__rid" varchar(200),
  object_oid varchar(200) NOT NULL,
  "p$$object_parent__oid" varchar(200),
  object_idx int4 NOT NULL,
  object__class varchar(200),
  modified_at varchar(20),
  created_at varchar(20),
  modified_by varchar(20),
  created_by varchar(20),
  "owner" varchar(50),
  access_level_browse int4,
  access_level_update int4,
  access_level_delete int4,
 
And this key:
CONSTRAINT kernel_media_pkey PRIMARY KEY (object_rid, object_oid, object_idx)
 
Wow, yeah. "modified_at" and "created_at".  Those should definitely not be timestamps.  "owner".  Great field name, that.  The only keys that don't allow NULLs are the primaries.  Clearly each of the 110 tables will need all these fields, then.
 
Looking at the indexes, the vast majority of them are against one of these 14 universal fields, but only exist on some of the tables.
 
It's *badly designed*.  I've been at this for less than three months and *I* can see it.  They're using a relational database as an object database.  No wonder their performance sucks!
 
Brandon Aiken


From: Alexander Staubo [mailto:alex@xxxxxxxxxxxxxxx]
Sent: Sat 10/7/2006 5:59 PM
To: Brandon Aiken
Cc: Tomi NA; pgsql-general@xxxxxxxxxxxxxx
Subject: Re: [GENERAL] performace review

On Oct 7, 2006, at 23:44 , Brandon Aiken wrote:

> I download the db source (inside opencrx-1.9.1-
> core.postgresql-8.zip) and executed their three schema files, 
> dbcreate-indexes.sql, dbcreate-views.sql, dbcreate-tables.sql.  
> Each of the 118 tables has a three-field composite primary key of 
> 'PRIMARY KEY (object_rid, object_oid, object_idx)'.  object_rid and 
> object_oid are both VARCHAR(200).  There are *no* foreign key 
> constraints.  Each table has between 15 and 50 fields, with 25 
> looking about average.

To be fair, there are a bunch of indexes, but the number of indexes 
seems low compared to the number of fields.

> Gee, why to table joins take so long?  Maybe because a blind monkey 
> created the schema?  Normalized databases do tend to perform 
> better, so I hear.

*De*normalization is the traditional hack to speed up queries, 
because it reduces the need for joins.

Alexander.



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux