Re: Extremely Slow Cascade Delete Operation

Yan Cheng Cheok <yccheok@xxxxxxxxx> · Thu, 21 Jan 2010 22:16:06 -0800 (PST)

Hi Craig Ringer,

Really appreciate a lot for your advice! This at least has cleared my doubt, which had been confused me for quite some time.

Thanks and Regards
Yan Cheng CHEOK

--- On Fri, 1/22/10, Craig Ringer <craig@xxxxxxxxxxxxxxxxxxxxx> wrote:

> From: Craig Ringer <craig@xxxxxxxxxxxxxxxxxxxxx>
> Subject: Re:  Extremely Slow Cascade Delete Operation
> To: "Yan Cheng Cheok" <yccheok@xxxxxxxxx>
> Cc: "Grzegorz Jaśkiewicz" <gryzman@xxxxxxxxx>, pgsql-general@xxxxxxxxxxxxxx
> Date: Friday, January 22, 2010, 12:51 PM
> Yan Cheng Cheok wrote:
> > I try to create a following simple scenario, to
> demonstrate cascade delete is rather slow in PostgreSQL.
> > 
> > Can anyone help me to confirm? Is this my only machine
> problem, or every PostgreSQL users problem?
> > 
> > I create 1 lot.
> > every lot is having 10000 unit
> > every unit is having 100 measurement.
> 
> 101 measurements per unit by the looks. But it doesn't much
> matter.
> 
> 
> test=> CREATE INDEX fk_unit_id_idx ON measurement
> (fk_unit_id);
> CREATE INDEX
> Time: 3072.635 ms
> 
> 
> Now suddenly everything is much faster:
> 
> test=> delete from lot;
> DELETE 1
> Time: 8066.140 ms
> 
> 
> Before that index creation, every deletion of a unit
> required a seqscan
> of `measurement' to find referenced measurements. At 200ms
> apiece, it
> would've taken about half an hour to `delete from lot' on
> my machine,
> and smaller deletes took a proportional amount of time (ie
> 20s for 100
> units). Now it takes 8 seconds to delete the lot.
> 
> You just forgot to create an index on one of the foreign
> key
> relationships that you do a cascade delete on.
> 
> BTW, Pg doesn't force you to do this because sometimes
> you'd prefer to
> wait. For example, you might do the deletes very rarely,
> and not way to
> pay the cost of maintaining the index the rest of the
> time.
> 
> (What I was personally surprised by is that it's no faster
> to DELETE
> FROM measurement; directly than to delete via LOT. I
> would've expected a
> seqscan delete of the table to be MUCH faster than all the
> index-hopping
> required to delete via lot. I guess the reason there's no
> real
> difference is because the whole dataset fits in cache, so
> there's no
> seek penalty. )
> 
> AFAIK, Pg isn't clever enough to batch foreign key deletes
> together and
> then plan them as a single operation. That means it can't
> use something
> other than a bunch of little index lookups where doing a
> sequential scan
> or a hash join might be faster. Adding this facility would
> certainly be
> an "interesting" project. Most of the time, though, you get
> on fine
> using index-based delete cascading, and you can generally
> pre-delete
> rows using a join on those rare occasions it is a problem.
> 
> --
> Craig Ringer
> 

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general