Hi Craig Ringer, Really appreciate a lot for your advice! This at least has cleared my doubt, which had been confused me for quite some time. Thanks and Regards Yan Cheng CHEOK --- On Fri, 1/22/10, Craig Ringer <craig@xxxxxxxxxxxxxxxxxxxxx> wrote: > From: Craig Ringer <craig@xxxxxxxxxxxxxxxxxxxxx> > Subject: Re: Extremely Slow Cascade Delete Operation > To: "Yan Cheng Cheok" <yccheok@xxxxxxxxx> > Cc: "Grzegorz Jaśkiewicz" <gryzman@xxxxxxxxx>, pgsql-general@xxxxxxxxxxxxxx > Date: Friday, January 22, 2010, 12:51 PM > Yan Cheng Cheok wrote: > > I try to create a following simple scenario, to > demonstrate cascade delete is rather slow in PostgreSQL. > > > > Can anyone help me to confirm? Is this my only machine > problem, or every PostgreSQL users problem? > > > > I create 1 lot. > > every lot is having 10000 unit > > every unit is having 100 measurement. > > 101 measurements per unit by the looks. But it doesn't much > matter. > > > test=> CREATE INDEX fk_unit_id_idx ON measurement > (fk_unit_id); > CREATE INDEX > Time: 3072.635 ms > > > Now suddenly everything is much faster: > > test=> delete from lot; > DELETE 1 > Time: 8066.140 ms > > > Before that index creation, every deletion of a unit > required a seqscan > of `measurement' to find referenced measurements. At 200ms > apiece, it > would've taken about half an hour to `delete from lot' on > my machine, > and smaller deletes took a proportional amount of time (ie > 20s for 100 > units). Now it takes 8 seconds to delete the lot. > > You just forgot to create an index on one of the foreign > key > relationships that you do a cascade delete on. > > BTW, Pg doesn't force you to do this because sometimes > you'd prefer to > wait. For example, you might do the deletes very rarely, > and not way to > pay the cost of maintaining the index the rest of the > time. > > (What I was personally surprised by is that it's no faster > to DELETE > FROM measurement; directly than to delete via LOT. I > would've expected a > seqscan delete of the table to be MUCH faster than all the > index-hopping > required to delete via lot. I guess the reason there's no > real > difference is because the whole dataset fits in cache, so > there's no > seek penalty. ) > > AFAIK, Pg isn't clever enough to batch foreign key deletes > together and > then plan them as a single operation. That means it can't > use something > other than a bunch of little index lookups where doing a > sequential scan > or a hash join might be faster. Adding this facility would > certainly be > an "interesting" project. Most of the time, though, you get > on fine > using index-based delete cascading, and you can generally > pre-delete > rows using a join on those rare occasions it is a problem. > > -- > Craig Ringer > -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general