Re: How many insert + update should one transaction

"Jim C. Nasby" <jnasby@xxxxxxxxxxxxx> · Sat, 24 Sep 2005 16:22:22 -0500

Just remember the first rule of performance tuning: don't.

Unless you *know* having the dead rows will be an issue, you will almost
certainly be best off going the simple, straightforward route.

On Fri, Sep 23, 2005 at 10:49:00AM -0500, Scott Marlowe wrote:
> On Fri, 2005-09-23 at 05:51, Yonatan Ben-Nes wrote:
> > Hi all,
> > 
> > Every few days I need to DELETE all of the content of few tables and 
> > INSERT new data in them.
> > The amount of new data is about 5 million rows and each row get about 3 
> > queries (INSERT + UPDATE).
> > Now because I need the old data to be displayed till all of the new data 
> > will be available I'm doing all of the process of deleting the old 
> > content and inserting the new one in one transaction.
> > Should I divide the insertion so ill insert the new data into a 
> > temporary table and the transaction should be commited every 
> > 100,1000,10000 whatever queries? or maybe it doesnt matter to the server 
> > whats the size of the transaction and its ok to handle such a process in 
> > one transaction?
> 
> The only possible issue would be one of capacity, and possibly having a
> lot of dead tuples laying about.
> 
> If you have 5 million rows, and you update every one, then you now have
> 5 million live and 5 million dead tuples in your database.  A Vacuum
> full will take quite a while.
> 
> If you're fsm is set large enough, then as long as you vacuum (regular,
> non full vacuum) between these transactions, then the 5 million dead
> tuples should get reused.  however, the performance of your database
> will for selects and such will be like it was a 10 million row database.
> 
> Given that you NEED to have all 10 million tuples in the database at the
> same time, the use of a temp / holding table would allow you to truncate
> the main table, move everything into the main table, and then drop /
> truncate the temp / holding table.
> 
> If you truncate the main table, then initiate another transaction to
> move the data into it, it shouldn't be so bloated, but the down side is
> you'll have a period of time when it appears empty to users.
> 
> So, the real question is whether or not you can afford to have an empty
> table at some point in the process.
> 
> If you can't, then either method (running the whole transaction against
> the one table or using the temp / holding table) are equivalent.  If you
> can, there should be a noticeable gain from the method of truncating the
> main table outside the update transaction.
> 
> If you need that table to always have the old or new tuples (i.e. never
> be empty) and you can afford the very lengthy vacuum full on the 5
> million dead rows, then that method will give you the best select
> performance the rest of the day.
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
> 

-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@xxxxxxxxxxxxx
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to majordomo@xxxxxxxxxxxxxx so that your
       message can get through to the mailing list cleanly