Re: Very long deletion time on a 200 GB database

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 02/23/2012 02:39 AM, Reuven M. Lerner wrote:

I should note that my primary concern is available RAM. The database, as
I wrote, is about 200 GB in size, and PostgreSQL is reporting (according
to Windows) use of about 5 GB RAM, plus another 25 GB of virtual memory.

O_o

That... that would probably swap just constantly. No end. Just swap all day long. But maybe not. Please tell us the values for these settings:

* shared_buffers
* work_mem
* maintenance_work_mem
* checkpoint_segments
* checkpoint_timeout

It also wouldn't be a bad idea to see how many concurrent connections there are, because that may determine how much memory all the backends are consuming. In any case, if it's actually using 25GB of virtual memory, any command you run that doesn't happen to be in cache, will just immediately join a giant logjam.

I've told the Windows folks on this project that virtual memory kills a
database, and that it shouldn't surprise us to have horrible performance
if the database and operating system are both transferring massive
amounts of data back and forth. But there doesn't seem to be a good way
to handle this

You kinda can, by checking those settings and sanitizing them. If they're out of line, or too large, they'll create the need for more virtual memory. Having the virtual memory there isn't necessarily bad, but using it is.

DELETE FROM B
WHERE r_id IN (SELECT R.id
FROM R, B
WHERE r.end_date < (NOW() - (interval '1 day' * 30))
AND r.id = b.r_id

Just to kinda help you out syntactically, have you ever tried a DELETE FROM ... USING? You can also collapse your interval notation.

DELETE FROM B
 USING R
 WHERE R.id = B.r_id
   AND R.end_date < CURRENT_DATE - INTERVAL '30 days';

But besides that, the other advise you've received is sound. Since your select->truncate->insert attempt was also slow, I suspect you're having problems with foreign key checks, and updating the index trees. Maintaining an existing index can be multiples slower than filling an empty table and creating the indexes afterwards.

So far as your foreign keys, if any of the child tables don't have an index on the referring column, your delete performance will be atrocious. You also need to make sure the types of the columns are identical. Even a numeric/int difference will be enough to render an index unusable.

We have a 100GB *table* with almost 200M rows and even deleting from that in many of our archive tests doesn't take anywhere near 9 hours. But I *have* seen a delete take that long when we had a numeric primary key, and an integer foreign key. Even a handful of records can cause a nested loop sequence scan, which will vastly inflate delete time.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
sthomas@xxxxxxxxx

______________________________________________

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux