Re: filesystem full during vacuum - space recovery issues


Thanks Laurenz & Imran for your comments.

My responses inline below.

Thanks

Tom


On 15-Jul-2024 20:58, Laurenz Albe wrote:
On Mon, 2024-07-15 at 14:47 -0400, Thomas Simpson wrote:
I have a large database (multi TB) which had a vacuum full running but the database
ran out of space during the rebuild of one of the large data tables.

Cleaning down the WAL files got the database restarted (an archiving problem led to
the initial disk full).

However, the disk space is still at 99% as it appears the large table rebuild files
are still hanging around using space and have not been deleted.

My problem now is how do I get this space back to return my free space back to where
it should be?

I tried some scripts to map the data files to relations but this didn't work as
removing some files led to startup failure despite them appearing to be unrelated
to anything in the database - I had to put them back and then startup worked.

Any suggestions here?
That reads like the sad old story: "cleaning down" WAL files - you mean deleting the
very files that would have enabled PostgreSQL to recover from the crash that was
caused by the full file system.

Did you run "pg_resetwal"?  If yes, that probably led to data corruption.

No, I just removed the excess already-archived WALs to free space and restarted.  The VACUUM FULL that was running had created files for the large table it was processing, and these are still hanging around eating space without doing anything useful.  The shutdown prevented the rollback from cleanly removing them, which seems to be the core problem.
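For what it's worth, one way to narrow down which files are orphans is to compare the data files on disk against what the catalog actually knows about. This is only a sketch - the data directory path and database name are placeholders, it must be reviewed by hand, and nothing should be deleted while the server is running or without a verified backup:

```shell
DATADIR=/var/lib/postgresql/data    # assumption: your data directory
DBNAME=mydb                         # assumption: the affected database

# OID of the database, i.e. its subdirectory under base/
DBOID=$(psql -d "$DBNAME" -Atc \
  "SELECT oid FROM pg_database WHERE datname = current_database()")

# relfilenodes the catalog knows about (pg_relation_filenode handles
# mapped relations where the relfilenode column is 0)
psql -d "$DBNAME" -Atc \
  "SELECT pg_relation_filenode(oid) FROM pg_class
   WHERE pg_relation_filenode(oid) <> 0" \
  | sort -u > /tmp/known_filenodes

# base file names on disk, with segment and fork suffixes stripped
# (e.g. 16384.1, 16384_fsm, 16384_vm all reduce to 16384)
ls "$DATADIR/base/$DBOID" \
  | sed -E 's/\.[0-9]+$//; s/_(fsm|vm|init)$//' \
  | sort -u > /tmp/disk_filenodes

# files present on disk with no matching catalog entry: review manually
comm -23 /tmp/disk_filenodes /tmp/known_filenodes
```

Files named in the output are candidates left behind by the interrupted rebuild, but given that removing apparently-unrelated files already broke startup once, treat any result with suspicion.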

The above are just guesses.  Anyway, there is no good way to get rid of the files
that were left behind after the crash.  The reliable way of doing so is also the way
to get rid of potential data corruption caused by "cleaning down" the database:
pg_dump the whole thing and restore the dump to a new, clean cluster.

Yes, that will be a painfully long down time.  An alternative is to restore a backup
taken before the crash.

My issue now is that the dump & reload is taking a huge amount of time.  I know the hardware is capable of multi-GB/s throughput, but the reload is projected to take about 10 days at the current rate (about 30 MB/s).  The old and new servers have a 10G link between them and storage is SSD-backed, so the hardware is capable of far more than it is doing now.

Is there a way to improve the reload performance?  Any type of tuning is fine, even if I need to undo it later once the reload is done.
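A common approach for exactly this situation is a parallel dump/restore combined with temporarily relaxed durability settings on the target cluster. The sketch below assumes hypothetical host names, paths and job counts; fsync=off trades away crash safety, which is acceptable here only because a failed load can simply be restarted from scratch - revert everything once the load finishes:

```shell
# 1. Directory-format dump so pg_restore can parallelize with -j.
pg_dump -h oldhost -Fd -j 8 -f /backup/mydb.dir mydb

# 2. Temporary speed-over-safety settings on the new cluster.
psql -h newhost -c "ALTER SYSTEM SET fsync = off"
psql -h newhost -c "ALTER SYSTEM SET full_page_writes = off"
psql -h newhost -c "ALTER SYSTEM SET max_wal_size = '32GB'"
psql -h newhost -c "ALTER SYSTEM SET maintenance_work_mem = '2GB'"
psql -h newhost -c "SELECT pg_reload_conf()"

# 3. Parallel restore; index and constraint builds run after the data
#    load and benefit most from the parallelism and work_mem settings.
pg_restore -h newhost -d mydb -j 8 /backup/mydb.dir

# 4. Put durability back and reload the configuration.
psql -h newhost -c "ALTER SYSTEM RESET ALL"
psql -h newhost -c "SELECT pg_reload_conf()"
```

A single-threaded dump piped over the network is often the actual bottleneck at ~30 MB/s; the directory format with -j on both sides is usually what closes the gap toward the hardware's capability.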

My backups were in progress when all the issues happened, so they're not such a good starting point, and I'd actually prefer the clean reload since this DB has been through multiple upgrades (without reloads) until now, so I know it's not especially clean.  The size has always prevented a full reload before, but the database is relatively low traffic now, so I can afford some time to reload - just ideally not 10 days.

Yours,
Laurenz Albe

