Excellent Scott! Thank you! Francesco ________________________________________ Da: Scott Marlowe [scott.marlowe@xxxxxxxxx] Inviato: martedì 21 giugno 2016 2.06 A: Job Cc: pgsql-general@xxxxxxxxxxxxxx Oggetto: Re: Vacuum full: alternatives? On Mon, Jun 20, 2016 at 3:18 AM, Job <Job@xxxxxxxxxxxxxxxxxxxx> wrote: > > Hello, > > we have a table with an heavy traffic of pg_bulkload and delete of records. > The size pass, in only one day, for example for 1Gb to 4Gb and then 1Gb back. > > We have important problems on size and the only way to gain free space is issueing a vacuum full <table>. > But the operation is very slow, sometimes 2/4 hours, and table is not available for services as it is locked. > > We do not delete everything at one (in this case the truncate woudl resolve the problem). > > The autovacuum is not able (same for normal vacuum) to free the spaces. > > Are there some suggestions or another way to manage this? First off, from your comments I'm not sure you really get postgresql's way of freeing space and reusing it via autovacuum. Basically postgresql, autovacuum process marks space as free, and the backend writes new data (inserts or updates) into the free space. You eventually reach equilibrium of a sort when the vacuum is freeing up space as quickly as it's being consumed, or faster. The problem occurs when vacuum can't keep up with your delete / write and update rate combined. If this is happening you need to: A: Make sure your IO Subsystem is fast enough to handle BOTH your update rate AND your vacuuming needed to keep up, You're better off with a machine that can do 15,000 transactions per second running a load of 1,000 than trying to handle it with a machine that can do 1,500 tps etc. Sizing the hardware is a whole other conversation. AND B: Make your autovacuum aggressive enough to NOT fall behind. It's important to remember that autovacuum was built and designed in a time when most databases lived on spinning media. It's designed to not overload spinning discs with too much random IO. A super fast RAID-10 array from that time period could do 200 to 1,000 transactions per second and that only with a top notch RAID controller etc. Regular spinning discs have a maximum random write ops per second that measure in the 100 per second range. My laptop with no pgsql optimizations, can do 850 tps on it's SSD. A server with 10 SSDs in RAID-5 can do 15,000 tps. If you have a fast IO subsystem and wish to utilize it with pgsql you're going to have to examine whether or not autovacuum with default settings is fast enough to keep up. Remmeber, Auto-vacuum, by design, is slow and tries not to get in the way. It's fast enough for most mundane uses, but can't keep up with a fast machine running hard. The default settings for autovacuum to look at here are first these two.: autovacuum_vacuum_cost_delay = 20ms autovacuum_vacuum_cost_limit = 200 They govern how hard autovac works. By default autovac doesn't work hard. Making it work too hard for a given machine can cause system performance issues. I.e. it "gets in the way". Lowering cost_delay is usually enough. As you approach 1ms autovac starts to use a lot more bandwidth. I find that even on pretty fast machines that are routinely doing 1,000 writes per second or more, 3ms is fast enough to keep up with a cost limit of 200. 5ms is a good compromise without getting too aggressive. In contrast to autovacuum, REGULAR vacuum, by default, runs at full throttle. It hits your db server hard, performance-wise. It has zero cost delay, so it works very hard. If you run it midday on a hard working server you will almost certainly see the performance drop. The difference between regular vacuum with a delay time of 0 and autovac with a delay of 20ms is huge. These settings become important if you have a LOT of tables or dbs. Otherwise they're probably fine. autovacuum_max_workers =3 # Adjust this last, unless you have thousands of tables or dbs. autovacuum_naptime = 1 min # How long to wait before checking the next db. Default is usually fine unless you have a lot of dbs. These settings tell autovacuum when to kick in. Keeping these low enough to keep autovac busy is a good idea too: autovacuum_vacuum_scale_factor autovacuum_vacuum_threshold autovacuum_analyze_scale_factor autovacuum_analyze_threshold I tend to go for threshold, which is an absolute number of rows changed before autovac kicks off. Scale factor can be dangerous because what seems small at the beginning, gets big fast. If it's 0.1 then that's 10%. 10% of 1,000 is 100 rows, but 10% of 1,000,000 is 100,000, which is a LOT of rows to ignore until you have more than that that need vacuuming. Setting it to something like 100 or 1,000 will keep your db from growing hundreds of thousands ofr dead tuples in a big table. Either way you need to make sure your autovacuum is aggressive enough to keep up with your db's throughput. Checking for bloat. You can see what parts of your db are getting too big. First, go here: https://www.keithf4.com/checking-for-postgresql-bloat/ The script there will let you check all your tables AND indexes for bloat. This will let you know if you've got a simple space problem or a vacuuming problem. Assuming you DO have bloating, one of the first things you can do is rebuild all the indexes except for PK ones with new indexes then drop the old ones. On some tables this saves a LOT of space. Since you can "create index concurrently ... " this is a non-blocking operation. You can get a list of indexes for a table: select indexdef from pg_indexes where tablename='pgbench_accounts'; Then build "create index concurrently ..." statements for each one. Assuming the table is bloated and you HAVE to recover space for normal operation (let's say you've got 80MB of data in a 200GB table etc)... I'm gonna head in a different direction here. Slony. Slony can subscribe your table, or your whole db, up to you, to a new one, either on a different machine or on the same machine. It's actually pretty easy to set it up and subscribe one table, get it caught up, drop access to db, swap tables, and bring access to the db back up. The downtime is measured in seconds. If one table one time is all you need that's fine but you can also use it to setup a replica of the bloated machine, sans bloat, and switchover the whole db operation to another machine / db. -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general