On Wed, 2022-02-09 at 17:29 +0900, Abhishek Bhola wrote: > I have a table with monthly partitions, starting from 2011. > Each partition has about 2-3 billion rows and is about 40TB of data in total. > I am running a select query to \copy the data into csv files for each month. > > I see there is an AUTOVACUUM job started on all the partitions that have been > read (on which select has run). I am assuming the reason why Postgres is > autovacuuming these large partitions is this: > > db=# SELECT c.relname as table_name, c.relkind as type, age(c.relfrozenxid) as age, > c.relfrozenxid FROM pg_class AS c WHERE age(c.relfrozenxid) <> 2147483647 ORDER BY 3 DESC LIMIT 20; > table_name | type | age | relfrozenxid > ---------------------------+------+-----------+-------------- > hist_omx_orderaudit201506 | r | 201463508 | 319891 > hist_omx_orderaudit201509 | r | 201418497 | 364902 > hist_omx_orderaudit201205 | r | 201142042 | 641357 > > autovacuum_freeze_max_age > --------------------------- > 200000000 > (1 row) > > Am I wrong in my understanding why this autovacuum is getting triggered? > > However, my main question is that why do the data files in the tablespace, corresponding > to these partition tables (that have been read or selected recently), or the ones > autovacuumed recently have a latest timestamp? > > There was no insert or update on these partitions, so why do I see an updated timestamp? > Does autovacuum change something on the datafiles? Yes, autovacuum will freeze old rows in the table, which modifies the data files. To keep this to a minimum, you can run VACUUM (FREEZE) on a partition as soon as you know that its data won't get modified any more. Still, I wouldn't be sure that the file modification timestamp won't change any more after that. But that should be pretty irrelevant; at any rate, no substantial data modifications will occur after that. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com