On Fri, Apr 5, 2024 at 3:14 PM srinivasan s <srinioracledba7@xxxxxxxxx> wrote:
Dear all,

I hope this message finds you well. I would greatly appreciate your input and suggestions regarding the removal of data from our largest table, which currently exceeds 23TB and contains several years' worth of data. Our objective is to retain only the data from the past two years in this table. It is important to note that this table is not partitioned.
Considering the critical nature of our business database, it is imperative that the purging process not adversely affect ongoing production performance, so I am seeking the best approach to accomplish this task without compromising the system.

One potential solution I am considering is to create a new table and gradually transfer the data from the large table to the new one in smaller batches. Once the data transfer is complete, we can rename the tables accordingly and perform a final cutover. However, I am uncertain whether this approach will yield the desired results, and I believe thorough testing is necessary before proceeding.
Is the table ever updated? An INSERT-only table would make such a scheme much easier to implement.
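For an INSERT-only table with an indexed timestamp column, a rough sketch of the copy-and-swap scheme might look like this (the table name "big_table", the column name "created_at", and all dates are invented for illustration):

    -- Clone the table definition, including indexes and defaults.
    CREATE TABLE big_table_new (LIKE big_table INCLUDING ALL);

    -- Copy the two years you want to keep in small date-range batches,
    -- so no single transaction runs long:
    INSERT INTO big_table_new
    SELECT * FROM big_table
    WHERE created_at >= '2022-04-01' AND created_at < '2022-04-08';
    -- ...repeat, advancing the window a week at a time...

    -- Cutover: one short exclusive lock to catch stragglers and swap names.
    BEGIN;
    LOCK TABLE big_table IN ACCESS EXCLUSIVE MODE;
    INSERT INTO big_table_new
    SELECT * FROM big_table
    WHERE created_at >= '2024-04-05';  -- since the last copied batch
    ALTER TABLE big_table RENAME TO big_table_old;
    ALTER TABLE big_table_new RENAME TO big_table;
    COMMIT;
    -- Once verified: DROP TABLE big_table_old; reclaims the 23TB.

Note that LIKE does not copy foreign-key constraints, and anything referencing the old table (FKs, views) would still need attention at cutover.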
If any of you have suggestions or ideas on how to efficiently purge data older than two years from such a massive table without impacting production performance, I would greatly appreciate your insights.
1. What PG version are you running?
2. How many indices?
3. Does an index on the relevant date field exist? That also would make such a scheme much easier to implement.
4. How much downtime can you get?
5. If nothing else, chip away at the old data, deleting a few "table days" every calendar day. Naturally, a supporting index is required, and a vacuum+analyze pass is needed after each round of deletes.
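Concretely, a nightly purge round could be as simple as the following (again, "big_table", "created_at", and the dates are placeholders):

    -- Delete a few "table days" of old data; each statement is its own
    -- short transaction, so locks are held only briefly.
    DELETE FROM big_table
    WHERE created_at >= '2017-01-01' AND created_at < '2017-01-04';

    -- Let autovacuum catch up, or run it explicitly after a round:
    VACUUM ANALYZE big_table;

Repeated daily, that works off the backlog without any long-running transaction. Keep in mind that DELETE plus vacuum will not shrink the 23TB file on disk; it only makes the freed space reusable for new rows.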