I have a Postgres database that is doubling in size each year and contains many online orders that were delivered long ago and now sit dormant.
I'd like to offload these dormant orders to a secondary Postgres instance, so the information is still available for viewing purchase history, while keeping the active orders in the main (and, over time, smaller) Postgres instance.
These orders live in a highly normalized model that spans over 30 tables.
I have thought of two approaches to handle this "archiving" use case.
The first approach is pull-based. As orders naturally go dormant, execute custom SQL to copy them into the secondary Postgres instance; a separate process then purges that order information from the main database after a period of time (i.e., custom deletes). The upside is that the process is fairly lightweight and straightforward. The downside is that you effectively maintain the data model in two places and must update the SQL scripts whenever the model changes. A sketch of what this could look like is below.
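For illustration only, here's a minimal sketch of the pull-based move using postgres_fdw, assuming a foreign server pointing at the archive instance with foreign tables mapped under an `archive` schema. The table and column names (`orders`, `status`, `delivered_at`) and the dormancy cutoff are hypothetical:

```sql
-- Assumes: postgres_fdw is installed on the primary, a foreign server for
-- the archive instance exists, and foreign tables are mapped under the
-- "archive" schema. Names and the one-year cutoff are illustrative.
BEGIN;

-- Copy dormant orders to the secondary instance via the foreign table.
INSERT INTO archive.orders
SELECT *
FROM orders
WHERE status = 'delivered'
  AND delivered_at < now() - interval '1 year';

-- ...repeat for each of the 30+ child tables, in FK-safe order
-- (parents before children on insert)...

-- Purge the same rows locally, children first, parents last. This could
-- equally run later as a separate job, per the two-step plan above.
DELETE FROM orders
WHERE status = 'delivered'
  AND delivered_at < now() - interval '1 year';

COMMIT;
```

One design note: postgres_fdw ties the remote writes to the local transaction, but it isn't true two-phase commit, so a separate reconciliation check between copy and purge is probably wise.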
The second approach is logical replication. The upside is that you don't have to maintain two data models separately. The downside is that it becomes very tricky to differentiate between a "normal" delete, which you would want applied on the secondary instance, and an "archive" delete, which you wouldn't. It can also get complex when replication fails and you have to re-initialize the stream. A rough sketch is below.
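To illustrate why deletes are the sticking point: one workaround (not your approach as stated, just a variant) is to publish only inserts and updates, so no deletes replicate at all, and then apply legitimate deletes on the archive out of band. A minimal sketch, with placeholder publication/subscription names and connection string:

```sql
-- On the primary: publish inserts and updates only, so archive purges
-- (which are DELETEs) never propagate to the secondary.
CREATE PUBLICATION orders_pub
  FOR TABLE orders, order_items   -- ...plus the rest of the 30+ tables
  WITH (publish = 'insert, update');

-- On the secondary (connection string points at the primary; placeholder):
CREATE SUBSCRIPTION orders_sub
  CONNECTION 'host=primary.example.com dbname=shop user=repl'
  PUBLICATION orders_pub;
```

The tradeoff is that "normal" deletes now also stop replicating, so you'd need a separate mechanism (e.g., a periodic reconciliation job) to apply them on the secondary, which is exactly the complexity being weighed here.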
My question is whether anyone else has solved this and, if so, how? I'm leaning toward pull-based, because replication doesn't feel right given the complexity around "selective" deletes and the replication failure scenarios.
Thanks!