Hello All,
I have a self-hosted Postgres server on AWS with 16 GB of disk space attached to it. For ML and analysis work we use Vertex AI, so I set up live replication from Postgres to a BigQuery table using the Datastream service. We use BigQuery as our data warehouse because we have many different data sources, and this way all our analysis and ML can happen in one place.
The problem is that once I start replication, pg_wal eats up almost the whole disk, about 15.8 GB within a few days of starting.
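To figure out what is holding the WAL back, I know you can look at pg_replication_slots. Here is a rough sketch of the check I had in mind (assuming Python with psycopg2; the connection string is just a placeholder, not my real setup):

```python
# Rough sketch: show how much WAL each replication slot is retaining.
# Assumes psycopg2 is installed and the script runs on the Postgres host.
import psycopg2

conn = psycopg2.connect("dbname=postgres user=postgres")  # placeholder
with conn, conn.cursor() as cur:
    # An inactive or lagging slot forces Postgres to keep old WAL
    # segments around, which is usually what makes pg_wal grow.
    cur.execute("""
        SELECT slot_name,
               active,
               pg_size_pretty(
                   pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
               ) AS retained_wal
        FROM pg_replication_slots
    """)
    for slot_name, active, retained_wal in cur.fetchall():
        print(f"slot={slot_name} active={active} retained={retained_wal}")
conn.close()
```

Is checking the Datastream slot like this the right way to confirm it is the slot that keeps the WAL from being removed?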
Question: how can I set things up so disk space is used optimally and old pg_wal data that is no longer needed gets cleaned up? I think I should create a cron job that takes care of all this, but I don't know the right approach. Can you please guide me?
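For the cron job, is something like this the right direction? Just a sketch, assuming Python with psycopg2; the pg_wal path and the 80% threshold are made up, and it only monitors because I am not sure deleting WAL files by hand is safe:

```python
#!/usr/bin/env python3
# Sketch of a cron job that warns before pg_wal fills the disk.
# Assumes Python 3 with psycopg2 on the Postgres host; the path and
# threshold below are placeholders, not my real values.
import shutil
import psycopg2

WAL_DIR = "/var/lib/postgresql/data/pg_wal"  # placeholder path
WARN_PCT = 80                                # placeholder threshold

def main():
    # Warn when the volume holding pg_wal is getting close to full.
    usage = shutil.disk_usage(WAL_DIR)
    used_pct = usage.used / usage.total * 100
    if used_pct > WARN_PCT:
        print(f"WARNING: disk is {used_pct:.0f}% full")

    # Also report the total size of pg_wal as Postgres itself sees it.
    conn = psycopg2.connect("dbname=postgres user=postgres")  # placeholder
    with conn, conn.cursor() as cur:
        cur.execute("SELECT pg_size_pretty(sum(size)) FROM pg_ls_waldir()")
        print("pg_wal size:", cur.fetchone()[0])
    conn.close()

if __name__ == "__main__":
    main()
```

I also read that Postgres 13+ has a max_slot_wal_keep_size setting to cap how much WAL a slot can retain. Would setting that be a better approach than a cron job, or would it break the Datastream replication?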
In the future, as the data grows, I will attach more disk space to the instance, but I want an optimal setup so the disk is never completely full and the server doesn't crash again.