Hi guys,
After a few days of running a production environment with a multisite setup (two clusters replicating), I saw very high memory usage in radosgw on the secondary cluster: within 20 minutes we were running out of memory on a 32 GB bare-metal server.
Investigating a little further, I saw that the primary cluster was always reporting "data is behind on 1 shards". This was strange because no data was coming from the secondary cluster at that point. So I decided to rename the pool "us-west-1.rgw.log" to "us-west-1.rgw.log.old", recreate "us-west-1.rgw.log", and restart radosgw on the secondary.
After this procedure, the primary cluster reported "data is caught up with source" and the radosgw memory usage on the secondary cluster went back to normal.
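
For reference, the procedure was roughly the sequence below. This is a sketch that only prints the commands instead of running them. The PG count and the radosgw systemd unit name are placeholders I am assuming for illustration, not the exact values from our cluster, and the helper function name is made up.

```shell
#!/bin/sh
# Sketch only: prints the commands for the log pool reset, does not execute them.
POOL="us-west-1.rgw.log"
PG_NUM=8                              # assumption: placeholder PG count
RGW_UNIT="ceph-radosgw@rgw.gateway1"  # assumption: hypothetical unit name

# Hypothetical helper: $1 = pool name, $2 = PG count, $3 = radosgw unit.
reset_log_pool_cmds() {
    echo "radosgw-admin sync status"        # check sync state before
    echo "ceph osd pool rename $1 $1.old"   # keep the old pool under a new name
    echo "ceph osd pool create $1 $2"       # recreate an empty log pool
    echo "systemctl restart $3"             # restart radosgw on the secondary
    echo "radosgw-admin sync status"        # check sync state after
}

reset_log_pool_cmds "$POOL" "$PG_NUM" "$RGW_UNIT"
```

The old pool is only renamed, not deleted, so it is still there if someone wants me to inspect it.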
Do you have any idea what the issue was here? Did I make the right decision in renaming the pool "us-west-1.rgw.log"?
P.S. We are running 10.2.6.
Best Regards,
Daniel
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com