Hi Alex,

We found a huge number of keys in the "logm" and "osdmap" tables while inspecting the monitor store with ceph-monstore-tool, and I think that is the root cause (rough commands are at the end of this mail). Some pages also say that disabling the 'insights' module can resolve this issue, but I checked our cluster and we never enabled that module; see this page <https://tracker.ceph.com/issues/39955>.

Anyway, our cluster is still unhealthy; it just needs time to keep recovering the data :)

Thanks

Alex Gracie <alexandergracie17@xxxxxxxxx> wrote on Thu, Oct 29, 2020 at 10:57 PM:
> We hit this issue over the weekend on our HDD-backed EC Nautilus cluster
> while removing a single OSD. We also did not have any luck using
> compaction. The mon logs filled up the entire root disk on the mon servers,
> and we were running on a single monitor for hours while we tried to finish
> recovery and reclaim space. The past couple of weeks we also noticed "pg not
> scrubbed in time" errors but are unsure if they are related. I'm still not sure of
> the exact cause of this (other than the general misplaced/degraded objects), or
> what kind of growth is acceptable for these store.db files.
>
> In order to get our downed mons restarted, we ended up backing up and
> copying the /var/lib/ceph/mon/* contents to a remote host, setting up an
> sshfs mount to that new host with large NVMe and SSD drives, ensuring the mount
> paths were owned by ceph, and then clearing up enough space on the monitor host
> to start the service. This allowed our store.db directory to grow freely
> until the misplaced/degraded objects could recover, and the monitors all
> rejoined eventually.
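
P.S. For anyone wanting to check their own mon store, below is roughly what we ran to count keys per prefix. Treat it as a sketch rather than an exact recipe: I'm assuming the default mon data path, the "dump-keys" subcommand of ceph-monstore-tool (availability may differ between releases), and that the monitor is stopped first (or that you work on a copy made with "store-copy") so the store is not in use.

    # Stop the local monitor so the store is not open elsewhere.
    systemctl stop ceph-mon@$(hostname -s)

    # Adjust to your mon data directory if it is not the default.
    MON_STORE=/var/lib/ceph/mon/ceph-$(hostname -s)

    # dump-keys prints one "<prefix> <key>" pair per line; count keys per prefix.
    # In our case "logm" and "osdmap" accounted for nearly all of the keys.
    ceph-monstore-tool "$MON_STORE" dump-keys | awk '{print $1}' | sort | uniq -c | sort -rn

    systemctl start ceph-mon@$(hostname -s)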