Recovery of OMAP keys

Hi,

Our cluster has an SSD pool that contains empty objects with about 100k OMAP
keys each (similar to the rgw index pool).

If we restart one of the associated SSD OSDs while writing just a few OMAP keys
to the cluster, I've noticed that PGs take a very long time to recover, and
`ceph status` shows 200k+ keys/s being recovered, even though only perhaps a
couple thousand new keys were created.

```
recovery: 0 B/s, 268.65k keys/s, 2 objects/s
```
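For a rough sense of the discrepancy, here's a back-of-envelope sketch (the numbers are assumptions modelled on our pool layout, not measured values): if recovery only moved the delta, the recovered key count should be on the order of the new keys; if it copies each recovered object's full OMAP set, it scales with keys-per-object times recovered objects.

```python
# Hypothetical numbers modelled on our pool: NOT measured from the cluster.
KEYS_PER_OBJECT = 100_000   # empty objects carrying ~100k OMAP keys each
NEW_KEYS = 2_000            # roughly what we wrote during the OSD restart
RECOVERED_OBJECTS = 50      # assumed objects in PGs "tainted" by the restart

# Delta-based recovery would only move the keys that actually changed.
delta_keys = NEW_KEYS

# Full-object recovery moves every key of every recovered object.
full_copy_keys = KEYS_PER_OBJECT * RECOVERED_OBJECTS

print(f"delta recovery:     {delta_keys:>9,} keys")
print(f"full-copy recovery: {full_copy_keys:>9,} keys")
```

Only the full-copy estimate is in the same ballpark as a sustained 268.65k keys/s in `ceph status`.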

What seems to be happening (though I'd love confirmation from a developer) is
that any PG that was "tainted" while the OSD was restarting gets marked for
recovery, and then, instead of just adding the missing keys, the existing keys
are deleted and recreated. What makes me think a large number of keys are being
deleted is that we're affected by https://tracker.ceph.com/issues/55324 (we're
still running 16.2.7): after the recovery finishes, we do see slow ops caused
by tombstones, and the only way to fix them is to compact the OSD.
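To illustrate why delete-and-recreate lines up with the tombstone symptom, here's a toy model of an LSM-style key store (purely illustrative, not Ceph or RocksDB code): deletes append tombstones rather than removing data, so scans pay for live entries plus tombstones until compaction drops the dead records.

```python
# Toy LSM-style store: a delete appends a tombstone instead of removing data,
# so scan cost grows with (live entries + tombstones) until compaction.
class ToyLSMStore:
    def __init__(self):
        self.entries = []  # append-only log of (key, value or None)

    def put(self, key, value):
        self.entries.append((key, value))

    def delete(self, key):
        self.entries.append((key, None))  # tombstone marker

    def scan(self):
        """Return (live keys, scan cost); cost covers every log entry."""
        latest = {}
        for key, value in self.entries:
            latest[key] = value
        live = sorted(k for k, v in latest.items() if v is not None)
        return live, len(self.entries)

    def compact(self):
        """Drop tombstones and shadowed versions, keeping only live data."""
        latest = {}
        for key, value in self.entries:
            latest[key] = value
        self.entries = [(k, v) for k, v in sorted(latest.items())
                        if v is not None]

store = ToyLSMStore()
for i in range(1000):
    store.put(f"key{i}", "v1")
for i in range(1000):           # delete-and-recreate, as recovery appears to do
    store.delete(f"key{i}")
    store.put(f"key{i}", "v1")

_, cost_before = store.scan()   # 3000 entries scanned for 1000 live keys
store.compact()
_, cost_after = store.scan()    # back down to 1000 after compaction
print(cost_before, cost_after)
```

The data never changed, yet scans got 3x more expensive until compaction, which matches what we observe on the OSDs after recovery.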

Can someone confirm that this is really what's happening? Is this the expected
behavior, or is there a way to make OMAP recovery more efficient?

Cheers,

--
Ben

_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx


