I have been having some problems with the NVMe OSDs on a pre-production system. We run 1715 OSDs, 35 of which are NVMe, and the buckets.index pool lives on those NVMe OSDs. However, I've started seeing slow requests and nearfull warnings on the NVMe OSDs.
Anyone have any suggestions on how to find the root cause?
Check out this sample of my ceph osd df:
ID   CLASS WEIGHT  REWEIGHT SIZE USE  AVAIL %USE  VAR  PGS
1470 nvme  0.46599 1.00000  476G 341G 134G  71.70 9.87 51
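In case it helps, these are the checks I know of for tracking down where the omap data is coming from (bucket names below are placeholders, and the last counter only applies if these are BlueStore OSDs):

ceph health detail                                   # look for LARGE_OMAP_OBJECTS warnings
grep "Large omap object" /var/log/ceph/ceph.log      # the cluster log names the offending index objects
radosgw-admin bucket limit check                     # per-bucket objects-per-shard fill status
radosgw-admin bucket stats --bucket=<bucket-name>
ceph daemon osd.1470 perf dump | grep db_used_bytes  # BlueStore DB usage on one of the hot OSDs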
341G of use for 51 PGs of index data seems like way too much metadata. I have been experimenting with increasing pg_num and modifying the CRUSH map to go from host-based to rack-based fault tolerance (a rough sketch of those changes is below), but I'm fairly sure these bloated omaps were there before I started making changes.
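For reference, the CRUSH and PG changes I mentioned were along these lines (the rule and pool names here are just illustrative, not necessarily my exact ones):

ceph osd crush rule create-replicated nvme-rack default rack nvme   # rack failure domain, nvme device class
ceph osd pool set default.rgw.buckets.index crush_rule nvme-rack
ceph osd pool set default.rgw.buckets.index pg_num 256
ceph osd pool set default.rgw.buckets.index pgp_num 256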