Hello,
about four weeks ago I upgraded my cluster (144 4 TB HDD OSDs, 9
hosts) from 14.2.16 to 14.2.22. The upgrade did not cause any trouble.
The cluster is healthy. One thing is however new since the upgrade and
somewhat irritating:
Every weekend, in the night from Saturday to Sunday, I now see health
warnings about slow ops on some OSDs that I never saw while running
14.2.16. The affected OSDs are not always the same, and I found no
hints in the SMART values or logs that would indicate a failing disk.
On this list I recently saw several other posts, on Nautilus as well
as Octopus, reporting the very same issue.
Is there a way to get around the slow ops warnings, or is this a bug?
Can I check whether Ceph really succeeds in trimming removed
snapshots, or whether it perhaps aborts trimming because of the slow ops?
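While the warnings are active I could probably check directly whether
trimming is still running, e.g. by listing PGs that are in one of the
snaptrim states (assuming I understand the state names correctly):

  # list PGs currently trimming snapshots or waiting to do so
  ceph pg ls snaptrim snaptrim_wait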
In "ceph osd pool health detail" I see a list for one pool that has
about 30 snapshots created and also 30 snapshots deleted each week that
now has 65 removed snaps entries shown as [1~6a,6c~30,9d~2d,cc~a, ...]
in the output. Can I assume that trimming works if this
[1~6a,6c~30,9d~2d,cc~a, ...] list does not get longer each week? Is
there another way to check if trimming works?
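Otherwise I would simply record the removed_snaps line once a week and
compare the number of intervals over time; a rough sketch ('mypool'
and the log path are just placeholders):

  # append the current removed_snaps intervals of one pool to a log file
  ceph osd dump | grep "pool.*'mypool'" | grep -o 'removed_snaps \[[^]]*\]' \
      >> /root/removed_snaps-mypool.log

  # count the intervals in the newest entry; if trimming keeps up,
  # this number should stay roughly constant from week to week
  tail -1 /root/removed_snaps-mypool.log | tr ',' '\n' | wc -l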
Thanks for any hints
Rainer
--
Rainer Krienke, Uni Koblenz, Rechenzentrum, A22, Universitaetsstrasse 1
56070 Koblenz, Web: http://www.uni-koblenz.de/~krienke, Tel: +49261287 1312
PGP: http://www.uni-koblenz.de/~krienke/mypgp.html, Fax: +49261287 1001312