Hi,
I simply grepped for "slow request" in ceph.log. What exactly do you
mean by "effective OSD"?
If I have this log line:
2017-01-11 [...] osd.16 [...] cluster [WRN] slow request 32.868141
seconds old, received at 2017-01-11 [...]
ack+ondisk+write+known_if_redirected e12440) currently waiting for
subops from 0,12
I assumed that osd.16 was the one causing problems. As for the subops
you mention: I only noticed them yesterday and haven't had time to
investigate further. I'll look into the subops messages and report
back.
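For reference, a rough sketch of how the counting could take the subops
into account instead of a plain grep. It assumes the log format matches
the single line quoted above (reporting OSD as "osd.N" before "[WRN] slow
request", optional "waiting for subops from X,Y" at the end). The function
names are made up for illustration, and counting every listed subop OSD
is only a heuristic, since the message does not say which replica is
actually slow.

```python
import re
from collections import Counter

# Reporting OSD: the "osd.N" token that appears before "[WRN] slow request".
# Assumption: format inferred from one sample line, not from a Ceph spec.
REPORTER_RE = re.compile(r"\bosd\.(\d+)\b.*?\[WRN\] slow request")
# Optional suffix naming the replica OSDs the reporter is still waiting on.
SUBOPS_RE = re.compile(r"waiting for subops from ([\d,\s]+)")


def effective_osds(line):
    """Return the OSD ids a slow request line should be attributed to.

    If the line names subops, return those OSDs (heuristic: all of them,
    because the log does not identify the slow one). Otherwise return the
    reporting OSD. Lines that are not slow request warnings yield [].
    """
    reporter = REPORTER_RE.search(line)
    if reporter is None:
        return []
    subops = SUBOPS_RE.search(line)
    if subops:
        return [int(x) for x in subops.group(1).split(",") if x.strip()]
    return [int(reporter.group(1))]


def count_slow_requests(lines):
    """Count slow requests per attributed OSD over an iterable of log lines."""
    counts = Counter()
    for line in lines:
        counts.update(effective_osds(line))
    return counts
```

Usage would be something like count_slow_requests(open("ceph.log")),
with the path adjusted to wherever the cluster log actually lives.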
Thanks!
Eugen
Zitat von Burkhard Linke <Burkhard.Linke@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>:
Hi,
just for clarity:
Did you parse the slow request messages and use the effective OSD in
the statistics? Some messages may refer to other OSDs, e.g. "waiting
for sub op on OSD X,Y". In that case the reporting OSD is not the root
cause; one of the mentioned OSDs is (and I'm currently not aware of a
method to determine which of the two OSDs is the cause with 3
replicas...).
Regards,
Burkhard
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
--
Eugen Block voice : +49-40-559 51 75
NDE Netzdesign und -entwicklung AG fax : +49-40-559 51 77
Postfach 61 03 15
D-22423 Hamburg e-mail : eblock@xxxxxx
Chairwoman of the Supervisory Board: Angelika Mozdzen
Registered office and register court: Hamburg, HRB 90934
Management Board: Jens-U. Mozdzen
VAT ID No. DE 814 013 983