Some additional information:
- I have 4 SSDs per node.
- CPU usage is near 0.
- IO wait is near 0 too.
- Bandwidth usage is also near 0.

The whole cluster seems to be waiting for something... but I don't see what.


On Friday 18 September 2015 at 02:35 +0200, Olivier Bonvalet wrote:
> Hi,
>
> I have a cluster with a lot of blocked operations each time I try to
> move data (by slightly reweighting an OSD).
>
> It's a full SSD cluster, with a 10GbE network.
>
> In the logs, when an OSD is blocked, I can see this on the main OSD:
> 2015-09-18 01:55:16.981396 7f89e8cb8700 0 log [WRN] : 2 slow
> requests, 1 included below; oldest blocked for > 33.976680 secs
> 2015-09-18 01:55:16.981402 7f89e8cb8700 0 log [WRN] : slow request
> 30.125556 seconds old, received at 2015-09-18 01:54:46.855821:
> osd_op(client.29760717.1:18680817544
> rb.0.1c16005.238e1f29.00000000027f [write 180224~16384] 6.c11916a4
> snapc 11065=[11065,10fe7,10f69] ondisk+write e845819) v4 currently
> reached pg
> 2015-09-18 01:55:46.986319 7f89e8cb8700 0 log [WRN] : 2 slow
> requests, 1 included below; oldest blocked for > 63.981596 secs
> 2015-09-18 01:55:46.986324 7f89e8cb8700 0 log [WRN] : slow request
> 60.130472 seconds old, received at 2015-09-18 01:54:46.855821:
> osd_op(client.29760717.1:18680817544
> rb.0.1c16005.238e1f29.00000000027f [write 180224~16384] 6.c11916a4
> snapc 11065=[11065,10fe7,10f69] ondisk+write e845819) v4 currently
> reached pg
>
> How should I read that? What is this OSD waiting for?
>
> Thanks for any help,
>
> Olivier
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
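
For what it's worth, the "slow request" line quoted above can at least be split into its fields mechanically. Below is a rough Python sketch, not anything shipped with Ceph: the regular expression, the function name parse_slow_request and the field names (age, pg, epoch, state, ...) are my own labels, and the pattern is only matched against the single line from this thread, so other op types may need adjusting. My reading of the fields (offset~length for the write, pool.hash for the placement group, e<N> for the osdmap epoch) is an assumption to check against the Ceph documentation. It does not say what the OSD is waiting for; it only makes the last recorded stage ("reached pg") and the age easy to extract and compare across many log lines.

```python
import re

# Pattern for the "slow request" line quoted in this thread.
# Field names are labels chosen for this sketch, not official Ceph terms.
SLOW_REQUEST_RE = re.compile(
    r"slow request (?P<age>[\d.]+) seconds old, "
    r"received at (?P<received>\S+ \S+): "
    r"osd_op\((?P<client>\S+) (?P<object>\S+) "
    r"\[(?P<op>\w+) (?P<offset>\d+)~(?P<length>\d+)\] "
    r"(?P<pg>\S+) .*? e(?P<epoch>\d+)\) "
    r"v\d+ currently (?P<state>.+)$"
)


def parse_slow_request(line):
    """Return a dict of fields from a slow-request line, or None if no match."""
    match = SLOW_REQUEST_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["age"] = float(fields["age"])
    for key in ("offset", "length", "epoch"):
        fields[key] = int(fields[key])
    return fields


# The first slow-request line from the post, with the mail wrapping undone.
LINE = (
    "2015-09-18 01:55:16.981402 7f89e8cb8700 0 log [WRN] : slow request "
    "30.125556 seconds old, received at 2015-09-18 01:54:46.855821: "
    "osd_op(client.29760717.1:18680817544 "
    "rb.0.1c16005.238e1f29.00000000027f [write 180224~16384] 6.c11916a4 "
    "snapc 11065=[11065,10fe7,10f69] ondisk+write e845819) v4 currently "
    "reached pg"
)

if __name__ == "__main__":
    for name, value in parse_slow_request(LINE).items():
        print(f"{name}: {value}")
```

Running this over a whole OSD log and grouping by the pg and state fields would show whether the blocked ops pile up on a few placement groups or are spread across the cluster.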