On Thu, 12 Jan 2017, Chris Jones (BLOOMBERG/ 120 PARK) wrote:
> Hey,
> Our monitoring checks the HEALTH_OK/WARN to trigger alerts which are almost
> always 'X requests are blocked > 32 sec'. This is a multi-petabyte cluster of
> only rgw using spinners.

Out of curiosity, which version is this? There was a bug in early jewel
that could cause scrub to block the OSD thread for extended periods (and
trigger these sorts of warnings).

It might be worth checking the ceph.log to see whether it is the same
OSD(s) hitting the slow requests each time.

s

> What do you guys find to be the best way to
> trigger/monitor? What criteria etc? BTW, most of the 'X' requests are
> usually only '1'. Thanks in advance.
>
> -Chris
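
For the ceph.log check, something along these lines might be a starting point. It is only a rough sketch: it assumes slow-request warnings contain the phrase "slow request" and that the first "osd.N" token on such a line is the daemon that reported it. The default path /var/log/ceph/ceph.log and the function name count_slow_requests are likewise assumptions for illustration, so adjust them to match what your log actually looks like.

```python
import re
import sys
from collections import Counter

# Assumption: the reporting daemon appears as the first "osd.N" token on a line.
OSD_RE = re.compile(r"\bosd\.(\d+)\b")


def count_slow_requests(lines):
    """Count lines mentioning 'slow request', grouped by reporting OSD id."""
    counts = Counter()
    for line in lines:
        # Assumption: slow-request warnings contain this phrase.
        if "slow request" not in line:
            continue
        match = OSD_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical default location; pass the real path as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/ceph/ceph.log"
    with open(path, errors="replace") as fh:
        for osd, n in count_slow_requests(fh).most_common(10):
            print(f"osd.{osd}: {n} slow-request lines")
```

If a handful of OSDs account for most of the lines, that points at specific disks or hosts. An even spread across the cluster suggests looking elsewhere, for example at scrub activity.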