On Wed, Oct 25, 2017 at 11:29 AM, kefu chai <tchaikov@xxxxxxxxx> wrote:
> hi John and Sage,
>
> as you know, i am working on [1]. but slow-request alerts are pretty
> much a list of strings, in which the first one is a summary, and the
> following ones are the details, like:
>
> - 1 slow requests, 1 included below; oldest blocked for > 30.005692 secs
> - slow request 30.005692 seconds old, received at {date-time}:
> osd_op(client.4240.0:8 benchmark_data_ceph-1_39426_object7 [write
> 0~4194304] 0.69848840) v4 currently waiting for subops from [610]
>
> this fits well into a health_check_t struct. and we can add a field in
> MMgrReport, and send it to mgr periodically. but at the mgr side, it
> is supposed to compose a single std::map<string, health_check_t> in
> MMonMgrReport, and send it to monitor.
>
> if we put all slow requests from all osds into this map with a key
> like "OSD_SLOW_OPS/${osd_id}", the monstore will be loaded by a slow
> cluster, and the "health" section of "ceph status" will be flooded
> with the slow requests. or we can just collect all the slow request
> details into a single bucket of "OSD_SLOW_OPS".

The original MDS health items go into a separate store
(MDS_HEALTHPREFIX in MDSMonitor.cc), with a separate structure for
each MDS. However, since the new encode_health stuff in Luminous,
we're also writing all of those to one data structure in
MDSMonitor::encode_health. So I guess we have exactly the same issue
there as we would for multiple OSD_SLOW_OPS/${osd_id} buckets.

This is perhaps an unacceptable load on the mon in any case, as those
OSD detail messages will keep changing and we'll end up writing
O(N_osds)-sized health objects continuously. We probably need to make
sure that the *persisted* part only contains the slowly-changing
summary (the boolean of whether each OSD has slow ops), and then have
the detail be kept in memory only, somehow.

Would it be terrible to just expect the user to go do a
"ceph tell osd.<id> ..." command to find out about the detail of slow
requests? We could also retain the existing OSD slow request log
messages (at DEBUG severity) so that it is possible for them to find
out some information retroactively too.

John

> but if we just send the summaries from OSDs as the
> "health_check_t::detail" with the alert code of "OSD_SLOW_OPS", all
> the details are practically stripped off. and the total *number* of
> slow requests can be found nowhere unless the user parses the summary
> lines, and sums them up manually.
>
> we could refactor the OpTracker::check_ops_in_flight() so it returns
> an array of info describing slow requests instead of a list of
> human-readable strings. but we still need to face this problem of
> level-of-details.
>
> any thoughts?
>
>
> ---
> [1] https://trello.com/c/8f9y0YM6/51-osd-stateful-health-warnings-to-mgr-mon-eg-slow-requests
>
>
> --
> Regards
> Kefu Chai
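
For illustration, the split suggested in the reply above (persist only
the slowly-changing per-OSD flag, keep the fast-changing detail in
memory) might look roughly like the sketch below. This is not Ceph
code: SlowOpsTracker and all of its methods are hypothetical names made
up for this example, and it does not use the real health_check_t,
MMgrReport or monitor store APIs. It only shows that a report needs to
trigger a persisted write when the set of affected OSDs changes, not
every time the detail strings change.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, for illustration only (not part of Ceph).
class SlowOpsTracker {
public:
  // Record the latest report from one OSD. Returns true only when the
  // persisted summary (the set of OSDs with slow ops) changed, i.e.
  // when a write to the persistent store would actually be needed.
  bool report(int osd_id, std::vector<std::string> details) {
    assert(osd_id >= 0);
    bool changed;
    if (!details.empty()) {
      // insert().second is true only if the OSD was not already flagged.
      changed = slow_osds_.insert(osd_id).second;
      detail_[osd_id] = std::move(details);
    } else {
      // erase() returns the number of removed elements (0 or 1).
      changed = slow_osds_.erase(osd_id) > 0;
      detail_.erase(osd_id);
    }
    return changed;
  }

  // Slowly-changing part: the only state that would be persisted.
  const std::set<int>& persisted_summary() const { return slow_osds_; }

  // Fast-changing part: held in memory only and looked up on demand.
  std::vector<std::string> detail_for(int osd_id) const {
    auto it = detail_.find(osd_id);
    return it == detail_.end() ? std::vector<std::string>{} : it->second;
  }

  // One summary line for a single "OSD_SLOW_OPS" bucket.
  std::string summary() const {
    return std::to_string(slow_osds_.size()) + " osds have slow requests";
  }

private:
  std::set<int> slow_osds_;
  std::map<int, std::vector<std::string>> detail_;
};
```

With this shape, repeated reports from an OSD that is already flagged
update only the in-memory detail and return false, so the persisted
object is rewritten only when an OSD enters or leaves the slow state.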