On 12/22/2014 07:04 PM, Gregory Farnum wrote:
> On Mon, Dec 22, 2014 at 8:20 AM, Wido den Hollander <wido@xxxxxxxx> wrote:
>> Hi,
>>
>> While investigating slow requests on a Firefly (0.80.7) cluster I
>> looked at the historic ops from the admin socket.
>>
>> On an OSD which had just spit out some slow requests I noticed:
>>
>> "received_at": "2014-12-22 17:08:41.496391",
>> "age": "9.948475",
>> "duration": "5.915489"
>>
>> { "time": "2014-12-22 17:08:41.496687",
>>   "event": "waiting_for_osdmap"},
>> { "time": "2014-12-22 17:08:46.216946",
>>   "event": "reached_pg"},
>>
>> It spent almost 5 seconds in "waiting_for_osdmap".
>>
>> Another request:
>>
>> "received_at": "2014-12-22 17:08:41.499092",
>> "age": "9.945774",
>> "duration": "9.851261",
>>
>> { "time": "2014-12-22 17:08:41.499322",
>>   "event": "waiting_for_osdmap"},
>> { "time": "2014-12-22 17:08:51.349938",
>>   "event": "reached_pg"}
>>
>> How should I read this? What is the OSD actually doing?
>>
>> In this case it is an RBD workload with all clients running 0.80.5
>> librados.
>>
>> The mons are in quorum, time is in sync, and there are no osdmap
>> changes happening at this moment.
>>
>> An earlier thread [0] suggested that it might also be a PG issue where
>> requests are serialized.
>>
>> I do on some occasions see disks spiking to 100% busy for some time,
>> but I just want to understand the waiting_for_osdmap state better to
>> fully understand what is happening there.
>
> What message types are these? The waiting_for_osdmap state is supposed
> to cover only that, but there might be some overlooked blocking points
> or something.

For example, two ops:

#1:

{ "description": "osd_sub_op(client.2433432.0:61603164 20.424 19038c24\/rbd_data.d7c912ae8944a.00000000000008b6\/head\/\/20 [] v 63283'8301089 snapset=0=[]:[] snapc=0=[])",
  "received_at": "2014-12-22 19:26:37.458680",
  "age": "2.719850",
  "duration": "2.520937",
  "type_data": [
      "commit sent; apply or cleanup",
      [
          { "time": "2014-12-22 19:26:37.458914",
            "event": "waiting_for_osdmap"},
          { "time": "2014-12-22 19:26:39.310569",
            "event": "reached_pg"},
          { "time": "2014-12-22 19:26:39.310728",
            "event": "started"},
          { "time": "2014-12-22 19:26:39.310951",
            "event": "started"},
          { "time": "2014-12-22 19:26:39.979292",
            "event": "commit_queued_for_journal_write"},
          { "time": "2014-12-22 19:26:39.979348",
            "event": "write_thread_in_journal_buffer"},
          { "time": "2014-12-22 19:26:39.979594",
            "event": "journaled_completion_queued"},
          { "time": "2014-12-22 19:26:39.979617",
            "event": "commit_sent"}]]},

#2:

{ "description": "osd_sub_op(client.2188703.0:10420738 20.641 6673ee41\/rbd_data.9497e32794ff7.0000000000000454\/head\/\/20 [] v 63283'5215076 snapset=0=[]:[] snapc=0=[])",
  "received_at": "2014-12-22 19:26:38.040551",
  "age": "2.137979",
  "duration": "1.537128",
  "type_data": [
      "started",
      [
          { "time": "2014-12-22 19:26:38.040717",
            "event": "waiting_for_osdmap"},
          { "time": "2014-12-22 19:26:39.577609",
            "event": "reached_pg"},
          { "time": "2014-12-22 19:26:39.577624",
            "event": "started"},
          { "time": "2014-12-22 19:26:39.577679",
            "event": "started"}]]},

Could this have something to do with the number of RBD snapshots, since I
see snapc involved in both ops?

This cluster has about 4,000 RBD images and about 1,000 of them have a
snapshot. The total number of OSDs in this case is 174.
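In case it is useful, below is a minimal sketch of one way to pull these
ops and compute the gaps between consecutive events. It only assumes the
dump_historic_ops JSON format shown above; the OSD id, the "ceph daemon"
invocation and the 1 second threshold are just examples and may need
adjusting:

#!/usr/bin/env python
# Minimal sketch: dump historic ops from an OSD admin socket and report
# how long each op spent between consecutive events, flagging long stays
# such as waiting_for_osdmap -> reached_pg.  The OSD id and the threshold
# below are examples only.

import json
import subprocess
from datetime import datetime

OSD_ID = 3          # example OSD id, adjust as needed
THRESHOLD = 1.0     # seconds; report event gaps longer than this

def parse_ts(ts):
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S.%f")

def dump_historic_ops(osd_id):
    out = subprocess.check_output(
        ["ceph", "daemon", "osd.%d" % osd_id, "dump_historic_ops"])
    data = json.loads(out)
    # older releases use "Ops" as the top-level key, newer ones "ops"
    return data.get("ops") or data.get("Ops") or []

def event_gaps(op):
    # the event list is the last element of type_data, both for osd_op
    # (three elements) and osd_sub_op (two elements) entries
    events = op["type_data"][-1]
    for prev, cur in zip(events, events[1:]):
        gap = (parse_ts(cur["time"]) - parse_ts(prev["time"])).total_seconds()
        yield prev["event"], cur["event"], gap

if __name__ == "__main__":
    for op in dump_historic_ops(OSD_ID):
        for ev_from, ev_to, gap in event_gaps(op):
            if gap > THRESHOLD:
                print("%6.3fs  %s -> %s  %s" %
                      (gap, ev_from, ev_to, op["description"][:60]))

Run on the node hosting the OSD, this would flag the roughly 1.8 and 1.5
second waiting_for_osdmap -> reached_pg gaps in the two ops above.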
> -Greg
>

--
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on