Re: Lot of blocked operations

In that case it can indeed be slow monitors (caused by a slow network, slow disks(!!!), or a CPU or memory problem).
But it can also be on the OSD side, in the form of either CPU usage or memory pressure. In my case a lot of memory was used for the page cache (so for all intents and purposes considered "free"), but during peering the OSD had trouble allocating memory from it, which caused lots of slow ops and left peering hanging for a while. This also doesn't show up as high CPU usage; only kswapd spins up a bit (don't be fooled by its name, it has nothing to do with swap in this case).
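This kind of pressure is easy to miss precisely because the memory looks free. A quick way to spot it with nothing but procfs (a sketch, not a diagnostic tool; counter names vary slightly between kernel versions):

```shell
# How much of the "free" memory is really page cache
grep -E '^(MemFree|MemAvailable|Cached):' /proc/meminfo

# kswapd reclaim activity: these counters climbing while swap stays idle
# is exactly the "kswapd spins up but nothing swaps" symptom above
grep -E '^(pgscan_kswapd|pgsteal)' /proc/vmstat
```

If the pgscan/pgsteal counters jump every time peering starts, reclaim stalls are a plausible culprit.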

Running echo 1 >/proc/sys/vm/drop_caches before I touch anything has become a routine for me now, and that problem is gone.
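For completeness, the routine is just the two lines below (root required; the sync first writes back dirty pages so they can actually be dropped). The min_free_kbytes line is a possibly gentler alternative worth trying; the 512 MB value is only an illustration, not something I've benchmarked:

```shell
# Flush the page cache before rebalancing (root required)
sync
echo 1 > /proc/sys/vm/drop_caches

# Possible less drastic alternative: keep a larger free-memory reserve so
# allocations during peering don't stall on reclaim.
# 524288 kB (= 512 MB) is an example value, not a recommendation.
sysctl -w vm.min_free_kbytes=524288
```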

Jan

> On 18 Sep 2015, at 10:53, Olivier Bonvalet <ceph.list@xxxxxxxxx> wrote:
> 
> mmm good point.
> 
> I don't see CPU or IO problems on the mons, but in the logs I have this:
> 
> 2015-09-18 01:55:16.921027 7fb951175700  0 log [INF] : pgmap v86359128: 6632 pgs: 77 inactive, 1 remapped, 10 active+remapped+wait_backfill, 25 peering, 5 active+remapped, 6 active+remapped+backfilling, 6499 active+clean, 9 remapped+peering; 18974 GB data, 69004 GB used, 58578 GB / 124 TB avail; 915 kB/s rd, 26383 kB/s wr, 1671 op/s; 8417/15680513 objects degraded (0.054%); 1062 MB/s, 274 objects/s recovering
> 
> 
> So... it can be a peering problem. Didn't see that, thanks.
> 
> 
> 
> Le vendredi 18 septembre 2015 à 09:52 +0200, Jan Schermer a écrit :
>> Could this be caused by monitors? In my case lagging monitors can
>> also cause slow requests (because of slow peering). Not sure if
>> that's expected or not, but it of course doesn't show on the OSDs as
>> any kind of bottleneck when you try to investigate...
>> 
>> Jan
>> 
>>> On 18 Sep 2015, at 09:37, Olivier Bonvalet <ceph.list@xxxxxxxxx>
>>> wrote:
>>> 
>>> Hi,
>>> 
>>> sorry for the missing information. I was trying to avoid including
>>> too much irrelevant info ;)
>>> 
>>> 
>>> 
>>> Le vendredi 18 septembre 2015 à 12:30 +0900, Christian Balzer a
>>> écrit :
>>>> Hello,
>>>> 
>>>> On Fri, 18 Sep 2015 02:43:49 +0200 Olivier Bonvalet wrote:
>>>> 
>>>> The items below help, but be as specific as possible, from OS and
>>>> kernel version to Ceph version, "ceph -s" output, and any other
>>>> specific details (pool type, replica size).
>>>> 
>>> 
>>> So, all nodes run Debian Wheezy, on a vanilla 3.14.x kernel,
>>> and Ceph 0.80.10.
>>> I don't have a ceph status right now, but I have
>>> data to move tonight again, so I'll track it then.
>>> 
>>> The affected pool is a standard one (no erasure coding), with only
>>> 2 replicas (size=2).
>>> 
>>> 
>>> 
>>> 
>>>>> Some additionnal informations :
>>>>> - I have 4 SSD per node.
>>>> Type, if nothing else for anecdotal reasons.
>>> 
>>> I have 7 storage nodes here:
>>> - 3 nodes with 12 OSDs each, on 300 GB SSDs
>>> - 4 nodes with 4 OSDs each, on 800 GB SSDs
>>> 
>>> And I'm trying to replace the 12x300GB nodes with the 4x800GB nodes.
>>> 
>>> 
>>> 
>>>>> - the CPU usage is near 0
>>>>> - IO wait is near 0 too
>>>> Including the trouble OSD(s)?
>>> 
>>> Yes
>>> 
>>> 
>>>> Measured how, iostat or atop?
>>> 
>>> iostat, htop, and confirmed with our Zabbix monitoring.
>>> 
>>> 
>>> 
>>> 
>>>>> - bandwidth usage is also near 0
>>>>> 
>>>> Yeah, all of the above are not surprising if everything is stuck
>>>> waiting
>>>> on some ops to finish. 
>>>> 
>>>> How many nodes are we talking about?
>>> 
>>> 
>>> 7 nodes, 52 OSDs.
>>> 
>>> 
>>> 
>>>>> The whole cluster seems waiting for something... but I don't
>>>>> see
>>>>> what.
>>>>> 
>>>> Is it just one specific OSD (or a set of them) or is that all
>>>> over
>>>> the
>>>> place?
>>> 
>>> A set of them. When I increase the weight of all 4 OSDs of a node, I
>>> frequently get blocked IO from one OSD of that node.
>>> 
>>> 
>>> 
>>>> Does restarting the OSD fix things?
>>> 
>>> Yes. For several minutes.
>>> 
>>> 
>>>> Christian
>>>>> 
>>>>> Le vendredi 18 septembre 2015 à 02:35 +0200, Olivier Bonvalet a
>>>>> écrit :
>>>>>> Hi,
>>>>>> 
>>>>>> I have a cluster with a lot of blocked operations every time I try
>>>>>> to move data (by slightly reweighting an OSD).
>>>>>> 
>>>>>> It's a full SSD cluster, with 10GbE network.
>>>>>> 
>>>>>> In the logs, when I have a blocked OSD, I can see this on the
>>>>>> primary OSD:
>>>>>> 2015-09-18 01:55:16.981396 7f89e8cb8700  0 log [WRN] : 2 slow requests, 1 included below; oldest blocked for > 33.976680 secs
>>>>>> 2015-09-18 01:55:16.981402 7f89e8cb8700  0 log [WRN] : slow request 30.125556 seconds old, received at 2015-09-18 01:54:46.855821: osd_op(client.29760717.1:18680817544 rb.0.1c16005.238e1f29.00000000027f [write 180224~16384] 6.c11916a4 snapc 11065=[11065,10fe7,10f69] ondisk+write e845819) v4 currently reached pg
>>>>>> 2015-09-18 01:55:46.986319 7f89e8cb8700  0 log [WRN] : 2 slow requests, 1 included below; oldest blocked for > 63.981596 secs
>>>>>> 2015-09-18 01:55:46.986324 7f89e8cb8700  0 log [WRN] : slow request 60.130472 seconds old, received at 2015-09-18 01:54:46.855821: osd_op(client.29760717.1:18680817544 rb.0.1c16005.238e1f29.00000000027f [write 180224~16384] 6.c11916a4 snapc 11065=[11065,10fe7,10f69] ondisk+write e845819) v4 currently reached pg
>>>>>> 
>>>>>> How should I read that? What is this OSD waiting for?
>>>>>> 
>>>>>> Thanks for any help,
>>>>>> 
>>>>>> Olivier
>>>>>> _______________________________________________
>>>>>> ceph-users mailing list
>>>>>> ceph-users@xxxxxxxxxxxxxx
>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>> 
>>>> 
>>>> 
>> 
>> 
