Re: requests are blocked

The hung kernel tasks under -327 with XFS resulted in log verification failures and completely locked up the hosts.
With BTRFS we could get around the task timeouts by setting kernel.hung_task_timeout_secs = 960.
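
For anyone wanting to try the same workaround, here is a minimal Python sketch of checking and raising that timeout. It assumes the standard procfs path /proc/sys/kernel/hung_task_timeout_secs and root privileges for the write; the helper names are made up for illustration and are not part of any tool. Keep in mind that this only changes how long a task may stay blocked before the kernel's hung-task detector reports it. It does not fix the underlying stall.

```python
from pathlib import Path

# Standard procfs location for the sysctl kernel.hung_task_timeout_secs.
HUNG_TASK_PATH = Path("/proc/sys/kernel/hung_task_timeout_secs")


def read_hung_task_timeout(path=HUNG_TASK_PATH):
    """Return the current hung-task timeout in seconds."""
    return int(Path(path).read_text().strip())


def set_hung_task_timeout(seconds, path=HUNG_TASK_PATH):
    """Write a new timeout (needs root on a real host); return the previous value."""
    if seconds < 0:
        raise ValueError("timeout must be non-negative")
    previous = read_hung_task_timeout(path)
    Path(path).write_text(f"{seconds}\n")
    return previous
```

Calling set_hung_task_timeout(960) as root would apply the value used above until the next reboot; the `path` argument exists only so the helpers can be tried against a scratch file first.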

The host would eventually become responsive again, but that doesn't really matter: the Ceph ops are blocked for so long that it all goes to hell anyway.
I only found stability under high load with EXT4, or on -229 with either BTRFS or EXT4.

Bad story, sorry to have to tell it.

-Wade


On Tue, Dec 22, 2015 at 9:44 AM Dan Nica <dan.nica@xxxxxxxxxxxxxxxxxxxx> wrote:

That is strange. Maybe there is a sysctl option to tweak on the OSDs? This will be nasty if it reaches our production!

 

--

Dan

 

From: Wade Holler [mailto:wade.holler@xxxxxxxxx]
Sent: Tuesday, December 22, 2015 4:36 PM
To: Dan Nica <dan.nica@xxxxxxxxxxxxxxxxxxxx>; ceph-users@xxxxxxxxxxxxxx
Subject: Re: requests are blocked

 

I had major host stability problems under load with -327. Repeatable test cases under high load with XFS or BTRFS would result in hung kernel tasks and, of course, the sympathetic behavior you mention.

"Requests are blocked" usually means that the op tracker in Ceph hasn't received a timely response from the OSD. I'm sure someone more seasoned can provide a better explanation.
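
To make the idea a bit more concrete, here is a toy model of an op tracker in Python. It is only a sketch of the general technique (record a start time per op, flag anything outstanding too long) and is not Ceph's actual implementation; the class name, method names, and the 32-second default (borrowed from the status line quoted below) are assumptions for illustration.

```python
import time


class OpTracker:
    """Toy tracker: remembers when each op started and flags old ones."""

    def __init__(self, warn_after=32.0, clock=time.monotonic):
        self.warn_after = warn_after  # seconds an op may stay outstanding
        self.clock = clock            # injectable so the model can be tested
        self.in_flight = {}           # op id -> start time

    def start(self, op_id):
        """Record that an op was issued and is awaiting a response."""
        self.in_flight[op_id] = self.clock()

    def finish(self, op_id):
        """Forget an op once its response has arrived."""
        self.in_flight.pop(op_id, None)

    def blocked(self):
        """Return ids of ops outstanding for longer than warn_after."""
        now = self.clock()
        return sorted(op for op, started in self.in_flight.items()
                      if now - started > self.warn_after)
```

In this model, an op that never gets finish() called, for example because the host serving it is stuck on a hung kernel task, stays in blocked() indefinitely.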

-Wade

 

On Tue, Dec 22, 2015 at 9:24 AM Dan Nica <dan.nica@xxxxxxxxxxxxxxxxxxxx> wrote:

Hi

 

I am trying to run a bench test on an RBD image, and from time to time I get the following in ceph status:

 

    cluster 046b0180-dc3f-4846-924f-41d9729d48c8
     health HEALTH_WARN
            2 requests are blocked > 32 sec
     monmap e1: 3 mons at {alder=10.6.250.249:6789/0,ash=10.6.250.248:6789/0,aspen=10.6.250.247:6789/0}
            election epoch 18, quorum 0,1,2 aspen,ash,alder
     osdmap e114: 6 osds: 6 up, 6 in
            flags sortbitwise
      pgmap v3816: 192 pgs, 1 pools, 23062 MB data, 5814 objects
            46406 MB used, 44624 GB / 44670 GB avail
                 192 active+clean
  client io 6083 B/s rd, 18884 kB/s wr, 75 op/s
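
To keep an eye on this while the bench runs, something like the following Python sketch could pull the blocked-request count out of status text such as the above. The function name is hypothetical, the regex only matches the exact wording shown in this output, and capturing the text (for example by running `ceph status`) is left out.

```python
import re

# Matches the health detail line, e.g. "2 requests are blocked > 32 sec".
BLOCKED_RE = re.compile(r"(\d+) requests are blocked > (\d+) sec")


def blocked_requests(status_text):
    """Return a list of (count, seconds) tuples found in the status text."""
    return [(int(count), int(secs))
            for count, secs in BLOCKED_RE.findall(status_text)]
```

An empty list means no such line was present in the text that was passed in.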

 

 

What does “requests are blocked” mean? And why does performance drop to almost 0?

I am running the Infernalis release on CentOS 7, kernel 3.10.0-327.3.1.el7.x86_64.

 

Thanks

--

Dan

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
