The pidfile bug is already fixed in the master/giant branches. As for the crashing, I'd try killing all the OSD processes and turning them back on again. It might just be that some daemon restarts failed, or your cluster could be so overloaded that the node disks are going unresponsive and the OSDs are suiciding, or...
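As a rough sketch of that (the exact commands depend on how the daemons are managed on your nodes; osd.19 below is just an example ID taken from your log):

  # which OSDs does the cluster currently consider down?
  ceph osd tree | grep down

  # sysvinit-managed installs:
  service ceph restart osd        # all OSDs on this host
  service ceph restart osd.19     # a single OSD

  # upstart-managed installs:
  restart ceph-osd-all
  restart ceph-osd id=19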
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Mon, Sep 15, 2014 at 5:43 AM, Kenneth Waegeman <Kenneth.Waegeman at ugent.be> wrote:
> Hi,
>
> I have some strange OSD problems. Before the weekend I started some rsync
> tests over CephFS, on a cache pool with an underlying EC KV pool. Today the
> cluster is completely degraded:
>
> [root@ceph003 ~]# ceph status
>     cluster 82766e04-585b-49a6-a0ac-c13d9ffd0a7d
>      health HEALTH_WARN 19 pgs backfill_toofull; 403 pgs degraded; 168 pgs
> down; 8 pgs incomplete; 168 pgs peering; 61 pgs stale; 403 pgs stuck
> degraded; 176 pgs stuck inactive; 61 pgs stuck stale; 589 pgs stuck unclean;
> 403 pgs stuck undersized; 403 pgs undersized; 300 requests are blocked > 32
> sec; recovery 15170/27902361 objects degraded (0.054%); 1922/27902361
> objects misplaced (0.007%); 1 near full osd(s)
>      monmap e1: 3 mons at
> {ceph001=10.141.8.180:6789/0,ceph002=10.141.8.181:6789/0,ceph003=10.141.8.182:6789/0},
> election epoch 8, quorum 0,1,2 ceph001,ceph002,ceph003
>      mdsmap e5: 1/1/1 up {0=ceph003=up:active}, 2 up:standby
>      osdmap e719: 48 osds: 18 up, 18 in
>       pgmap v144887: 1344 pgs, 4 pools, 4139 GB data, 2624 kobjects
>             2282 GB used, 31397 GB / 33680 GB avail
>             15170/27902361 objects degraded (0.054%); 1922/27902361 objects
> misplaced (0.007%)
>                   68 down+remapped+peering
>                    1 active
>                  754 active+clean
>                    1 stale+incomplete
>                    1 stale+active+clean+scrubbing
>                   14 active+undersized+degraded+remapped
>                    7 incomplete
>                  100 down+peering
>                    9 active+remapped
>                   59 stale+active+undersized+degraded
>                   19 active+undersized+degraded+remapped+backfill_toofull
>                  311 active+undersized+degraded
>
> I tried to figure out what happened from the global logs:
>
> 2014-09-13 08:01:19.433313 mon.0 10.141.8.180:6789/0 66076 : [INF] pgmap
> v65892: 1344 pgs: 1344 active+clean; 2606 GB data, 3116 GB used, 126 TB /
> 129 TB avail; 4159 kB/s wr, 45 op/s
> 2014-09-13 08:01:20.443019 mon.0 10.141.8.180:6789/0 66078 : [INF] pgmap
> v65893: 1344 pgs: 1344 active+clean; 2606 GB data, 3116 GB used, 126 TB /
> 129 TB avail; 561 kB/s wr, 11 op/s
> 2014-09-13 08:01:20.777988 mon.0 10.141.8.180:6789/0 66081 : [INF] osd.19
> 10.141.8.181:6809/29664 failed (3 reports from 3 peers after 20.000079 >=
> grace 20.000000)
> 2014-09-13 08:01:21.455887 mon.0 10.141.8.180:6789/0 66083 : [INF] osdmap
> e117: 48 osds: 47 up, 48 in
> 2014-09-13 08:01:21.462084 mon.0 10.141.8.180:6789/0 66084 : [INF] pgmap
> v65894: 1344 pgs: 1344 active+clean; 2606 GB data, 3116 GB used, 126 TB /
> 129 TB avail; 1353 kB/s wr, 13 op/s
> 2014-09-13 08:01:21.477007 mon.0 10.141.8.180:6789/0 66085 : [INF] pgmap
> v65895: 1344 pgs: 187 stale+active+clean, 1157 active+clean; 2606 GB data,
> 3116 GB used, 126 TB / 129 TB avail; 2300 kB/s wr, 21 op/s
> 2014-09-13 08:01:22.456055 mon.0 10.141.8.180:6789/0 66086 : [INF] osdmap
> e118: 48 osds: 47 up, 48 in
> 2014-09-13 08:01:22.462590 mon.0 10.141.8.180:6789/0 66087 : [INF] pgmap
> v65896: 1344 pgs: 187 stale+active+clean, 1157 active+clean; 2606 GB data,
> 3116 GB used, 126 TB / 129 TB avail; 13686 kB/s wr, 5 op/s
> 2014-09-13 08:01:23.464302 mon.0 10.141.8.180:6789/0 66088 : [INF] pgmap
> v65897: 1344 pgs: 187 stale+active+clean, 1157 active+clean; 2606 GB data,
> 3116 GB used, 126 TB / 129 TB avail; 11075 kB/s wr, 4 op/s
> 2014-09-13 08:01:24.477467 mon.0 10.141.8.180:6789/0 66089 : [INF] pgmap
> v65898: 1344 pgs: 187 stale+active+clean, 1157 active+clean; 2606 GB data,
> 3116 GB used, 126 TB / 129 TB avail; 4932 kB/s wr, 38 op/s
> 2014-09-13 08:01:25.481027 mon.0 10.141.8.180:6789/0 66090 : [INF] pgmap
> v65899: 1344 pgs: 187 stale+active+clean, 1157 active+clean; 2606 GB data,
> 3116 GB used, 126 TB / 129 TB avail; 5726 kB/s wr, 64 op/s
> 2014-09-13 08:01:19.336173 osd.1 10.141.8.180:6803/26712 54442 : [WRN] 1
> slow requests, 1 included below; oldest blocked for > 30.000137 secs
> 2014-09-13 08:01:19.336341 osd.1 10.141.8.180:6803/26712 54443 : [WRN] slow
> request 30.000137 seconds old, received at 2014-09-13 08:00:49.335339:
> osd_op(client.7448.1:17751783 10000203eac.0000000e [write 0~319488
> [1@-1],startsync 0~0] 1.b6c3a3a9 snapc 1=[] ondisk+write e116) currently
> reached pg
> 2014-09-13 08:01:20.337602 osd.1 10.141.8.180:6803/26712 54444 : [WRN] 7
> slow requests, 6 included below; oldest blocked for > 31.001947 secs
> 2014-09-13 08:01:20.337688 osd.1 10.141.8.180:6803/26712 54445 : [WRN] slow
> request 30.998110 seconds old, received at 2014-09-13 08:00:49.339176:
> osd_op(client.7448.1:17751787 10000203eac.0000000e [write 319488~65536
> [1@-1],startsync 0~0]
>
> This is happening OSD after OSD.
>
> I tried to check the individual OSD logs, but they all stop abruptly (also
> those of the OSDs that are still running):
>
> 2014-09-12 14:25:51.205276 7f3517209700  0 log [WRN] : 41 slow requests, 1
> included below; oldest blocked for > 38.118088 secs
> 2014-09-12 14:25:51.205337 7f3517209700  0 log [WRN] : slow request
> 36.558286 seconds old, received at 2014-09-12 14:25:14.646836:
> osd_op(client.7448.1:2458392 1000006328f.0000000b [write 3989504~204800
> [1@-1],startsync 0~0] 1.9337bf4b snapc 1=[] ondisk+write e116) currently
> reached pg
> 2014-09-12 14:25:53.205586 7f3517209700  0 log [WRN] : 30 slow requests, 1
> included below; oldest blocked for > 40.118530 secs
> 2014-09-12 14:25:53.205679 7f3517209700  0 log [WRN] : slow request
> 30.541026 seconds old, received at 2014-09-12 14:25:22.664538:
> osd_op(client.7448.1:2460291 100000632b7.00000000 [write 0~691
> [1@-1],startsync 0~0] 1.994248a8 snapc 1=[] ondisk+write e116) currently
> reached pg
> 2014-09-12 17:52:40.503917 7f34e8ed2700  0 -- 10.141.8.181:6809/29664 >>
> 10.141.8.181:6847/62389 pipe(0x247ce040 sd=327 :6809 s=0 pgs=0 cs=0 l=1
> c=0x1bc8b9c0).accept replacing existing (lossy) channel (new one lossy=1)
>
> I *think* the absence of the logs is related to another issue I just found
> (http://tracker.ceph.com/issues/9470).
>
> So I can't find out the original problem from the log files.
>
> Is there any other way I can find out what started the crashing of 30 OSDs?
>
> Thanks!!
>
> Kenneth
>
> _______________________________________________
> ceph-users mailing list
> ceph-users at lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
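On the last question above (finding out what started the crashes when the OSD logs themselves are cut off), a generic sketch of other places to look on the OSD nodes, assuming root access and the default /var/log/ceph and /var/lib/ceph/osd locations:

  # did the OOM killer or disk errors take the daemons out?
  dmesg | grep -Ei 'out of memory|killed process|i/o error'

  # any heartbeat/suicide timeouts or failed asserts left in the OSD logs?
  grep -i -e 'suicide timeout' -e 'FAILED assert' /var/log/ceph/ceph-osd.*.log

  # are the OSD filesystems still mounted, responsive, and not full?
  df -h /var/lib/ceph/osd/*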