[Ceph-community] Pgs are in stale+down+peering state

Hi All,

Here are the steps I followed to get all pgs back to the active+clean state (a rough sketch of the corresponding commands follows the list). I still don't know the root cause of this pg state.

1. Force create pgs which are in stale+down+peering
2. Stop osd.12
3. Mark osd.12 as lost
4. Start osd.12
5. All pgs were back to active+clean state
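
For reference, a minimal sketch of the commands behind these steps, assuming a Firefly-era cluster on Ubuntu with Upstart-managed daemons (adjust the service commands for your init system):

  # 1. Force-create the pgs stuck in stale+down+peering (ids from 'ceph health detail')
  ceph pg force_create_pg 0.4d
  ceph pg force_create_pg 0.49
  ceph pg force_create_pg 0.1c

  # 2. Stop osd.12 on its host
  sudo stop ceph-osd id=12

  # 3. Mark osd.12 as lost
  ceph osd lost 12 --yes-i-really-mean-it

  # 4. Start osd.12 again
  sudo start ceph-osd id=12

  # 5. Confirm all pgs are back to active+clean
  ceph -s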

Thanks
Sahana Lokeshappa
Test Development Engineer I
SanDisk Corporation
3rd Floor, Bagmane Laurel, Bagmane Tech Park
C V Raman nagar, Bangalore 560093
T: +918042422283 
Sahana.Lokeshappa at SanDisk.com


-----Original Message-----
From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Sahana Lokeshappa
Sent: Thursday, September 25, 2014 1:26 PM
To: Sage Weil
Cc: ceph-users at ceph.com
Subject: Re: [Ceph-community] Pgs are in stale+down+peering state

Replies inline:

Sahana Lokeshappa
Test Development Engineer I
SanDisk Corporation
3rd Floor, Bagmane Laurel, Bagmane Tech Park C V Raman nagar, Bangalore 560093
T: +918042422283
Sahana.Lokeshappa at SanDisk.com

-----Original Message-----
From: Sage Weil [mailto:sweil@xxxxxxxxxx]
Sent: Wednesday, September 24, 2014 6:10 PM
To: Sahana Lokeshappa
Cc: Varada Kari; ceph-users at ceph.com
Subject: RE: [Ceph-community] Pgs are in stale+down+peering state

On Wed, 24 Sep 2014, Sahana Lokeshappa wrote:
> 2.a9    518     0       0       0       0       2172649472      3001    3001    active+clean    2014-09-22 17:49:35.357586      6826'35762      17842:72706     [12,7,28]       12      [12,7,28]       12      6826'35762      2014-09-22 11:33:55.985449      0'0     2014-09-16 20:11:32.693864

Can you verify that 2.a9 exists in the data directory for 12, 7, and/or 28?  If so, the next step would be to enable logging (debug osd = 20, debug ms = 1) and see why peering is stuck...
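
[For reference, a minimal sketch of both checks, assuming the default FileStore data path /var/lib/ceph/osd/ceph-<id>:

  # on the node hosting osd.12 (likewise for osds 7 and 28)
  ls -d /var/lib/ceph/osd/ceph-12/current/2.a9_head

  # raise the debug levels on the running daemon without a restart
  ceph tell osd.12 injectargs '--debug-osd 20 --debug-ms 1']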

Yes, the 2.a9 directories are present on osd.12, 7, and 28,

and the 0.49, 0.4d, and 0.1c directories are not present on their respective acting OSDs.


Here are the logs I can see after the debug levels were raised to 20:


2014-09-24 18:38:41.706566 7f92e2dc8700  7 osd.12 pg_epoch: 17850 pg[2.738( v 6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 luod=0'0 crt=0'0 lcod 0'0 active] replica_scrub
2014-09-24 18:38:41.706586 7f92e2dc8700 10 osd.12 pg_epoch: 17850 pg[2.738( v 6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 luod=0'0 crt=0'0 lcod 0'0 active] build_scrub_map
2014-09-24 18:38:41.706592 7f92e2dc8700 20 osd.12 pg_epoch: 17850 pg[2.738( v 6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 luod=0'0 crt=0'0 lcod 0'0 active] scrub_map_chunk [476de738//0//-1,f38//0//-1)
2014-09-24 18:38:41.711778 7f92e2dc8700 10 osd.12 pg_epoch: 17850 pg[2.738( v 6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 luod=0'0 crt=0'0 lcod 0'0 active] _scan_list scanning 23 objects deeply
2014-09-24 18:38:41.730881 7f92ed5dd700 20 osd.12 17850 share_map_peer 0x89cda20 already has epoch 17850
2014-09-24 18:38:41.731111 7f92eede0700 20 osd.12 17850 share_map_peer 0x89cda20 already has epoch 17850
2014-09-24 18:38:41.822444 7f92ed5dd700 20 osd.12 17850 share_map_peer 0xd2eb080 already has epoch 17850
2014-09-24 18:38:41.822519 7f92eede0700 20 osd.12 17850 share_map_peer 0xd2eb080 already has epoch 17850
2014-09-24 18:38:41.878894 7f92eede0700 20 osd.12 17850 share_map_peer 0xd5cd5a0 already has epoch 17850
2014-09-24 18:38:41.878921 7f92ed5dd700 20 osd.12 17850 share_map_peer 0xd5cd5a0 already has epoch 17850
2014-09-24 18:38:41.918307 7f92ed5dd700 20 osd.12 17850 share_map_peer 0x1161bde0 already has epoch 17850
2014-09-24 18:38:41.918426 7f92eede0700 20 osd.12 17850 share_map_peer 0x1161bde0 already has epoch 17850
2014-09-24 18:38:41.951678 7f92ed5dd700 20 osd.12 17850 share_map_peer 0x7fc5700 already has epoch 17850
2014-09-24 18:38:41.951709 7f92eede0700 20 osd.12 17850 share_map_peer 0x7fc5700 already has epoch 17850
2014-09-24 18:38:42.064759 7f92e2dc8700 10 osd.12 pg_epoch: 17850 pg[2.738( v 6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 luod=0'0 crt=0'0 lcod 0'0 active] build_scrub_map_chunk done.
2014-09-24 18:38:42.107016 7f92ed5dd700 20 osd.12 17850 share_map_peer 0x10377b80 already has epoch 17850
2014-09-24 18:38:42.107032 7f92eede0700 20 osd.12 17850 share_map_peer 0x10377b80 already has epoch 17850
2014-09-24 18:38:42.109356 7f92f15e5700 10 osd.12 17850 do_waiters -- start
2014-09-24 18:38:42.109372 7f92f15e5700 10 osd.12 17850 do_waiters -- finish
2014-09-24 18:38:42.109373 7f92f15e5700 20 osd.12 17850 _dispatch 0xeb0d900 replica scrub(pg: 2.738,from:0'0,to:6489'28646,epoch:17850,start:f38//0//-1,end:92371f38//0//-1,chunky:1,deep:1,version:5) v5
2014-09-24 18:38:42.109378 7f92f15e5700 10 osd.12 17850 queueing MOSDRepScrub replica scrub(pg: 2.738,from:0'0,to:6489'28646,epoch:17850,start:f38//0//-1,end:92371f38//0//-1,chunky:1,deep:1,version:5) v5
2014-09-24 18:38:42.109395 7f92f15e5700 10 osd.12 17850 do_waiters -- start
2014-09-24 18:38:42.109396 7f92f15e5700 10 osd.12 17850 do_waiters -- finish
2014-09-24 18:38:42.109456 7f92e2dc8700  7 osd.12 pg_epoch: 17850 pg[2.738( v 6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 luod=0'0 crt=0'0 lcod 0'0 active] replica_scrub
2014-09-24 18:38:42.109522 7f92e2dc8700 10 osd.12 pg_epoch: 17850 pg[2.738( v 6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 luod=0'0 crt=0'0 lcod 0'0 active] build_scrub_map
2014-09-24 18:38:42.109529 7f92e2dc8700 20 osd.12 pg_epoch: 17850 pg[2.738( v 6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 luod=0'0 crt=0'0 lcod 0'0 active] scrub_map_chunk [f38//0//-1,92371f38//0//-1)
2014-09-24 18:38:42.112545 7f92e2dc8700 10 osd.12 pg_epoch: 17850 pg[2.738( v 6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 luod=0'0 crt=0'0 lcod 0'0 active] _scan_list scanning 25 objects deeply
2014-09-24 18:38:42.130581 7f92ed5dd700 20 osd.12 17850 share_map_peer 0x1161a3c0 already has epoch 17850
2014-09-24 18:38:42.130613 7f92eede0700 20 osd.12 17850 share_map_peer 0x1161a3c0 already has epoch 17850
2014-09-24 18:38:42.245173 7f92eede0700 20 osd.12 17850 share_map_peer 0xd0c8840 already has epoch 17850
2014-09-24 18:38:42.245227 7f92ed5dd700 20 osd.12 17850 share_map_peer 0xd0c8840 already has epoch 17850
2014-09-24 18:38:42.325068 7f92ed5dd700 20 osd.12 17850 share_map_peer 0xd2eb080 already has epoch 17850
2014-09-24 18:38:42.325106 7f92eede0700 20 osd.12 17850 share_map_peer 0xd2eb080 already has epoch 17850
2014-09-24 18:38:42.423516 7f92ed5dd700 20 osd.12 17850 share_map_peer 0x89cdb80 already has epoch 17850
2014-09-24 18:38:42.423546 7f92eede0700 20 osd.12 17850 share_map_peer 0x89cdb80 already has epoch 17850
2014-09-24 18:38:42.487722 7f92ed5dd700 20 osd.12 17850 share_map_peer 0xd0caaa0 already has epoch 17850
2014-09-24 18:38:42.487755 7f92eede0700 20 osd.12 17850 share_map_peer 0xd0caaa0 already has epoch 17850
2014-09-24 18:38:42.494641 7f92e2dc8700 10 osd.12 pg_epoch: 17850 pg[2.738( v 6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 luod=0'0 crt=0'0 lcod 0'0 active] build_scrub_map_chunk done.
2014-09-24 18:38:42.512676 7f92eede0700 20 osd.12 17850 share_map_peer 0x89cdce0 already has epoch 17850
2014-09-24 18:38:42.512900 7f92ed5dd700 20 osd.12 17850 share_map_peer 0x89cdce0 already has epoch 17850
2014-09-24 18:38:42.516290 7f92eede0700 20 osd.12 17850 share_map_peer 0x115ad760 already has epoch 17850
2014-09-24 18:38:42.516326 7f92ed5dd700 20 osd.12 17850 share_map_peer 0x115ad760 already has epoch 17850
2014-09-24 18:38:42.543752 7f92f15e5700 10 osd.12 17850 do_waiters -- start
2014-09-24 18:38:42.543755 7f92f15e5700 10 osd.12 17850 do_waiters -- finish
2014-09-24 18:38:42.543757 7f92f15e5700 20 osd.12 17850 _dispatch 0xdb3e080 replica scrub(pg: 2.738,from:0'0,to:6699'28814,epoch:17850,start:92371f38//0//-1,end:779f2f38//0//-1,chunky:1,deep:1,version:5) v5
2014-09-24 18:38:42.543761 7f92f15e5700 10 osd.12 17850 queueing MOSDRepScrub replica scrub(pg: 2.738,from:0'0,to:6699'28814,epoch:17850,start:92371f38//0//-1,end:779f2f38//0//-1,chunky:1,deep:1,version:5) v5
2014-09-24 18:38:42.543769 7f92f15e5700 10 osd.12 17850 do_waiters -- start
2014-09-24 18:38:42.543771 7f92f15e5700 10 osd.12 17850 do_waiters -- finish
2014-09-24 18:38:42.543805 7f92e2dc8700  7 osd.12 pg_epoch: 17850 pg[2.738( v 6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 luod=0'0 crt=0'0 lcod 0'0 active] replica_scrub
2014-09-24 18:38:42.543857 7f92e2dc8700 10 osd.12 pg_epoch: 17850 pg[2.738( v 6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 luod=0'0 crt=0'0 lcod 0'0 active] build_scrub_map
2014-09-24 18:38:42.543864 7f92e2dc8700 20 osd.12 pg_epoch: 17850 pg[2.738( v 6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 luod=0'0 crt=0'0 lcod 0'0 active] scrub_map_chunk [92371f38//0//-1,779f2f38//0//-1)
2014-09-24 18:38:42.546976 7f92e2dc8700 10 osd.12 pg_epoch: 17850 pg[2.738( v 6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 luod=0'0 crt=0'0 lcod 0'0 active] _scan_list scanning 25 objects deeply
2014-09-24 18:38:42.581646 7f92eede0700 20 osd.12 17850 share_map_peer 0x101a1020 already has epoch 17850
2014-09-24 18:38:42.581686 7f92ed5dd700 20 osd.12 17850 share_map_peer 0x101a1020 already has epoch 17850
2014-09-24 18:38:42.632620 7f92eede0700 20 osd.12 17850 share_map_peer 0x1161a3c0 already has epoch 17850
2014-09-24 18:38:42.632750 7f92ed5dd700 20 osd.12 17850 share_map_peer 0x1161a3c0 already has epoch 17850
2014-09-24 18:38:42.675831 7f92fcf9e700  5 osd.12 17850 tick
2014-09-24 18:38:42.675849 7f92fcf9e700 20 osd.12 17850 scrub_random_backoff lost coin flip, randomly backing off
2014-09-24 18:38:42.675850 7f92fcf9e700 10 osd.12 17850 do_waiters -- start
2014-09-24 18:38:42.675851 7f92fcf9e700 10 osd.12 17850 do_waiters -- finish
2014-09-24 18:38:42.723996 7f92ed5dd700 20 osd.12 17850 share_map_peer 0x103b55a0 already has epoch 17850
2014-09-24 18:38:42.724063 7f92eede0700 20 osd.12 17850 share_map_peer 0x103b55a0 already has epoch 17850

sage

>
> 0.59    0       0       0       0       0       0       0       0       active+clean    2014-09-22 17:50:00.751218      0'0     17842:4472      [12,41,2]       12      [12,41,2]       12      0'0     2014-09-22 16:47:09.315499      0'0     2014-09-16 12:20:48.618726
>
> 0.4d    0       0       0       0       0       0       4       4       stale+down+peering      2014-09-18 17:51:10.038247      186'4   11134:498       [12,56,27]      12      [12,56,27]      12      186'4   2014-09-18 17:30:32.393188      0'0     2014-09-16 12:20:48.615322
>
> 0.49    0       0       0       0       0       0       0       0       stale+down+peering      2014-09-18 17:44:52.681513      0'0     11134:498       [12,6,25]       12      [12,6,25]       12      0'0     2014-09-18 17:16:12.986658      0'0     2014-09-16 12:20:48.614192
>
> 0.1c    0       0       0       0       0       0       12      12      stale+down+peering      2014-09-18 17:51:16.735549      186'12  11134:522       [12,25,23]      12      [12,25,23]      12      186'12  2014-09-18 17:16:04.457863      186'10  2014-09-16 14:23:58.731465
>
> 2.17    510     0       0       0       0       2139095040      3001    3001    active+clean    2014-09-22 17:52:20.364754      6784'30742      17842:72033     [12,27,23]      12      [12,27,23]      12      6784'30742      2014-09-22 00:19:39.905291      0'0     2014-09-16 20:11:17.016299
>
> 2.7e8   508     0       0       0       0       2130706432      3433    3433    active+clean    2014-09-22 17:52:20.365083      6702'21132      17842:64769     [12,25,23]      12      [12,25,23]      12      6702'21132      2014-09-22 17:01:20.546126      0'0     2014-09-16 14:42:32.079187
>
> 2.6a5   528     0       0       0       0       2214592512      2840    2840    active+clean    2014-09-22 22:50:38.092084      6775'34416      17842:83221     [12,58,0]       12      [12,58,0]       12      6775'34416      2014-09-22 22:50:38.091989      0'0     2014-09-16 20:11:32.703368
>
>
>
> And we couldn't observe any peering events happening on the primary OSD.
>
>
>
> $ sudo ceph pg 0.49 query
>
> Error ENOENT: i don't have pgid 0.49
>
> $ sudo ceph pg 0.4d query
>
> Error ENOENT: i don't have pgid 0.4d
>
> $ sudo ceph pg 0.1c query
>
> Error ENOENT: i don't have pgid 0.1c
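>
> [For reference: even when no OSD will answer a pg query, the monitors can
> still report the pg's CRUSH mapping, showing which OSDs should be serving
> it, e.g.:
>
> $ sudo ceph pg map 0.49]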
>
>
>
> We are not able to explain why the peering was stuck. BTW, the rbd pool
> doesn't contain any data.
>
>
>
> Varada
>
>
>
> From: Ceph-community [mailto:ceph-community-bounces at lists.ceph.com] On 
> Behalf Of Sage Weil
> Sent: Monday, September 22, 2014 10:44 PM
> To: Sahana Lokeshappa; ceph-users at lists.ceph.com; ceph-users at ceph.com; 
> ceph-community at lists.ceph.com
> Subject: Re: [Ceph-community] Pgs are in stale+down+peering state
>
>
>
> Stale means that the primary OSD for the PG went down and the status 
> is stale.  They all seem to be from OSD.12... Seems like something is 
> preventing that OSD from reporting to the mon?
>
> sage
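>
> [For reference, a quick way to check whether osd.12 is up and responsive:
>
> ceph osd tree | grep -w osd.12   # is it marked up in the osdmap?
> ceph tell osd.12 version         # does the daemon answer over the network?
> ceph daemon osd.12 status        # on osd.12's host: its own view of the map epochs]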
>
>
>
> On September 22, 2014 7:51:48 AM EDT, Sahana Lokeshappa 
> <Sahana.Lokeshappa at sandisk.com> wrote:
>
>       Hi all,
>
>
>
>       I used the 'ceph osd thrash' command, and after all OSDs came back up
>       and in, 3 pgs are in the stale+down+peering state.
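>
>       [For reference, 'ceph osd thrash' takes the number of osdmap epochs to
>       thrash; the mons randomly mark OSDs down and out and bring them back
>       over that span, e.g.:
>
>       ceph osd thrash 50]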
>
>
>
>       sudo ceph -s
>
>           cluster 99ffc4a5-2811-4547-bd65-34c7d4c58758
>
>            health HEALTH_WARN 3 pgs down; 3 pgs peering; 3 pgs stale; 3 pgs stuck inactive; 3 pgs stuck stale; 3 pgs stuck unclean
>
>            monmap e1: 3 mons at {rack2-ram-1=10.242.42.180:6789/0,rack2-ram-2=10.242.42.184:6789/0,rack2-ram-3=10.242.42.188:6789/0}, election epoch 2008, quorum 0,1,2 rack2-ram-1,rack2-ram-2,rack2-ram-3
>
>            osdmap e17031: 64 osds: 64 up, 64 in
>
>             pgmap v76728: 2148 pgs, 2 pools, 4135 GB data, 1033 kobjects
>
>                   12501 GB used, 10975 GB / 23476 GB avail
>
>                       2145 active+clean
>
>                          3 stale+down+peering
>
>
>
>       sudo ceph health detail
>
>       HEALTH_WARN 3 pgs down; 3 pgs peering; 3 pgs stale; 3 pgs stuck inactive; 3 pgs stuck stale; 3 pgs stuck unclean
>
>       pg 0.4d is stuck inactive for 341048.948643, current state stale+down+peering, last acting [12,56,27]
>
>       pg 0.49 is stuck inactive for 341048.948667, current state stale+down+peering, last acting [12,6,25]
>
>       pg 0.1c is stuck inactive for 341048.949362, current state stale+down+peering, last acting [12,25,23]
>
>       pg 0.4d is stuck unclean for 341048.948665, current state stale+down+peering, last acting [12,56,27]
>
>       pg 0.49 is stuck unclean for 341048.948687, current state stale+down+peering, last acting [12,6,25]
>
>       pg 0.1c is stuck unclean for 341048.949382, current state stale+down+peering, last acting [12,25,23]
>
>       pg 0.4d is stuck stale for 339823.956929, current state stale+down+peering, last acting [12,56,27]
>
>       pg 0.49 is stuck stale for 339823.956930, current state stale+down+peering, last acting [12,6,25]
>
>       pg 0.1c is stuck stale for 339823.956925, current state stale+down+peering, last acting [12,25,23]
>
>
>
>
>
>       Please, can anyone explain why the pgs are in this state?
>
>       Sahana Lokeshappa
>       Test Development Engineer I
>       SanDisk Corporation
>       3rd Floor, Bagmane Laurel, Bagmane Tech Park
>
>       C V Raman nagar, Bangalore 560093
>       T: +918042422283
>
>       Sahana.Lokeshappa at SanDisk.com
>
>
>
>
>
>
>
>
>
> --
> Sent from Kaiten Mail. Please excuse my brevity.
>
>
>



