Re: Stuck in stale state

"nothing to send, going to standby" isn't necessarily bad, I see it from time to time.  It shouldn't stay like that for long though.  If it's been 5 minutes, and the cluster still isn't doing anything, I'd restart that osd.

On Fri, Nov 7, 2014 at 1:55 PM, Jan Pekař <jan.pekar@xxxxxxxxx> wrote:
Hi,

I was testing Ceph cluster map changes and got into a stuck state that seems to persist indefinitely.
First, a description of what I have done.

I'm testing a special case with only one copy of each PG (pool size = 1).

All PGs were on a single OSD (osd.0). I created a second OSD (osd.1) and modified the cluster map to move one pool (metadata) to the newly created osd.1.
The PGs started to remap and the "objects degraded" count was dropping, so everything looked normal.
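
(In case it matters, the kind of cluster map change I mean can be done with the usual getcrushmap/crushtool round trip; this is only a sketch, and the rule edit and ruleset number below are illustrative:)

    # export and decompile the current CRUSH map
    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt

    # edit crushmap.txt (e.g. add a rule that targets only osd.1),
    # then recompile and inject it
    crushtool -c crushmap.txt -o crushmap.new
    ceph osd setcrushmap -i crushmap.new

    # point the metadata pool at the new rule
    ceph osd pool set metadata crush_ruleset 1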

During that recovery process I restarted both OSD daemons.
After that I noticed that the PGs that should have been remapped were in a "stale" state (stale+active+remapped+backfilling), and other PGs also showed stale states.
I tried running ceph pg force_create_pg on one PG that should have been remapped, but nothing changed (that is the 1 stuck/creating PG in the ceph health output below).
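
(The commands I mean are along these lines; the PG id here is just an example:)

    # list PGs stuck in the stale state
    ceph pg dump_stuck stale

    # inspect one affected PG and try to force-create it
    ceph pg 1.2d query
    ceph pg force_create_pg 1.2d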

The command rados -p metadata ls hangs, so the data is unavailable, but it should still be there.

What should I do in this state to get it working?

ceph -s output below:

    cluster 93418692-8e2e-4689-a237-ed5b47f39f72
     health HEALTH_WARN 52 pgs backfill; 1 pgs backfilling; 63 pgs stale; 1 pgs stuck inactive; 63 pgs stuck stale; 54 pgs stuck unclean; recovery 107232/1881806 objects degraded (5.698%); mon.imatic-mce low disk space
     monmap e1: 1 mons at {imatic-mce=192.168.11.165:6789/0}, election epoch 1, quorum 0 imatic-mce
     mdsmap e450: 1/1/1 up {0=imatic-mce=up:active}
     osdmap e275: 2 osds: 2 up, 2 in
      pgmap v51624: 448 pgs, 4 pools, 790 GB data, 1732 kobjects
            804 GB used, 2915 GB / 3720 GB avail
            107232/1881806 objects degraded (5.698%)
                  52 stale+active+remapped+wait_backfill
                   1 creating
                   1 stale+active+remapped+backfilling
                  10 stale+active+clean
                 384 active+clean

Last message in the OSD logs:

2014-11-07 22:17:45.402791 deb4db70  0 -- 192.168.11.165:6804/29564 >> 192.168.11.165:6807/29939 pipe(0x9d52f00 sd=213 :53216 s=2 pgs=1 cs=1 l=0 c=0x2c7f58c0).fault with nothing to send, going to standby

Thank you for help
With regards
Jan Pekar, ceph fan
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
