Thank you, and sorry for bothering you; I was new to the ceph-users list
and couldn't cancel my message. I found out what happened a few hours later.
The main problem was that I had moved one OSD out of its "host hostname {}"
CRUSH map entry (which I wanted to do). Everything was OK, but restarting
the OSD placed it back into the "host hostname {}" CRUSH map section
automatically.
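(For illustration, moving an OSD out of its host bucket can be done with
something like the following; the weight and bucket name here are examples,
not necessarily what I used:
ceph osd crush set osd.1 1.0 root=default
which re-places osd.1 directly under the default root instead of under its
host bucket.)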
I solved it by setting
osd crush update on start = false
(see the ceph-crush-location hook section at
http://ceph.com/docs/master/rados/operations/crush-map/).
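For reference, the setting goes into ceph.conf, e.g. in the [osd] section:

[osd]
    osd crush update on start = false

With that in place, a restarted OSD keeps whatever CRUSH location it was
given instead of re-registering itself under its host bucket on startup.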
You can consider this solved; there is no problem with Ceph, only my poor
knowledge caused it.
JP
On 2014-11-10 20:53, Craig Lewis wrote:
"nothing to send, going to standby" isn't necessarily bad, I see it from
time to time. It shouldn't stay like that for long though. If it's
been 5 minutes, and the cluster still isn't doing anything, I'd restart
that osd.
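(Depending on the init system, that would be something like
sudo service ceph restart osd.1
on sysvinit, or
sudo restart ceph-osd id=1
on upstart; the OSD id is just an example.)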
On Fri, Nov 7, 2014 at 1:55 PM, Jan Pekař <jan.pekar@xxxxxxxxx> wrote:
Hi,
I was testing Ceph cluster map changes and got into a stuck state
which seems to be indefinite.
First, a description of what I have done.
I'm testing a special case with only one copy of PGs (pool size = 1).
All PGs were on a single OSD, osd.0. I created a second osd.1 and modified the
cluster map to transfer one pool (metadata) to the newly created osd.1.
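(For anyone reproducing this: such a cluster map change typically follows
the usual decompile/edit/recompile cycle; the file names and the rule id
below are placeholders:
ceph osd getcrushmap -o map.bin
crushtool -d map.bin -o map.txt
# edit map.txt: add a rule whose "take" step starts at osd.1
crushtool -c map.txt -o map.new
ceph osd setcrushmap -i map.new
ceph osd pool set metadata crush_ruleset <rule-id>
)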
The PGs started to remap and the "objects degraded" number was dropping,
so everything looked normal.
During that recovery process I restarted both OSD daemons.
After that I noticed that the PGs that should be remapped were in a "stale"
state (stale+active+remapped+backfilling), and others were stuck with stale
states as well.
I tried to run ceph pg force_create_pg on one PG that should be
remapped, but nothing changed (that is the 1 stuck/creating PG below
in ceph health).
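(The stuck PGs can be listed and examined with, e.g.:
ceph health detail
ceph pg dump_stuck stale
ceph pg <pgid> query
where <pgid> is one of the reported PG ids.)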
The command rados -p metadata ls hangs, so the data are unavailable, but
they should be there.
What should I do in this state to get it working?
ceph -s below:
cluster 93418692-8e2e-4689-a237-ed5b47f39f72
health HEALTH_WARN 52 pgs backfill; 1 pgs backfilling; 63 pgs
stale; 1 pgs stuck inactive; 63 pgs stuck stale; 54 pgs stuck
unclean; recovery 107232/1881806 objects degraded (5.698%);
mon.imatic-mce low disk space
monmap e1: 1 mons at {imatic-mce=192.168.11.165:6789/0},
election epoch 1, quorum 0 imatic-mce
mdsmap e450: 1/1/1 up {0=imatic-mce=up:active}
osdmap e275: 2 osds: 2 up, 2 in
pgmap v51624: 448 pgs, 4 pools, 790 GB data, 1732 kobjects
804 GB used, 2915 GB / 3720 GB avail
107232/1881806 objects degraded (5.698%)
52 stale+active+remapped+wait_backfill
1 creating
1 stale+active+remapped+backfilling
10 stale+active+clean
384 active+clean
Last message in the OSD logs:
2014-11-07 22:17:45.402791 deb4db70 0 -- 192.168.11.165:6804/29564
>> 192.168.11.165:6807/29939 pipe(0x9d52f00 sd=213 :53216 s=2
pgs=1 cs=1 l=0 c=0x2c7f58c0).fault with nothing to send, going to
standby
Thank you for your help.
With regards,
Jan Pekar, ceph fan
--
============
Ing. Jan Pekař
jan.pekar@xxxxxxxxx | +420603811737
----
Imatic | Jagellonská 14 | Praha 3 | 130 00
http://www.imatic.cz
============
--
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com