Sage,
The crush tunables did the trick.
But why? Could you explain what was causing the problem?
I haven't installed 3.9 on my RBD servers yet. Will setting the crush
tunables back to default or legacy cause me similar problems in the future?
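For reference, the tunables currently compiled into the CRUSH map can be
inspected by decompiling it; non-default tunables show up as "tunable ..."
lines (a sketch only, and the output format varies by release):

  ceph osd getcrushmap -o /tmp/crushmap      # dump the binary CRUSH map
  crushtool -d /tmp/crushmap | grep tunable  # non-default tunables, if any
  uname -r    # on each kernel RBD client, to confirm it is new enough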
Thank you again Sage!
-Gaylord
On 07/22/2013 02:27 PM, Sage Weil wrote:
On Mon, 22 Jul 2013, Gaylord Holder wrote:
I have a 12 OSD / 3 host setup and have ended up with a bunch of stuck pgs.
I've verified the OSDs are all up and in. The crushmap looks fine.
I've tried restarting all the daemons.
root@never:/var/lib/ceph/mon# ceph status
health HEALTH_WARN 139 pgs degraded; 461 pgs stuck unclean; recovery
216/6213 degraded (3.477%)
monmap e4: 2 mons at {a=192.168.225.9:6789/0,b=192.168.225.10:6789/0},
election epoch 14, quorum 0,1 a,b
Add another monitor; right now, if one fails, the cluster is unavailable. (A rough sketch of adding a third mon follows the status output.)
osdmap e238: 12 osds: 12 up, 12 in
pgmap v7396: 2528 pgs: 2067 active+clean, 322 active+remapped, 139
active+degraded; 8218 MB data, 103 GB used, 22241 GB / 22345 GB avail;
216/6213 degraded (3.477%)
mdsmap e1: 0/0/1 up
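For reference, a rough sketch of adding a third monitor. The name "c", the
address 192.168.225.11, and the keyring path are hypothetical, and the exact
steps depend on how the cluster was deployed:

  ceph mon getmap -o /tmp/monmap     # fetch the current monitor map
  ceph-mon -i c --mkfs --monmap /tmp/monmap --keyring /path/to/mon.keyring
                                     # initialize the new mon's data directory
  ceph mon add c 192.168.225.11:6789 # register the new mon in the monmap
  service ceph start mon.c           # start it (sysvinit; adjust for your init system)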
My guess is crush tunables. Try
ceph osd crush tunables optimal
unless you are using a pre-3.8(ish) kernel or other very old (pre-bobtail)
clients.
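For reference, a sketch of applying the profile and reverting it later if old
clients turn out to need the legacy behaviour (the profile names accepted
depend on the release):

  ceph osd crush tunables optimal   # switch to the current optimal tunables
  ceph osd crush tunables legacy    # revert to legacy tunables for old clients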
sage
I have one non-default pool with 3x replication. Fewer than half of the pgs
have expanded to 3x (278/400 pgs still have 2x acting sets).
Where can I look for the trouble?
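For reference, a few commands that can help pinpoint where the stuck pgs are
(the pg id below is a placeholder):

  ceph pg dump_stuck unclean   # list pgs stuck unclean
  ceph pg <pgid> query         # acting/up sets and recovery state for one pg
  ceph osd dump | grep pool    # confirm each pool's replication (size) setting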
Thank you for any light someone can shed on this.
Cheers,
-Gaylord
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com