Re: Ceph pgs stuck or degraded.




If I understand what the #tunables page is saying, changing the tunables kicks the OSD re-balancing mechanism a bit and resets it to try again.

I'll see about getting a 3.9 kernel in for my RBD machines, and reset everything to optimal.

Thanks again.

-Gaylord

On 07/22/2013 04:51 PM, Sage Weil wrote:
On Mon, 22 Jul 2013, Gaylord Holder wrote:
Sage,

The crush tunables did the trick.

Why?  Could you explain what was causing the problem?

This has a good explanation, I think:

	http://ceph.com/docs/master/rados/operations/crush-map/#tunables

I haven't installed 3.9 on my RBD servers yet.  Will setting crush tunables
back to default or legacy cause me similar problems in the future?

Yeah.  For 3.6+ kernels, you can set slightly different tunables and it
will be very close to optimal...
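
As a rough illustration of the kernel-version rule discussed here, the
following sketch maps a client kernel version to a tunables choice. The
3.9 and 3.6 thresholds are taken from this thread; the function name and
the "near-optimal subset" label are hypothetical, and the exact cutoffs
should be checked against the tunables page for your release.

```python
def suggested_tunables(kernel_version: str) -> str:
    """Suggest a CRUSH tunables choice for a kernel client (illustrative only)."""
    # Only the major and minor components matter, e.g. "3.9.0-generic" -> (3, 9).
    major, minor = (int(part) for part in kernel_version.split(".")[:2])
    if (major, minor) >= (3, 9):
        # Assumption from the thread: 3.9 kernels can use "optimal".
        return "optimal"
    if (major, minor) >= (3, 6):
        # Assumption from the thread: 3.6+ supports a subset of the tunables.
        return "near-optimal subset"
    # Older kernels need the legacy behaviour.
    return "legacy"


for version in ("3.2.0", "3.6.11", "3.9.0-generic"):
    print(version, "->", suggested_tunables(version))
```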

sage



Thank you again Sage!

-Gaylord

On 07/22/2013 02:27 PM, Sage Weil wrote:
On Mon, 22 Jul 2013, Gaylord Holder wrote:

I have a 12 OSD/3 host setup, and have been stuck with a bunch of stuck
pgs.

I've verified the OSDs are all up and in.  The crushmap looks fine.
I've tried restarting all the daemons.



root@never:/var/lib/ceph/mon# ceph status
     health HEALTH_WARN 139 pgs degraded; 461 pgs stuck unclean; recovery
216/6213 degraded (3.477%)
     monmap e4: 2 mons at {a=192.168.225.9:6789/0,b=192.168.225.10:6789/0},
election epoch 14, quorum 0,1 a,b

Add another monitor; right now if 1 fails the cluster is unavailable.
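
The arithmetic behind that advice can be sketched as follows, assuming the
monitors need a strict majority to form a quorum. The helper names are
hypothetical and exist only for this illustration.

```python
def mons_needed_for_quorum(total_mons: int) -> int:
    """Smallest strict majority of the monitors."""
    return total_mons // 2 + 1


def tolerated_failures(total_mons: int) -> int:
    """How many monitors can fail while a majority is still reachable."""
    return total_mons - mons_needed_for_quorum(total_mons)


for total in (2, 3, 5):
    print(f"{total} mons: quorum needs {mons_needed_for_quorum(total)}, "
          f"tolerates {tolerated_failures(total)} failure(s)")
```

With 2 monitors the majority is 2, so no failure is tolerated; a third
monitor raises that to one.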

     osdmap e238: 12 osds: 12 up, 12 in
      pgmap v7396: 2528 pgs: 2067 active+clean, 322 active+remapped, 139
active+degraded; 8218 MB data, 103 GB used, 22241 GB / 22345 GB avail;
216/6213 degraded (3.477%)
     mdsmap e1: 0/0/1 up
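
As a sanity check on the numbers in the status output above, this small
sketch parses the pgmap summary and confirms that the remapped and degraded
pgs account for the 461 "stuck unclean" pgs. The summary string is copied
from the output; the parsing helper is hypothetical and is not part of any
Ceph tooling.

```python
import re

# Copied from the pgmap line in the status output.
PGMAP = ("2528 pgs: 2067 active+clean, 322 active+remapped, "
         "139 active+degraded")


def parse_pgmap(summary):
    """Return (total_pgs, {state: count}) from a pgmap summary string."""
    total_text, states_text = summary.split(" pgs: ", 1)
    states = {}
    for count, state in re.findall(r"(\d+) ([a-z+_]+)", states_text):
        states[state] = int(count)
    return int(total_text), states


total, states = parse_pgmap(PGMAP)
# Everything that is not active+clean counts as unclean.
unclean = sum(n for state, n in states.items() if state != "active+clean")
# Degraded object ratio reported as 216/6213.
degraded_pct = 100 * 216 / 6213

print(total, sum(states.values()), unclean, round(degraded_pct, 3))
```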

My guess is crush tunables.  Try

   ceph osd crush tunables optimal

unless you are using a pre-3.8(ish) kernel or other very old (pre-bobtail)
clients.

sage




I have one non-default pool with 3x replication.  Fewer than half of the
pgs have expanded to 3x (278/400 pgs still have 2x acting sets).

Where can I go look for the trouble?

Thank you for any light someone can shed on this.

Cheers,
-Gaylord
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com







