Re: latency when OSD falls out of cluster

On 07/12/2013 09:21 AM, Wido den Hollander wrote:

> You will probably see that Placement Groups (PGs) go into a different
> state than active+clean.

Indeed, the cluster goes into a health warning state and starts to resync the data for the affected OSDs. Nothing is missing, just degraded (redundancy level is 2).

> Not really the expected behavior, but it could be CPU power limitations
> on the OSDs. I notice this latency with an Atom cluster as well, but
> that's mainly due to the fact that the Atoms aren't fast enough to
> figure out what's happening.

They are fairly meaty hosts, all of them quad-core 3.2 GHz Xeons. However, we do run VMs on the same boxes (contrary to recommended practice). The hosts are lightly loaded though, with load averages seldom heading north of 1.0.

> Faster AMD or Intel CPUs don't suffer from this. There will be an I/O
> stall for certain PGs when an OSD goes down, but it should be very
> short and not every VM should suffer.
>
> How many OSDs do you have with how many PGs per pool?

1000 PGs, 10 OSDs (2 per host). The number of PGs may be a little high, but we plan to add more hosts and consequently OSDs to the cluster as time goes on and I was worried about splitting PGs later.
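
As a rough sanity check on those numbers, here is a minimal sketch of the arithmetic. It assumes the 1000 PGs are the cluster total, that "redundancy level 2" means two replicas per PG, and that the commonly cited guideline of roughly 100 PGs per OSD (rounded up to a power of two) applies. The function names are hypothetical helpers for illustration, not part of any Ceph tool.

```python
def pg_replicas_per_osd(pg_count, replicas, osd_count):
    """Average number of PG replicas each OSD has to carry."""
    return pg_count * replicas / osd_count


def rule_of_thumb_pg_count(osd_count, replicas, target_per_osd=100):
    """Total PG count from the ~100-PGs-per-OSD guideline (an assumption here),
    rounded up to the next power of two."""
    raw = osd_count * target_per_osd / replicas
    power = 1
    while power < raw:
        power *= 2
    return power


if __name__ == "__main__":
    # Figures from this message: 1000 PGs, 10 OSDs, 2 replicas.
    print(pg_replicas_per_osd(1000, 2, 10))
    print(rule_of_thumb_pg_count(10, 2))
```

Under those assumptions each OSD carries about 200 PG replicas, roughly double the guideline figure, which fits the "a little high" description without being extreme.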

I guess it may be limited to only the affected PGs; I'm not sure. But every VM I've cared about (or have been watching) so far has been affected.

Seconds of downtime is quite severe, especially when it is a planned shutdown or rejoin. I can understand that if an OSD just disappears, some requests might be directed to the now-gone node, but I see similar latency hiccups on scheduled shutdowns and rejoins too.

Regards,
Edwin Peer