On 07/12/2013 09:21 AM, Wido den Hollander wrote:
> You will probably see that Placement Groups (PGs) go into a state other than active+clean.
Indeed, the cluster goes into a health warning state and starts to resync the data for the affected OSDs. Nothing is missing, just degraded (redundancy level is 2).
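For anyone following along, here's a quick sketch of polling the monitors for health and PG state during such a resync. It assumes a reasonably recent rados Python binding (mon_command) and the usual /etc/ceph/ceph.conf plus a readable client keyring:

    import json
    import rados

    # Assumptions: rados Python bindings installed, default conf path.
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        # 'health' and 'pg stat' are standard monitor commands; asking
        # for plain-text output avoids depending on a JSON schema.
        for prefix in ('health', 'pg stat'):
            ret, out, errs = cluster.mon_command(
                json.dumps({'prefix': prefix}), b'')
            if ret == 0:
                print(out.decode() if isinstance(out, bytes) else out)
            else:
                print('%s failed: %s' % (prefix, errs))
    finally:
        cluster.shutdown()

While the cluster recovers you'd expect the degraded/recovering counts to drain back to active+clean over time.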
> Not really the expected behavior, but it could be CPU power limitations on the OSDs. I notice this latency with an Atom cluster as well, but that's mainly because the Atoms aren't fast enough to keep up with what's happening.
They are fairly meaty hosts: all of them quad-core 3.2 GHz Xeons. However, we do run VMs on the same boxes (contrary to recommended practice). The hosts are lightly loaded though, with load averages seldom heading north of 1.0.
> Faster AMD or Intel CPUs don't suffer from this. There will be a very short I/O stall for certain PGs when an OSD goes down, but that should be very short and not every VM should suffer. How many OSDs do you have, and how many PGs per pool?
1000 PGs across 10 OSDs (2 per host). The number of PGs may be a little high, but we plan to add more hosts (and consequently more OSDs) to the cluster as time goes on, and I was worried about having to split PGs later.
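For what it's worth, the usual rule of thumb is on the order of 100 PGs per OSD divided by the replica count, rounded up to a power of two. A quick back-of-the-envelope sketch:

    import math

    def suggested_pg_count(num_osds, replicas, per_osd_target=100):
        # Common rule of thumb: (OSDs * ~100) / replicas, rounded up
        # to the next power of two.
        raw = num_osds * per_osd_target / float(replicas)
        return 2 ** int(math.ceil(math.log(raw, 2)))

    # 10 OSDs at replication 2 -> 500 raw, i.e. 512 after rounding.
    print(suggested_pg_count(10, 2))

By that yardstick 1000 is roughly double the suggestion for 10 OSDs, though planning ahead for growth is a fair reason to go higher.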
It may be limited to only the affected PGs (I'm not sure), but every VM I've cared about (or have been watching) so far has been affected.
Seconds of downtime is quite severe, especially when it is a planned shutdown or rejoin. I can understand that when an OSD just disappears some requests might still be directed to the now-gone node, but I see similar latency hiccups on scheduled shutdowns and rejoins too?
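As an aside, for planned maintenance the usual trick is to set the noout flag first, so the cluster doesn't start re-replicating while the OSD is offline. That won't remove the brief peering stall, but it does avoid the resync. A sketch using the same rados bindings; 'osd set'/'osd unset' are the standard mon commands, though the 'key' argument name in the JSON form is my assumption:

    import json
    import rados

    def mon_cmd(cluster, **cmd):
        # Small helper (my own, not a library API) around mon_command.
        ret, out, errs = cluster.mon_command(json.dumps(cmd), b'')
        if ret != 0:
            raise RuntimeError(errs)
        return out

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        # Before stopping the OSD: don't mark it out, so no rebalance
        # kicks off during the maintenance window.
        mon_cmd(cluster, prefix='osd set', key='noout')
        # ... stop the osd daemon, do the work, start it again ...
        # Afterwards: restore normal out-marking behavior.
        mon_cmd(cluster, prefix='osd unset', key='noout')
    finally:
        cluster.shutdown()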
Regards,
Edwin Peer