On 10.04.2013 at 21:36, Wido den Hollander <wido@xxxxxxxx> wrote:

> On 04/10/2013 09:16 PM, Stefan Priebe wrote:
>> Hello list,
>>
>> I'm using ceph 0.56.4 and I have to replace some drives. But while ceph
>> is backfilling / recovering, all VMs have high latencies and some are
>> even offline. I replace just one drive at a time.
>>
>> I put in the new drives and I'm reweighting them from 0.0 to 1.0 in
>> 0.1 steps.
>>
>> I already lowered osd recovery max active = 2 and osd max backfills = 3,
>> but when I put them back at 1.0 the VMs are nearly all down.
>>
>> Right now some of the new drives are SSDs, so they're a lot faster than
>> the HDDs I'm going to replace.
>>
>> There is nothing in the logs, but it reports recovering at 3700 MB/s;
>> that this is not possible on SATA HDDs is clear.
>>
>> Log example:
>> 2013-04-10 20:55:33.711289 mon.0 [INF] pgmap v9293315: 8128 pgs: 233
>> active, 7876 active+clean, 19 active+recovery_wait; 557 GB data, 1168 GB
>> used, 7003 GB / 8171 GB avail; 2108KB/s wr, 329op/s; 31/309692 degraded
>> (0.010%); recovering 840 o/s, 3278MB/s
>
> There is an issue about this in the tracker; I saw it this week but I'm
> not able to find it anymore.

3737?

> I'm seeing this as well: when the cluster is recovering, RBD images tend
> to get very sluggish.
>
> Most of the time I'm blaming the CPUs in the OSDs for it, but I've also
> seen it on faster systems.

I have 3.6 GHz Xeons with just 4 OSDs per host.

Stefan

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
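For reference, the throttling and gradual reweighting described above would
typically be done along these lines from an admin node. This is a minimal
sketch, not the poster's exact commands: osd.12 and the weight values are
placeholders, and the `ceph tell osd.*` wildcard is the newer injectargs
syntax; on 0.56.x the older per-OSD form (`ceph osd tell <id> injectargs ...`)
may be required. Whether the author adjusted the crush weight or the reweight
override is not stated; the crush weight is shown here.

    # lower recovery/backfill concurrency on all OSDs at runtime
    ceph tell osd.* injectargs '--osd-recovery-max-active 1 --osd-max-backfills 1'

    # bring a freshly replaced drive in gradually, e.g. osd.12
    ceph osd crush reweight osd.12 0.1
    # wait for the cluster to return to active+clean, then step up in
    # 0.1 increments until the target weight is reached
    ceph osd crush reweight osd.12 0.2

The same two settings can also be made persistent in the [osd] section of
ceph.conf, but injectargs avoids restarting the daemons mid-recovery.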