Re: ceph recovering results in offline VMs

Very interesting. I ran into the same thing yesterday when I added SATA disks to the cluster. I was about to return them for SAS drives instead because of how long it took, and how slow some of my RBDs got.

Are most people using SATA 7200 RPM drives? My concern was with Oracle DBs. Postgres doesn't seem to have as much of a problem running on an RBD, but I noticed a marked difference with Oracle.

Dave Spano




From: "Stefan Priebe - Profihost AG" <s.priebe@xxxxxxxxxxxx>
To: "Wido den Hollander" <wido@xxxxxxxx>
Cc: ceph-users@xxxxxxxxxxxxxx
Sent: Wednesday, April 10, 2013 3:51:23 PM
Subject: Re: ceph recovering results in offline VMs

On 10.04.2013 at 21:36, Wido den Hollander <wido@xxxxxxxx> wrote:

> On 04/10/2013 09:16 PM, Stefan Priebe wrote:
>> Hello list,
>>
>> I'm using Ceph 0.56.4 and I have to replace some drives. But while Ceph is
>> backfilling / recovering, all VMs have high latencies and sometimes
>> they're even offline. I only replace one drive at a time.
>>
>> I put in the new drives and I'm reweighting them from 0.0 to 1.0 in
>> 0.1 steps.
>>
>> I already lowered osd recovery max active to 2 and osd max backfills to 3,
>> but when I put the weight back at 1.0 the VMs are nearly all down.
>>
>> Right now some drives are SSDs, so they're a lot faster than the HDDs I'm
>> going to replace.
>>
>> Nothing in the logs, but it is reporting a recovery rate of 3700MB/s, which
>> is clearly not possible on SATA HDDs.
>>
>> Log example:
>> 2013-04-10 20:55:33.711289 mon.0 [INF] pgmap v9293315: 8128 pgs: 233
>> active, 7876 active+clean, 19 active+recovery_wait; 557 GB data, 1168 GB
>> used, 7003 GB / 8171 GB avail; 2108KB/s wr, 329op/s; 31/309692 degraded
>> (0.010%);  recovering 840 o/s, 3278MB/s
>
> There is an issue about this in the tracker; I saw it this week but I'm not able to find it anymore.

3737?

> I'm seeing this as well; when the cluster is recovering, RBD images tend to get very sluggish.
>
> Most of the time I'm blaming the CPUs in the OSD hosts for it, but I've also seen it on faster systems.

I have 3.6 GHz Xeons with just 4 OSDs per host.

Stefan
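
The stepwise reweighting described above corresponds to the standard reweight commands; a minimal sketch, assuming the new drive is osd.12 (the id and the weights are placeholders only):

    # raise the override weight in 0.1 steps, letting the cluster settle in between
    ceph osd reweight 12 0.1
    ceph osd reweight 12 0.2
    # ...repeat until the weight reaches 1.0
    # alternatively, adjust the CRUSH weight itself:
    ceph osd crush reweight osd.12 0.5

Each small step moves less data at once than jumping straight to 1.0, which is the point of doing it gradually.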
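The recovery throttles mentioned above (osd recovery max active and osd max backfills) can also be changed at runtime instead of editing ceph.conf and restarting the OSDs; the values below are only examples:

    # persistent, in the [osd] section of ceph.conf:
    #   osd max backfills = 1
    #   osd recovery max active = 1

    # or injected into all running OSDs:
    ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'

Lower values slow recovery down but leave more I/O headroom for client traffic, which is usually the trade-off you want when VMs start stalling during backfill.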
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

