Hi Andrei, nice to meet you again ;)

Thanks for sharing this info with me - I thought it was my mistake, introducing the
new OSDs at the same time. My thinking was: since the cluster is rebalancing anyway,
let me add the new OSDs now as well, so it all rebalances in one go and I don't cause
two separate data rebalances. During a normal OSD restart and rebalance (I did not set
osd noout etc...) I did see somewhat lower VM performance, but everything stayed up
and fine.

Also, about 30% of the data moved during my upgrade/tunables change, although the
documents say 10%, as you mentioned. I did not lose any data, but finding all the VMs
that use CEPH as storage is somewhat of a PITA...

So, any CEPH developers' input would be greatly appreciated...
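For the next round of maintenance I am planning to stage it roughly like this, pieced
together from your settings, the noout flag I skipped, and the docs - just an untested
sketch, so please correct me if the sequence or the flags are wrong:

    # untested sketch - prevent osds being marked out while they restart
    ceph osd set noout

    # throttle recovery so client io still gets through (values from your mail)
    ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1 --osd-recovery-op-priority 2'

    # kick off only one rebalance at a time
    ceph osd crush tunables optimal
    ceph -s        # wait here until HEALTH_OK / all PGs active+clean

    # only then add the new OSDs, and once that rebalance has also finished:
    ceph osd unset noout

In other words, let each rebalance finish and the cluster settle back to HEALTH_OK
before starting the next one, instead of stacking the two.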
Thanks again for such detailed info,
Andrija

On 14 July 2014 10:52, Andrei Mikhailovsky <andrei at arhont.com> wrote:

> Hi Andrija,
>
> I've got at least two more stories of a similar nature: one from a friend
> running a ceph cluster and one from me. Both of our clusters are pretty
> small. My cluster has only two osd servers with 8 osds each and 3 mons,
> with an ssd journal per 4 osds. My friend has a cluster of 3 mons and 3
> osd servers with 4 osds each, also with an ssd per 4 osds. Both clusters
> are connected with 40gbit/s IP over Infiniband links.
>
> We had the same issue while upgrading to firefly. However, we did not add
> any new disks, we just ran the "ceph osd crush tunables optimal" command
> after the upgrade.
>
> Both of our clusters were "down" as far as the virtual machines are
> concerned: all vms crashed because of the lack of IO. It was a bit
> problematic, taking into account that ceph is typically so great at
> staying alive during failures and upgrades. So, there seems to be a
> problem with the upgrade. I wish the devs had added a big note in red
> letters saying that if you run this command it will likely affect your
> cluster performance and most likely all your vms will die, so please shut
> down your vms if you do not want to have data loss.
>
> I've changed the default values to reduce the load during recovery and
> also to tune a few things performance-wise. My settings were:
>
> osd recovery max chunk = 8388608
> osd recovery op priority = 2
> osd max backfills = 1
> osd recovery max active = 1
> osd recovery threads = 1
> osd disk threads = 2
> filestore max sync interval = 10
> filestore op threads = 20
> filestore_flusher = false
>
> However, this didn't help much. I noticed that shortly after running the
> tunables command the iowait on my guest vms quickly jumped to 50%, and to
> 99% a minute later. This happened on all vms at once. During the recovery
> phase I ran the "rbd -p <poolname> ls -l" command several times and it
> took between 20 and 40 minutes to complete; it typically takes less than
> 2 seconds when the cluster is not in recovery mode.
>
> My mate's cluster had the same tunables apart from the last three, and he
> saw exactly the same behaviour.
>
> One other thing I've noticed is that somewhere in the docs I've read that
> running the tunables optimal command should move no more than 10% of your
> data. However, in both of our cases the status showed just over 30%
> degraded and it took the better part of 9 hours to complete the data
> reshuffling.
>
> Any comments from the ceph team or other ceph gurus on:
>
> 1. What have we done wrong in our upgrade process?
> 2. What options should we have used to keep our vms alive?
>
> Cheers
>
> Andrei
>
> ------------------------------
> From: "Andrija Panic" <andrija.panic at gmail.com>
> To: ceph-users at lists.ceph.com
> Sent: Sunday, 13 July, 2014 9:54:17 PM
> Subject: [ceph-users] ceph osd crush tunables optimal AND add new OSD at
> the same time
>
> Hi,
>
> after the ceph upgrade (0.72.2 to 0.80.3) I issued "ceph osd crush
> tunables optimal", and only a few minutes later I added 2 more OSDs to
> the CEPH cluster...
>
> So these 2 changes were more or less done at the same time - rebalancing
> because of tunables optimal, and rebalancing because of adding new OSDs...
>
> Result: all VMs living on CEPH storage went mad - effectively no disk
> access, blocked so to speak.
>
> Since this rebalancing took 5-6 hours, I had a bunch of VMs down for that
> long...
>
> Did I do wrong by causing "2 rebalancings" to happen at the same time?
> Is this behaviour normal, to cause such load on all VMs that they are
> effectively unable to access CEPH storage?
>
> Thanks for any input...
> --
> Andrija Panić

--
Andrija Panić
--------------------------------------
http://admintweets.com
--------------------------------------