On Tue, 15 Jul 2014, Andrija Panic wrote:
> Hi Sage, since this problem is tunables-related, do we need to expect
> the same behaviour when we do regular data rebalancing caused by
> adding new / removing OSDs? I guess not, but would like your
> confirmation. I'm already on optimal tunables, but I'm afraid to test
> this by e.g. shutting down 1 OSD.

When you shut down a single OSD it is a relatively small amount of data
that needs to move to do the recovery. The issue with the tunables is
just that a huge fraction of the data stored needs to move, and the
performance impact is much higher.

sage

> Thanks,
> Andrija
>
>
> On 14 July 2014 18:18, Sage Weil <sweil at redhat.com> wrote:
>
>       I've added some additional notes/warnings to the upgrade and
>       release notes:
>
>       https://github.com/ceph/ceph/commit/fc597e5e3473d7db6548405ce347ca7732832451
>
>       If there is somewhere else where you think a warning flag would
>       be useful, let me know!
>
>       Generally speaking, we want to be able to cope with huge data
>       rebalances without interrupting service. It's an ongoing process
>       of improving the recovery vs client prioritization, though, and
>       removing sources of overhead related to rebalancing... and it's
>       clearly not perfect yet. :/
>
>       sage
>
>
>       On Sun, 13 Jul 2014, Andrija Panic wrote:
>
> > Hi,
> > after the ceph upgrade (0.72.2 to 0.80.3) I issued
> > "ceph osd crush tunables optimal" and, after only a few minutes, I
> > added 2 more OSDs to the CEPH cluster...
> >
> > So these 2 changes were more or less done at the same time:
> > rebalancing because of tunables optimal, and rebalancing because of
> > adding the new OSDs...
> >
> > The result: all VMs living on CEPH storage went mad, with
> > effectively no disk access; blocked, so to speak.
> >
> > Since this rebalancing took 5h-6h, I had a bunch of VMs down for
> > that long...
> >
> > Did I do wrong by causing "2 rebalancings" to happen at the same
> > time?
> > Is this behaviour normal, to cause a great load on all VMs because
> > they are unable to access CEPH storage effectively?
> >
> > Thanks for any input...
> > --
> >
> > Andrija Panić
>
>
> --
>
> Andrija Panić
> --------------------------------------
> http://admintweets.com
> --------------------------------------
>
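[Editor's note: one practical way to soften the impact described in this
thread is to throttle backfill/recovery before triggering a large data
movement such as a tunables change. A minimal ceph.conf sketch using
Firefly-era option names; the values here are illustrative, not
recommendations:]

```ini
# ceph.conf -- limit how aggressively OSDs rebalance, so that client
# I/O keeps being serviced while a large fraction of the data moves
[osd]
osd max backfills = 1          ; concurrent backfills per OSD (default 10)
osd recovery max active = 1    ; concurrent recovery ops per OSD (default 15)
osd recovery op priority = 1   ; recovery priority vs client ops (default 10)
```

[The same settings can be injected into a running cluster without a
restart, e.g. `ceph tell osd.* injectargs '--osd-max-backfills 1
--osd-recovery-max-active 1'`, and relaxed again once the rebalance
completes.]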