On Mon, Jun 20, 2016 at 8:33 AM, Daniel Swarbrick
<daniel.swarbrick@xxxxxxxxxxxxxxxx> wrote:
> We have just updated our third cluster from Infernalis to Jewel, and are
> experiencing similar issues.
>
> We run a number of KVM virtual machines (qemu 2.5) with RBD images, and
> have seen a lot of D-state processes and even jbd2 timeouts and kernel
> stack traces inside the guests. At first I thought the VMs were being
> starved of IO, but this is still happening after throttling back the
> recovery with:
>
> osd_max_backfills = 1
> osd_recovery_max_active = 1
> osd_recovery_op_priority = 1
>
> After upgrading the cluster to Jewel, I changed our crushmap to use the
> newer straw2 algorithm, which resulted in a little data movement, but no
> problems at that stage.
>
> Once the cluster had settled down again, I set tunables to optimal
> (hammer profile -> jewel profile), which triggered between 50% and 70%
> misplaced PGs on our clusters. This is when the trouble started each
> time, and when we had cascading failures of VMs.
>
> However, after performing hard shutdowns on the VMs and restarting them,
> they seemed to be OK.
>
> At this stage, I have a strong suspicion that the cause is the
> introduction of "require_feature_tunables5 = 1" in the tunables. This
> seems to require all RADOS connections to be re-established.

Do you have any evidence of that besides the one restart? I guess it's
possible that we aren't kicking requests if the crush map but not the rest
of the osdmap changes, but I'd be surprised.
-Greg

> On 20/06/16 13:54, Andrei Mikhailovsky wrote:
>> Hi Oliver,
>>
>> I am also seeing this as strange behaviour indeed! I was going through
>> the logs and I was not able to find any errors or issues. There were
>> also no slow/blocked requests that I could see during the recovery
>> process.
>>
>> Does anyone have an idea what could be the issue here? I don't want to
>> shut down all VMs every time there is a new release with updated
>> tunable values.
>>
>> Andrei
>>
>> ----- Original Message -----
>>> From: "Oliver Dzombic" <info@xxxxxxxxxxxxxxxxx>
>>> To: "andrei" <andrei@xxxxxxxxxx>, "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
>>> Sent: Sunday, 19 June, 2016 10:14:35
>>> Subject: Re: cluster down during backfilling, Jewel tunables and client IO optimisations
>>>
>>> Hi,
>>>
>>> So far the key values for that are:
>>>
>>> osd_client_op_priority = 63 (the default anyway, but I set it explicitly as a reminder)
>>> osd_recovery_op_priority = 1
>>>
>>> In addition I set:
>>>
>>> osd_max_backfills = 1
>>> osd_recovery_max_active = 1
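
For reference, here is a minimal sketch of the settings and commands discussed
in this thread: the ceph.conf values are taken directly from the messages above,
and the runtime commands are the usual Jewel-era way of applying them, assuming
they are run from a node with an admin keyring.

    [osd]
    osd_max_backfills = 1
    osd_recovery_max_active = 1
    osd_recovery_op_priority = 1
    # osd_client_op_priority already defaults to 63; listed only as a reminder
    osd_client_op_priority = 63

    # Apply the recovery throttles at runtime without restarting the OSDs:
    ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1 --osd-recovery-op-priority 1'

    # Move from the hammer tunables profile to the jewel profile
    # ("optimal" on a Jewel cluster resolves to the jewel profile).
    # Expect a large fraction of PGs to be remapped once this is run:
    ceph osd crush tunables optimal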