Re: Clarification on sequence of recovery and client ops after OSDs rejoin cluster (also, slow requests)

Hi,

(thanks to Florian, who’s helping us get this sorted out)

> On Sep 13, 2017, at 12:40 PM, Florian Haas <florian@xxxxxxxxxxx> wrote:
> 
> Hi everyone,
> 
> 
> disclaimer upfront: this was seen in the wild on Hammer, and on 0.94.7
> no less. Reproducing this on 0.94.10 is a pending process, and we'll
> update here with findings, but my goal with this post is really to
> establish whether the behavior as seen is expected, and if so, what
> the rationale for it is. This is all about slow requests popping up on
> a rather large scale after a previously down OSD node is brought back
> into the cluster.
> 
> 
> So here's the sequence of events for the issue I'm describing, as seen
> in a test:
> 
> 22:08:53 - OSD node stopped. OSDs 6, 17, 18, 22, 31, 32, 36, 45, 58
> mark themselves down. Cluster has noout set, so all OSDs remain in.
> fio tests are running against RBD in a loop, thus there is heavy
> client I/O activity generating lots of new objects.

Sorry, this got confused. This was on our production setup with regular production traffic, not fio. We did run this previously on our DEV cluster and saw similar effects, though on a somewhat smaller scale in terms of the actual timings.

During this round of OSD (and host) reboots I forgot to set noout, so we had backfill in addition to recovery for a while. However, I did set the OSDs back “in” before they came back online. Nevertheless, the behaviour was identical to that seen with previous host restarts.

Also, thanks in advance for any light that can be shed on this,
Christian

--
Christian Theune · ct@xxxxxxxxxxxxxxx · +49 345 219401 0
Flying Circus Internet Operations GmbH · http://flyingcircus.io
Forsterstraße 29 · 06112 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
