Re: How To Properly Failover a HA Setup

David C <dcsysengineer@xxxxxxxxx> · Mon, 21 Jan 2019 12:31:34 +0000

It could also be the kernel client versions. What are you running? I remember that older kernel clients didn't always handle recovery scenarios very well.

On Mon, Jan 21, 2019 at 9:18 AM Marc Roos <M.Roos@xxxxxxxxxxxxxxxxx> wrote:

I think his downtime is coming from the MDS failover, which takes a while in my case too. But I am not using CephFS that much yet.

-----Original Message-----
From: Robert Sander [mailto:r.sander@xxxxxxxxxxxxxxxxxxx]
Sent: 21 January 2019 10:05
To: ceph-users@xxxxxxxxxxxxxx
Subject: Re: How To Properly Failover a HA Setup

On 21.01.19 09:22, Charles Tassell wrote:

> Hello Everyone,
>
>    I've got a 3 node Jewel cluster setup, and I think I'm missing
> something.  When I want to take one of my nodes down for maintenance
> (kernel upgrades or the like) all of my clients (running the kernel
> module for the cephfs filesystem) hang for a couple of minutes before
> the redundant servers kick in.

Have you set the noout flag before doing cluster maintenance?

ceph osd set noout

and afterwards

ceph osd unset noout
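
The two commands above can be wrapped around a maintenance step so the flag is always cleared again. What follows is only a minimal sketch, not an official procedure: the function name with_noout and the CEPH override variable are hypothetical, introduced here for illustration (CEPH lets the wrapper be dry-run with something like echo). Note that noout only keeps down OSDs from being marked out, so no rebalancing starts; it does not by itself change how long an MDS failover takes.

```shell
#!/bin/sh
# Sketch: run a maintenance command with the noout flag set.
# CEPH is a hypothetical override for dry runs; it defaults to the real CLI.
CEPH="${CEPH:-ceph}"

with_noout() {
    # Keep down OSDs from being marked "out" while the node is offline.
    $CEPH osd set noout || return 1

    # Run the maintenance command given as arguments, remembering its status.
    status=0
    "$@" || status=$?

    # Always clear the flag again, even if the maintenance command failed.
    $CEPH osd unset noout

    return "$status"
}
```

Usage, assuming the maintenance step is itself a command or script (the script name here is a placeholder):

    with_noout ./do-node-maintenance.sh
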

Regards
--
Robert Sander
Heinlein Support GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 93818 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
