Hello Everyone,
I've got a 3-node Jewel cluster set up, and I think I'm missing
something. When I want to take one of my nodes down for maintenance
(kernel upgrades or the like), all of my clients (running the kernel
module for the CephFS filesystem) hang for a couple of minutes before
the redundant servers kick in. Are there any commands I can run
before taking a box down to speed this up? Or some way to have the
clients/cluster detect the failure more quickly?
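For what it's worth, here is the procedure I've pieced together from
the docs so far; I'm not sure it's complete, or that the mds fail step
actually avoids the client hang (the FILE1 name is from my own fsmap,
and the beacon-grace bit is just my reading of the config reference):

  # before taking file1 down:
  ceph osd set noout     # keep the cluster from marking its OSD out and rebalancing
  ceph mds fail FILE1    # hand the active MDS role to the standby right away
  # ...reboot, upgrade the kernel, etc...
  # once file1 is back up and rejoined:
  ceph osd unset noout

  # and, for faster detection of an unclean failure, maybe lowering
  # mds_beacon_grace (default 15 s, I believe) on the monitors:
  ceph tell mon.* injectargs '--mds_beacon_grace 10'

Is that the right idea, or is there a better-supported way?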
My setup is as follows. I have three servers: mon1, file1 and
file2. All three boxes are running monitor daemons; file1 and file2
are also running MDS and OSD daemons. My ceph -s output looks like this:
    cluster 8ab75485-8141-44ad-8eee-f92f286515ac
     health HEALTH_OK
     monmap e2: 3 mons at {FILE1=10.1.1.201:6789/0,FILE2=10.1.1.202:6789/0,MON1=10.1.1.90:6789/0}
            election epoch 1384, quorum 0,1,2 MON1,FILE1,FILE2
      fsmap e63874: 1/1/1 up {0=FILE1=up:active}, 1 up:standby
     osdmap e639: 2 osds: 2 up, 2 in
            flags sortbitwise,require_jewel_osds
      pgmap v15312803: 128 pgs, 3 pools, 236 GB data, 729 kobjects
            493 GB used, 406 GB / 899 GB avail
                 128 active+clean
  client io 3243 kB/s rd, 776 kB/s wr, 8 op/s rd, 1 op/s wr
and my clients' fstab entries look like this:
10.1.1.90:6789,10.1.1.201:6789,10.1.1.202:6789:/shared /mnt/shared ceph
noatime,_netdev,name=webdata,secretfile=/etc/ceph/websecret 0 0
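(For reference, that fstab entry corresponds to a manual mount of
roughly this form; the websecret file just holds the key for our
webdata CephX user:)

  mount -t ceph 10.1.1.90:6789,10.1.1.201:6789,10.1.1.202:6789:/shared /mnt/shared \
        -o noatime,name=webdata,secretfile=/etc/ceph/websecret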
Any help would be appreciated.