Degraded PGs blocking open()?

Hi all,

I have a three-node Ceph setup: two nodes playing all three roles (OSD, MDS, 
MON), and one acting only as a monitor (which also happens to be the client 
I mount the filesystem from).

I want to achieve high availability by mirroring all data between the two OSDs 
so that everything remains accessible even if one of them goes down. The 
mirroring itself works fine: I see space being consumed on both nodes as I copy 
data onto the file system, and according to `ceph -s` all PGs are in 
active+clean state. If I start reading a big file and then shut down one of the 
(OSD+MDS+MON) nodes, the file can still be read to the end, and the contents 
read back match the original file. Very nice. But if I start reading the file 
while one of the nodes is already down, the read blocks until that node comes 
back up. I can't even kill the reading process with KILL, TERM, or INT.
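In case pool replication settings are relevant here: I have not set any pool 
sizes explicitly, so the defaults apply. For reference, my assumption is that 
the intended behaviour would correspond to something like the following in 
ceph.conf (these values are my guess at what I want, not something I have 
actually configured):

[global]
        ; assumed settings, not present in my actual ceph.conf
        osd pool default size = 2      ; two replicas, one per OSD
        osd pool default min size = 1  ; keep serving I/O with one replica up

If I/O is refused whenever fewer than min_size replicas are available, that 
might explain the blocking, but I am not sure that is what is happening here.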

Am I doing something wrong, did I miss something in the docs, or might this be 
a bug? My ceph.conf is attached below.

Thanks,
-- 
cc

[global]
auth supported = cephx
keyring = /etc/ceph/keyring.$name



[mds]

[mds.0]
host = iscsigw1

[mds.1]
host = iscsigw2



[osd]
osd data = /srv/ceph/osd.$id

[osd.0]
host = iscsigw1

[osd.1]
host = iscsigw2



[mon]
mon data = /srv/ceph/mon.$id

[mon.0]
host = iscsigw1
mon addr = <node1_ip>:6789

[mon.1]
host = iscsigw2
mon addr = <node2_ip>:6789

[mon.cc]
host = cc
mon addr = <node3_ip>:6789
