CephFS + CTDB/Samba - MDS session timeout on lockfile

Hi All,

I've been testing an active/active Samba cluster over CephFS. Performance
seems really good with small files compared to Gluster, and soft reboots
work beautifully with little to no interruption in file access. However,
when I perform a hard shutdown/reboot of one of the Samba nodes, the
remaining node detects that the other Samba node has disappeared but then
eventually bans itself. If I leave everything for around 5 minutes, CTDB
unbans itself and everything continues running.

From what I can work out, because the MDS still has a stale session from
the powered-down node, it won't let the remaining node access the CTDB lock
file (which also sits on the CephFS). CTDB meanwhile is hammering away
trying to access the lock file, but it sees what it thinks is a split-brain
scenario because something still holds a lock on the lockfile, and so it
bans itself.

I'm guessing the solution is either to reduce the MDS session timeout or
to increase the time/retries CTDB allows for taking the lock, but I'm not
sure which is the best approach. Does anyone have any ideas?
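
To put a number on how long the lock stays unavailable after a hard
shutdown, a small probe along these lines could be run on the surviving
node. This is only a rough sketch: it assumes CTDB takes an fcntl-style
lock on the lock file (which I believe is what it does by default), and
the function names, the timeout, and the polling interval are all made up
for illustration.

```python
import errno
import fcntl
import os
import time


def try_lock(fd):
    """Attempt a non-blocking exclusive fcntl lock; True if we got it."""
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError as exc:
        # EACCES/EAGAIN mean another holder still owns the lock.
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return False
        raise


def wait_for_lock(path, timeout=600.0, interval=1.0):
    """Poll the lock until it is acquired or the timeout expires.

    Returns (fd, elapsed_seconds) on success, (None, elapsed_seconds)
    on timeout. The caller keeps the lock until fd is closed.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    start = time.monotonic()
    while True:
        if try_lock(fd):
            return fd, time.monotonic() - start
        elapsed = time.monotonic() - start
        if elapsed >= timeout:
            os.close(fd)
            return None, elapsed
        time.sleep(interval)
```

Calling wait_for_lock() on a scratch file in the same CephFS directory
(not the live CTDB lock file) right after powering off the node that held
it should report roughly how long the MDS keeps the dead client's lock,
which can then be compared against whatever timeout CTDB is using.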

Nick

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


