I am trying to do some similar testing with Samba and CTDB with the Ceph
file system. Are you using the vfs_ceph Samba module, or are you kernel
mounting the Ceph file system?

Thanks,
Eric

On Mon, May 9, 2016 at 9:31 AM, Nick Fisk <nick@xxxxxxxxxx> wrote:
> Hi All,
>
> I've been testing an active/active Samba cluster over CephFS. Performance
> seems really good with small files compared to Gluster, and soft reboots
> work beautifully with little to no interruption in file access. However,
> when I perform a hard shutdown/reboot of one of the Samba nodes, the
> remaining node detects that the other node has disappeared but then
> eventually bans itself. If I leave everything for around 5 minutes, CTDB
> unbans itself and everything continues running.
>
> From what I can work out, because the MDS has a stale session from the
> powered-down node, it won't let the remaining node access the CTDB lock
> file (which is also sitting on the CephFS). CTDB, meanwhile, is hammering
> away trying to access the lock file, but since something still holds a
> lock on it, CTDB sees what it thinks is a split-brain scenario and bans
> itself.
>
> I'm guessing the solution is either to reduce the MDS session timeout or
> to increase the time/retries for CTDB, but I'm not sure which is the best
> approach. Does anyone have any ideas?
>
> Nick
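
For reference, the two setups I'm comparing look roughly like this. A
vfs_ceph share in smb.conf would be something along these lines (a sketch
only; the share name, user id and clustering settings are placeholders,
not a tested config):

    [global]
        # required when Samba runs under CTDB
        clustering = yes

    [cephshare]
        path = /
        vfs objects = ceph
        ceph:config_file = /etc/ceph/ceph.conf
        ceph:user_id = samba
        read only = no

The kernel-mount alternative just exports an ordinary mount, e.g.:

    mount -t ceph mon1:6789:/ /mnt/cephfs -o name=samba,secretfile=/etc/ceph/samba.secret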
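
On the MDS side, the two knobs I know of are the session timeout and the
autoclose interval; the 300-second default on autoclose lines up with the
roughly 5 minutes you're seeing before things recover. Lowering them would
look something like this in ceph.conf on the MDS hosts (values here are
only to illustrate, not recommendations):

    [mds]
        # default 60; how long before a client session is considered stale
        mds session timeout = 30
        # default 300; how long before a stale session (and the caps/locks
        # it holds) is automatically closed
        mds session autoclose = 120

To confirm the theory, you could also evict the dead node's session by
hand and see whether CTDB immediately gets the lock back:

    ceph daemon mds.<name> session ls
    ceph daemon mds.<name> session evict <client-id>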
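
On the CTDB side, the ban length is a tunable, and its 300-second default
(RecoveryBanPeriod) would equally explain the 5-minute unban. Shortening it
is straightforward (again just a sketch):

    # check the current value
    ctdb getvar RecoveryBanPeriod
    # shorten the ban so a wrongly banned node comes back sooner
    ctdb setvar RecoveryBanPeriod 60

though that only shortens the ban; it doesn't stop the node banning itself
in the first place, so addressing it from the MDS side seems the cleaner
approach.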