I am trying to do some similar testing with Samba and CTDB with the Ceph
file system. Are you using the vfs_ceph Samba module, or are you kernel
mounting the Ceph file system?

Thanks,
Eric

On Mon, May 9, 2016 at 9:31 AM, Nick Fisk <nick@xxxxxxxxxx> wrote:
> Hi All,
>
> I've been testing an active/active Samba cluster over CephFS. Performance
> seems really good with small files compared to Gluster, and soft reboots
> work beautifully with little to no interruption in file access. However,
> when I perform a hard shutdown/reboot of one of the Samba nodes, the
> remaining node detects that the other node has disappeared but then
> eventually bans itself. If I leave everything for around 5 minutes, CTDB
> unbans itself and everything continues running.
>
> From what I can work out, because the MDS has a stale session from the
> powered-down node, it won't let the remaining node access the CTDB lock
> file (which is also sitting on the CephFS). CTDB, meanwhile, is hammering
> away trying to access the lock file, but since something still holds a
> lock on it, CTDB sees what it thinks is a split-brain scenario and bans
> itself.
>
> I'm guessing the solution is either to reduce the MDS session timeout or
> to increase the time/retries for CTDB, but I'm not sure which is the best
> approach. Does anyone have any ideas?
>
> Nick
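
For reference, the two setups I'm comparing look roughly like this. A
vfs_ceph share in smb.conf would be something along these lines (a sketch
only; the share name, user id and clustering settings are placeholders,
not a tested config):

    [global]
        # required when Samba runs under CTDB
        clustering = yes

    [cephshare]
        path = /
        vfs objects = ceph
        ceph:config_file = /etc/ceph/ceph.conf
        ceph:user_id = samba
        read only = no

The kernel-mount alternative just exports an ordinary mount, e.g.:

    mount -t ceph mon1:6789:/ /mnt/cephfs -o name=samba,secretfile=/etc/ceph/samba.secret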
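
On the MDS side, the two knobs I know of are the session timeout and the
autoclose interval; the 300-second default on autoclose lines up with the
roughly 5 minutes you're seeing before things recover. Lowering them would
look something like this in ceph.conf on the MDS hosts (values here are
only to illustrate, not recommendations):

    [mds]
        # default 60; how long before a client session is considered stale
        mds session timeout = 30
        # default 300; how long before a stale session (and the caps/locks
        # it holds) is automatically closed
        mds session autoclose = 120

To confirm the theory, you could also evict the dead node's session by
hand and see whether CTDB immediately gets the lock back:

    ceph daemon mds.<name> session ls
    ceph daemon mds.<name> session evict <client-id>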
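
On the CTDB side, the ban length is a tunable, and its 300-second default
(RecoveryBanPeriod) would equally explain the 5-minute unban. Shortening it
is straightforward (again just a sketch):

    # check the current value
    ctdb getvar RecoveryBanPeriod
    # shorten the ban so a wrongly banned node comes back sooner
    ctdb setvar RecoveryBanPeriod 60

though that only shortens the ban; it doesn't stop the node banning itself
in the first place, so addressing it from the MDS side seems the cleaner
approach.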