> -----Original Message-----
> From: Eric Eastman [mailto:eric.eastman@xxxxxxxxxxxxxx]
> Sent: 09 May 2016 23:09
> To: Nick Fisk <nick@xxxxxxxxxx>
> Cc: Ceph Users <ceph-users@xxxxxxxxxxxxxx>
> Subject: Re: CephFS + CTDB/Samba - MDS session timeout on lockfile
>
> On Mon, May 9, 2016 at 3:28 PM, Nick Fisk <nick@xxxxxxxxxx> wrote:
> > Hi Eric,
> >
> >>
> >> I am trying to do some similar testing with SAMBA and CTDB with the
> >> Ceph file system. Are you using the vfs_ceph SAMBA module or are you
> >> kernel mounting the Ceph file system?
> >
> > I'm using the kernel client. I couldn't find any up-to-date information
> > on whether the vfs plugin supported all the necessary bits and pieces.
> >
> > How is your testing coming along? I would be very interested in any
> > findings you may have come across.
> >
> > Nick
>
> I am also using CephFS kernel mounts, with 4 SAMBA gateways. When I write
> a large file (about 2GB) from a SAMBA client to a gateway that is not the
> holder of the CTDB lock file, and then kill that gateway server during the
> write, the IP failover works as expected, and in most cases the file ends
> up being the correct size after the new server finishes writing it, but
> the data is corrupt. The data in the file, from the point of the failover,
> is all zeros.
>
> I thought the issue might be with the kernel mount, so I looked into using
> the SAMBA vfs_ceph module, but I need SAMBA with AD support, and the
> current vfs_ceph module, even in the SAMBA git master version, is lacking
> ACL support for CephFS, as the vfs_ceph.c patches submitted to the SAMBA
> mailing list are not yet available. See:
> https://lists.samba.org/archive/samba-technical/2016-March/113063.html
>
> I tried using a FUSE mount of the CephFS, and it also fails setting ACLs.
> See: http://tracker.ceph.com/issues/15783.
>
> My current status is IP failover is working, but I am seeing data
> corruption on writes to the share when using kernel mounts.
> I am also seeing the issue you reported when I kill the system holding
> the CTDB lock file. Are you verifying your data after each failover?

I must admit you are slightly ahead of me. I was initially just trying to
get hard/soft failover working correctly, but your response has prompted me
to test the scenario you mentioned. I'm seeing slightly different results:
my copy seems to error out when I do a node failover. I'm copying an ISO
from a 2008 server to the CTDB/Samba share, and when I reboot the active
node, the copy pauses for a couple of seconds and then an error box
appears. Clicking "Try Again" several times doesn't let it resume. I need
to do a bit more digging to work out why this is happening. The share
itself does seem to be in a working state while I am clicking "Try Again",
so there is probably some sort of state/session problem.

Do you have multiple VIPs configured on your cluster or just a single IP?
I have just the one at the moment.

>
> Eric

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
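
On the question of verifying data after each failover, here is a minimal sketch of one way to check a copy against its source. It assumes the original file and the copy on the share can both be read from the same machine. The function names and the 1 MiB chunk size are my own choices for illustration and are not part of Samba, CTDB, or Ceph. It compares checksums, reports the offset of the first differing byte, and checks whether the copy is zero-filled from that offset onward, which is the symptom described above.

```python
import hashlib
import os
import tempfile

CHUNK = 1 << 20  # read in 1 MiB chunks (arbitrary choice)


def sha256_of(path):
    """Return the SHA-256 hex digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(CHUNK), b""):
            digest.update(block)
    return digest.hexdigest()


def first_mismatch(src_path, dst_path):
    """Return the byte offset of the first difference, or None if identical."""
    offset = 0
    with open(src_path, "rb") as src, open(dst_path, "rb") as dst:
        while True:
            a = src.read(CHUNK)
            b = dst.read(CHUNK)
            if a != b:
                # Locate the exact differing byte within this chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # One file ended early: they differ at the shorter length.
                return offset + min(len(a), len(b))
            if not a:
                return None  # both files ended with no difference
            offset += len(a)


def zero_filled_from(path, offset):
    """Return True if every byte from offset to end of file is zero."""
    with open(path, "rb") as f:
        f.seek(offset)
        for block in iter(lambda: f.read(CHUNK), b""):
            if block != bytes(len(block)):
                return False
    return True


if __name__ == "__main__":
    # Self-contained demo: simulate a copy that turns to zeros partway through.
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "source.bin")
    dst = os.path.join(workdir, "copy.bin")
    payload = bytes((i % 251) + 1 for i in range(4096))  # no zero bytes
    with open(src, "wb") as f:
        f.write(payload)
    with open(dst, "wb") as f:
        f.write(payload[:1024] + bytes(len(payload) - 1024))
    print("checksums match:", sha256_of(src) == sha256_of(dst))
    bad = first_mismatch(src, dst)
    print("first mismatch at byte:", bad)
    print("zero-filled from there:", zero_filled_from(dst, bad))
```

In a real test the two paths would be the source ISO and the file as read back from the share after the failover. Comparing the mismatch offset with the time of the failover may help tell whether the corruption starts exactly at the switch.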