Still trying with no success:

Sage and Ronnie:
I've tried the ping_pong tool, even with "locking=no" in my smb.conf (no difference):

# ping_pong /mnt/ceph/samba-cluster/test 3

I get about 180 locks/second. If I start the same command from the other node, the tool stops completely: 0 locks/second.

Sage, when I start the CTDB service, the mds log says every second:

2013-03-29 16:49:34.442437 7f33fe6f3700 0 mds.0.server handle_client_file_setlock: start: 0, length: 0, client: 5475, pid: 14795, type: 4

2013-03-29 16:49:35.440856 7f33fe6f3700 0 mds.0.server handle_client_file_setlock: start: 0, length: 0, client: 5475, pid: 14799, type: 4

Exactly as you see it: with a blank line in between.

When I start the ping_pong command, I get these lines at the same rate the tool reports (180 lines/second):

2013-03-29 17:07:50.277003 7f33fe6f3700 0 mds.0.server handle_client_file_setlock: start: 2, length: 1, client: 5481, pid: 11011, type: 2
2013-03-29 17:07:50.281279 7f33fe6f3700 0 mds.0.server handle_client_file_setlock: start: 1, length: 1, client: 5481, pid: 11011, type: 4
2013-03-29 17:07:50.286643 7f33fe6f3700 0 mds.0.server handle_client_file_setlock: start: 0, length: 1, client: 5481, pid: 11011, type: 2

Finally, I've tried lowering ctdb's RecoverBanPeriod, but the clients were unable to recover for 5 minutes (again!). So I've found the mds logging this:

2013-03-29 16:55:23.354854 7f33fc4ed700 0 log [INF] : closing stale session client.5475 192.168.130.11:0/580042840 after 300.159862

I hope to find a solution. I am at your disposal for further investigation.

--
Marco Aroldi

2013/3/29 ronnie sahlberg <ronniesahlberg@xxxxxxxxx>:
> The ctdb package comes with a tool "ping_pong" that is used to test
> and exercise fcntl() locking.
>
> I think a good test is using this tool and then randomly power-cycling
> nodes in your fs cluster, making sure that:
> 1. fcntl() locking is still coherent and correct
> 2. it always recovers within 20 seconds for a single-node power cycle
>
> That is probably a good test for CIFS serving.
>
>
> On Thu, Mar 28, 2013 at 6:22 PM, ronnie sahlberg
> <ronniesahlberg@xxxxxxxxx> wrote:
>> On Thu, Mar 28, 2013 at 6:09 PM, Sage Weil <sage@xxxxxxxxxxx> wrote:
>>> On Thu, 28 Mar 2013, ronnie sahlberg wrote:
>>>> Disable the recovery lock file from ctdb completely.
>>>> And disable fcntl locking from samba.
>>>>
>>>> To be blunt, unless your cluster filesystem is called GPFS,
>>>> locking is probably completely broken and should be avoided.
>>>
>>> Ha!
>>>
>>>> On Thu, Mar 28, 2013 at 8:46 AM, Marco Aroldi <marco.aroldi@xxxxxxxxx> wrote:
>>>> > Thanks for the answer,
>>>> >
>>>> > I haven't yet looked at the samba.git clone, sorry. I will.
>>>> >
>>>> > Just a quick report on my test environment:
>>>> > * cephfs mounted with the kernel driver, re-exported from 2 samba nodes
>>>> > * If "node B" goes down, everything works like a charm: "node A" does
>>>> > the ip takeover and brings up "node B"'s ip
>>>> > * Instead, if "node A" goes down, "node B" can't take the rlock file
>>>> > and gives this error:
>>>> >
>>>> > ctdb_recovery_lock: Failed to get recovery lock on
>>>> > '/mnt/ceph/samba-cluster/rlock'
>>>> > Unable to get recovery lock - aborting recovery and ban ourself for 300 seconds
>>>> >
>>>> > * So, for 5 minutes, neither "node A" nor "node B" is active. After
>>>> > that, the cluster recovers correctly.
>>>> > It seems that one of the 2 nodes "owns" the rlock file and doesn't
>>>> > want to "release" it.
>>>
>>> Cephfs aims to give you coherent access between nodes. The cost of that
>>> is that if another client goes down while it holds some lease/lock, you have
>>> to wait for it to time out. That is supposed to happen after 60 seconds;
>>> it sounds like you've hit a bug here.
>>> The flock/fcntl locks aren't
>>> super well tested in the failure scenarios.
>>>
>>> Even assuming it were working, though, I'm not sure that you want to wait
>>> the 60 seconds either for the CTDBs to take over for each other.
>>
>> You do not want to wait 60 seconds. That is approaching territory where
>> CIFS clients will start causing file corruption and data loss due to
>> them dropping writeback caches.
>>
>> You probably want to aim to guarantee that fcntl() locking
>> starts working again after ~20 seconds or so, to have some headroom.
>>
>>
>> Microsoft themselves state 25 seconds as the absolute deadline they
>> require you to guarantee before they will qualify storage.
>> That is, among other things, to accommodate and leave some headroom for
>> some really nasty data-loss issues that will
>> happen if storage cannot recover quickly enough.
>>
>>
>> CIFS is hard realtime. And you will pay dearly for missing the deadline.
>>
>>
>> regards
>> ronnie sahlberg

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com