> On Sep 13, 2016, at 07:56, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>
> On Wed, Sep 7, 2016 at 11:35 PM, Xusangdi <xu.sangdi@xxxxxxx> wrote:
>> Hi Cephers,
>>
>> We encountered a problem when using CephFS + Samba, which fails the reconnection phase of MDS respawn.
>> Reproduce steps:
>> 1. kernel mount CephFS to a Samba server
>> 2. re-export the mount point by Samba
>> 3. connect to Samba server from a Windows 7 client, and copy a large file (4GB) to the shared directory
>> 4. during copy process, restart the active (and the only one) MDS
>> 5. MDS then gives up reconnecting to the kernel client after timeout
>> As a result, all client requests will hang for like forever :<
>>
>> I did a few extra tests, which proved that this issue will not occur when using kernel client directly nor via
>> NFS re-export. From the syslog I found the following error (with dynamic debug enabled):
>>
>> Sep 6 20:34:41 trusty81 kernel: [465858.676638] ceph: mds0 caps stale
>> Sep 6 20:34:41 trusty81 kernel: [465859.123780] ceph: mds0 reconnect start
>> Sep 6 20:34:41 trusty81 kernel: [465859.125113] ceph: session ffff8801121f7000 state reconnecting
>> Sep 6 20:34:41 trusty81 kernel: [465859.126306] ceph: counted 0 flock locks and 0 fcntl locks
>> Sep 6 20:34:41 trusty81 kernel: [465859.126349] ceph: encoding 0 flock and 0 fcntl locksceph: counted 1 flock locks and 0 fcntl locks
>> Sep 6 20:34:41 trusty81 kernel: [465859.128575] ceph: encoding 1 flock and 0 fcntl locksceph: Have unknown lock type 32
>> Sep 6 20:34:41 trusty81 kernel: [465859.129795] ceph: error -22 preparing reconnect for mds0
>>
>> It looks like the CIFS workload generates an invalid lock type, but I’m not sure about this. Any suggestions?
>
> That's pretty weird. Looks to me like it's just reading data out of
> the inode passed in, and that's somehow corrupted. Zheng, do you have
> any idea?

CIFS uses mandatory flock, which ceph does not support. The check in ceph_flock() is buggy.
Fixed by https://github.com/ceph/ceph-client/commit/77309a116cbee5a3a29ccd63f8d80c127180d923

Regards
Yan, Zheng

> -Greg
>
>>
>> PS:
>> 1. Samba version: 4.3.9, kernel version: 3.19.0-25-generic
>> 2. I also tried a newer kernel (4.4.0-31-generic), but with no luck
>> Feb 11 11:41:52 xerus101 kernel: [ 836.960441] ceph: mds0 reconnect start
>> Feb 11 11:41:52 xerus101 kernel: [ 836.960494] ceph: error -22 preparing reconnect for mds0
>>
>> Regards,
>> ---Sandy
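
A note on the "unknown lock type 32" message in the log: 32 is the value of LOCK_MAND in the Linux headers, which is consistent with the diagnosis that a mandatory flock reached the reconnect path. The Python sketch below is only an illustration of that reasoning, not the kernel code. The numeric constants are copied from asm-generic/fcntl.h (x86 values); is_mandatory_flock and translate_lock_type are hypothetical helper names that model a translation step accepting only the three POSIX lock types. Whether the linked commit uses a check of exactly this shape is not confirmed here.

```python
# Lock-related constants as defined in the Linux UAPI headers
# (asm-generic/fcntl.h, values as used on x86).
F_RDLCK, F_WRLCK, F_UNLCK = 0, 1, 2
LOCK_MAND, LOCK_READ, LOCK_WRITE = 32, 64, 128
EINVAL = 22

# Assumed model: only the three POSIX lock types have a Ceph equivalent.
_CEPH_LOCK_NAMES = {
    F_RDLCK: "CEPH_LOCK_SHARED",
    F_WRLCK: "CEPH_LOCK_EXCL",
    F_UNLCK: "CEPH_LOCK_UNLOCK",
}


def is_mandatory_flock(fl_type):
    """Return True if the lock type carries the LOCK_MAND bit."""
    return bool(fl_type & LOCK_MAND)


def translate_lock_type(fl_type):
    """Return (ceph_name, 0) on success or (None, -EINVAL) for unknown types."""
    name = _CEPH_LOCK_NAMES.get(fl_type)
    if name is None:
        return None, -EINVAL
    return name, 0


if __name__ == "__main__":
    for fl_type in (F_RDLCK, F_WRLCK, LOCK_MAND):
        print(fl_type, is_mandatory_flock(fl_type), translate_lock_type(fl_type))
```

In this model a lock type of 32 is flagged as mandatory and fails translation with -22, the same error code the kernel reports while preparing the reconnect. Rejecting such a lock when it is first requested, rather than recording it, would keep it out of the reconnect message altogether.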