This sounds an awful lot like a bug I've run into a few times (not often
enough to get a good backtrace out of the kernel or MDS) involving vim on a
symlink to a file in another directory. It will occasionally corrupt the
symlink in such a way that the symlink is unreadable, filling dmesg with:

[ 2368.036667] ceph: fill_inode badness on ffff8800bb5fb610
[ 2368.969657] ------------[ cut here ]------------
[ 2368.969670] WARNING: CPU: 0 PID: 15 at fs/ceph/inode.c:813 fill_inode.isra.19+0x4b1/0xa49()
[ 2368.969672] Modules linked in:
[ 2368.969684] CPU: 0 PID: 15 Comm: kworker/0:1 Tainted: G W 4.5.0-gentoo #1
[ 2368.969686] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[ 2368.969693] Workqueue: ceph-msgr ceph_con_workfn
[ 2368.969695] 0000000000000286 000000007000a7b9 ffff88017e267af0 ffffffffb142ec39
[ 2368.969698] 0000000000000000 0000000000000009 ffff88017e267b28 ffffffffb1091c83
[ 2368.969700] ffffffffb13be512 ffffc900020da8cd ffff880427a30230 ffffffffffffffff
[ 2368.969704] Call Trace:
[ 2368.969709] [<ffffffffb142ec39>] dump_stack+0x63/0x7f
[ 2368.969714] [<ffffffffb1091c83>] warn_slowpath_common+0x9a/0xb3
[ 2368.969717] [<ffffffffb13be512>] ? fill_inode.isra.19+0x4b1/0xa49
[ 2368.969719] [<ffffffffb1091d86>] warn_slowpath_null+0x15/0x17
[ 2368.969722] [<ffffffffb13be512>] fill_inode.isra.19+0x4b1/0xa49
[ 2368.969724] [<ffffffffb13bca00>] ? ceph_mount+0x729/0x72e
[ 2368.969727] [<ffffffffb13bf705>] ceph_readdir_prepopulate+0x48f/0x70c
[ 2368.969730] [<ffffffffb13daac3>] dispatch+0xebf/0x1428
[ 2368.969752] [<ffffffffb19098f2>] ? ceph_x_check_message_signature+0x42/0xc4
[ 2368.969756] [<ffffffffb18fa16e>] ceph_con_workfn+0xe1a/0x24f3
[ 2368.969759] [<ffffffffb104603a>] ? load_TLS+0xb/0xf
[ 2368.969761] [<ffffffffb10468f9>] ? __switch_to+0x3b0/0x42b
[ 2368.969765] [<ffffffffb10afd8f>] ? finish_task_switch+0xff/0x191
[ 2368.969768] [<ffffffffb10a53b3>] process_one_work+0x175/0x2a0
[ 2368.969770] [<ffffffffb10a59c8>] worker_thread+0x1fc/0x2ae
[ 2368.969772] [<ffffffffb10a57cc>] ? rescuer_thread+0x2c0/0x2c0
[ 2368.969775] [<ffffffffb10a9c4b>] kthread+0xaf/0xb7
[ 2368.969777] [<ffffffffb10a9b9c>] ? kthread_parkme+0x1f/0x1f
[ 2368.969780] [<ffffffffb192620f>] ret_from_fork+0x3f/0x70
[ 2368.969782] [<ffffffffb10a9b9c>] ? kthread_parkme+0x1f/0x1f
[ 2368.969784] ---[ end trace b054c5c6854fd2ab ]---
[ 2368.969786] ceph: fill_inode badness on ffff880428185d70
[ 2370.289733] ------------[ cut here ]------------
[ 2370.289747] WARNING: CPU: 0 PID: 15 at fs/ceph/inode.c:813 fill_inode.isra.19+0x4b1/0xa49()
[ 2370.289750] Modules linked in:
[ 2370.289756] CPU: 0 PID: 15 Comm: kworker/0:1 Tainted: G W 4.5.0-gentoo #1
[ 2370.289759] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[ 2370.289767] Workqueue: ceph-msgr ceph_con_workfn
[ 2370.289769] 0000000000000286 000000007000a7b9 ffff88017e267af0 ffffffffb142ec39
[ 2370.289774] 0000000000000000 0000000000000009 ffff88017e267b28 ffffffffb1091c83
[ 2370.289777] ffffffffb13be512 ffffc900020f58cd ffff880427a30230 ffffffffffffffff
[ 2370.289781] Call Trace:
[ 2370.289787] [<ffffffffb142ec39>] dump_stack+0x63/0x7f
[ 2370.289793] [<ffffffffb1091c83>] warn_slowpath_common+0x9a/0xb3
[ 2370.289797] [<ffffffffb13be512>] ? fill_inode.isra.19+0x4b1/0xa49
[ 2370.289801] [<ffffffffb1091d86>] warn_slowpath_null+0x15/0x17
[ 2370.289804] [<ffffffffb13be512>] fill_inode.isra.19+0x4b1/0xa49
[ 2370.289807] [<ffffffffb13bca00>] ? ceph_mount+0x729/0x72e
[ 2370.289811] [<ffffffffb13bf705>] ceph_readdir_prepopulate+0x48f/0x70c
[ 2370.289815] [<ffffffffb13daac3>] dispatch+0xebf/0x1428
[ 2370.289821] [<ffffffffb19098f2>] ? ceph_x_check_message_signature+0x42/0xc4
[ 2370.289824] [<ffffffffb18fa16e>] ceph_con_workfn+0xe1a/0x24f3
[ 2370.289829] [<ffffffffb104603a>] ? load_TLS+0xb/0xf
[ 2370.289832] [<ffffffffb10468f9>] ? __switch_to+0x3b0/0x42b
[ 2370.289837] [<ffffffffb10afd8f>] ? finish_task_switch+0xff/0x191
[ 2370.289841] [<ffffffffb10a53b3>] process_one_work+0x175/0x2a0
[ 2370.289843] [<ffffffffb10a59c8>] worker_thread+0x1fc/0x2ae
[ 2370.289846] [<ffffffffb10a57cc>] ? rescuer_thread+0x2c0/0x2c0
[ 2370.289849] [<ffffffffb10a9c4b>] kthread+0xaf/0xb7
[ 2370.289853] [<ffffffffb10a9b9c>] ? kthread_parkme+0x1f/0x1f
[ 2370.289857] [<ffffffffb192620f>] ret_from_fork+0x3f/0x70
[ 2370.289860] [<ffffffffb10a9b9c>] ? kthread_parkme+0x1f/0x1f
[ 2370.289863] ---[ end trace b054c5c6854fd2ac ]---
[ 2370.289865] ceph: fill_inode badness on ffff880428185d70
[ 2371.525649] ------------[ cut here ]------------
[ 2371.525663] WARNING: CPU: 0 PID: 15 at fs/ceph/inode.c:813 fill_inode.isra.19+0x4b1/0xa49()
[ 2371.525665] Modules linked in:
[ 2371.525670] CPU: 0 PID: 15 Comm: kworker/0:1 Tainted: G W 4.5.0-gentoo #1
[ 2371.525672] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[ 2371.525679] Workqueue: ceph-msgr ceph_con_workfn
[ 2371.525682] 0000000000000286 000000007000a7b9 ffff88017e267af0 ffffffffb142ec39
[ 2371.525685] 0000000000000000 0000000000000009 ffff88017e267b28 ffffffffb1091c83
[ 2371.525687] ffffffffb13be512 ffffc900021108cd ffff880427a30230 ffffffffffffffff
[ 2371.525690] Call Trace:
[ 2371.525696] [<ffffffffb142ec39>] dump_stack+0x63/0x7f
[ 2371.525701] [<ffffffffb1091c83>] warn_slowpath_common+0x9a/0xb3
[ 2371.525704] [<ffffffffb13be512>] ? fill_inode.isra.19+0x4b1/0xa49
[ 2371.525707] [<ffffffffb1091d86>] warn_slowpath_null+0x15/0x17
[ 2371.525740] [<ffffffffb13be512>] fill_inode.isra.19+0x4b1/0xa49
[ 2371.525744] [<ffffffffb13bca00>] ? ceph_mount+0x729/0x72e
[ 2371.525747] [<ffffffffb13bf705>] ceph_readdir_prepopulate+0x48f/0x70c
[ 2371.525751] [<ffffffffb13daac3>] dispatch+0xebf/0x1428
[ 2371.525755] [<ffffffffb19098f2>] ? ceph_x_check_message_signature+0x42/0xc4
[ 2371.525758] [<ffffffffb18fa16e>] ceph_con_workfn+0xe1a/0x24f3
[ 2371.525762] [<ffffffffb104603a>] ? load_TLS+0xb/0xf
[ 2371.525764] [<ffffffffb10468f9>] ? __switch_to+0x3b0/0x42b
[ 2371.525769] [<ffffffffb10afd8f>] ? finish_task_switch+0xff/0x191
[ 2371.525772] [<ffffffffb10a53b3>] process_one_work+0x175/0x2a0
[ 2371.525774] [<ffffffffb10a59c8>] worker_thread+0x1fc/0x2ae
[ 2371.525776] [<ffffffffb10a57cc>] ? rescuer_thread+0x2c0/0x2c0
[ 2371.525779] [<ffffffffb10a9c4b>] kthread+0xaf/0xb7
[ 2371.525782] [<ffffffffb10a9b9c>] ? kthread_parkme+0x1f/0x1f
[ 2371.525786] [<ffffffffb192620f>] ret_from_fork+0x3f/0x70
[ 2371.525788] [<ffffffffb10a9b9c>] ? kthread_parkme+0x1f/0x1f
[ 2371.525790] ---[ end trace b054c5c6854fd2ad ]---

That shows up whenever a readdir is performed on the directory containing
the symlink; all of the symlink's stats go ??????? and it can no longer be
deleted, moved, or otherwise operated on. I believe it involves the
overwrites that vim performs on save (save to a temporary file and move it
over the top of the existing one, I believe). I've seen it on kernels 4.0
through 4.5 so far, possibly even earlier, and on Hammer through
Infernalis; I've not had a chance to test on Jewel.
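In case anyone wants to try to trigger it, here's a rough sketch of the
write-to-a-temp-file-and-rename pattern I suspect is responsible. This is
only my guess at what vim (with default backup settings) and sed -i do
under the hood, and the mount point, directory, and file names below are
made-up placeholders:

#!/bin/bash
# Hypothetical reproducer sketch, not a confirmed test case.
set -e
cd /mnt/cephfs                          # example CephFS kernel-client mount
mkdir -p dirA dirB
printf 'original contents\n' > dirB/file
ln -s ../dirB/file dirA/link            # symlink to a file in another directory
for i in $(seq 1 50); do
    tmp=$(mktemp -p dirB edit.XXXXXX)   # write the new version to a temp file
    printf 'edit %s\n' "$i" > "$tmp"
    mv -f "$tmp" dirB/file              # replace the original via rename()
    ls -la dirA/ > /dev/null            # readdir on the dir holding the symlink
done
# Meanwhile, on a second client, `ls -la` the same directories in a loop and
# watch for ??????? stats and "fill_inode badness" in dmesg.

(The temp-file-plus-rename part is at least what sed -i does -- you can see
its sedXXXXXX temp file in the rename error further down this thread --
while vim's exact behaviour depends on its 'backup'/'writebackup'/
'backupcopy' settings.)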
I'd dump the symlink data out of the metadata pool, but I'm still recovering
from http://tracker.ceph.com/issues/16177
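(For anyone who does want to poke at the raw dentry for a broken symlink,
this is roughly what I'd try. Treat it as an untested sketch: the mount
point and symlink name are placeholders, the pool name is just taken from
the caps quoted below, and I believe the dentry omap keys are named
'<name>_head', but check with listomapkeys first.)

DIR=/mnt/cephfs/dir-containing-the-symlink   # example path on a client mount
POOL=cephfs_metadata                          # your metadata pool name
# Directory objects in the metadata pool are named after the directory's
# inode number in hex; .00000000 is the first (and, if the directory is
# unfragmented, only) fragment. The symlink's target lives in the inode
# embedded in its dentry value.
INO=$(printf '%x' "$(stat -c %i "$DIR")")
rados -p "$POOL" listomapkeys "${INO}.00000000"
rados -p "$POOL" getomapval "${INO}.00000000" "badlink_head" /tmp/dentry.bin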
Not trying to hijack your thread here, though.

--
Adam

On Thu, Jun 16, 2016 at 4:03 PM, Jason Gress <jgress@xxxxxxxxxxxxx> wrote:
> This is the latest default kernel with CentOS 7. We also tried a newer
> kernel (from elrepo), a 4.4 that has the same problem, so I don't think
> that is it. Thank you for the suggestion though.
>
> We upgraded our cluster to the 10.2.2 release today, and it didn't resolve
> all of the issues. It's possible that a related issue is actually
> permissions. Something may not be right with our config (or a bug) here.
>
> While testing we noticed that there may actually be two issues here. I am
> unsure, as we noticed that the most consistent way to reproduce our issue
> is to use vim or sed -i, which does in-place renames:
>
> [root@ftp01 cron]# ls -la
> total 3
> drwx------ 1 root root 2044 Jun 16 15:50 .
> drwxr-xr-x. 10 root root 104 May 19 09:34 ..
> -rw-r--r-- 1 root root 300 Jun 16 15:50 file
> -rw------- 1 root root 2044 Jun 16 13:47 root
> [root@ftp01 cron]# sed -i 's/^/#/' file
> sed: cannot rename ./sedfB2CkO: Permission denied
>
> Strangely, adding or deleting files works fine; it's only renaming that
> fails. And strangely I was able to successfully edit the file on ftp02:
>
> [root@ftp02 cron]# sed -i 's/^/#/' file
> [root@ftp02 cron]# ls -la
> total 3
> drwx------ 1 root root 2044 Jun 16 15:49 .
> drwxr-xr-x. 10 root root 104 May 19 09:34 ..
> -rw-r--r-- 1 root root 313 Jun 16 15:49 file
> -rw------- 1 root root 2044 Jun 16 13:47 root
>
> Then it worked on ftp01 this time:
>
> [root@ftp01 cron]# ls -la
> total 3
> drwx------ 1 root root 2357 Jun 16 15:49 .
> drwxr-xr-x. 10 root root 104 May 19 09:34 ..
> -rw-r--r-- 1 root root 313 Jun 16 15:49 file
> -rw------- 1 root root 2044 Jun 16 13:47 root
>
> Then I vim'd it successfully on ftp01... then ran the sed again:
>
> [root@ftp01 cron]# sed -i 's/^/#/' file
> sed: cannot rename ./sedfB2CkO: Permission denied
> [root@ftp01 cron]# ls -la
> total 3
> drwx------ 1 root root 2044 Jun 16 15:51 .
> drwxr-xr-x. 10 root root 104 May 19 09:34 ..
> -rw-r--r-- 1 root root 300 Jun 16 15:50 file
> -rw------- 1 root root 2044 Jun 16 13:47 root
>
> And now we have the zero-byte file problem again:
>
> [root@ftp02 cron]# ls -la
> total 2
> drwx------ 1 root root 2044 Jun 16 15:51 .
> drwxr-xr-x. 10 root root 104 May 19 09:34 ..
> -rw-r--r-- 1 root root 0 Jun 16 15:50 file
> -rw------- 1 root root 2044 Jun 16 13:47 root
>
> Anyway, I wonder how much of this issue is related to that "cannot rename"
> issue above. Here are our security settings:
>
> client.ftp01
>         key: <redacted>
>         caps: [mds] allow r, allow rw path=/ftp
>         caps: [mon] allow r
>         caps: [osd] allow rw pool=cephfs_metadata, allow rw pool=cephfs_data
> client.ftp02
>         key: <redacted>
>         caps: [mds] allow r, allow rw path=/ftp
>         caps: [mon] allow r
>         caps: [osd] allow rw pool=cephfs_metadata, allow rw pool=cephfs_data
>
> /ftp is the directory on cephfs under which cron lives; the full path is
> /ftp/cron .
>
> I hope this helps and thank you for your time!
>
> Jason
>
> On 6/15/16, 4:43 PM, "John Spray" <jspray@xxxxxxxxxx> wrote:
>
>> On Wed, Jun 15, 2016 at 10:21 PM, Jason Gress <jgress@xxxxxxxxxxxxx> wrote:
>>> While trying to use CephFS as a clustered filesystem, we stumbled upon a
>>> reproducible bug that is unfortunately pretty serious, as it leads to
>>> data loss. Here is the situation:
>>>
>>> We have two systems, named ftp01 and ftp02. They are both running
>>> CentOS 7.2, with this kernel release and ceph packages:
>>>
>>> kernel-3.10.0-327.18.2.el7.x86_64
>>
>> That is an old-ish kernel to be using with cephfs. It may well be the
>> source of your issues.
>>
>>> [root@ftp01 cron]# rpm -qa | grep ceph
>>> ceph-base-10.2.1-0.el7.x86_64
>>> ceph-deploy-1.5.33-0.noarch
>>> ceph-mon-10.2.1-0.el7.x86_64
>>> libcephfs1-10.2.1-0.el7.x86_64
>>> ceph-selinux-10.2.1-0.el7.x86_64
>>> ceph-mds-10.2.1-0.el7.x86_64
>>> ceph-common-10.2.1-0.el7.x86_64
>>> ceph-10.2.1-0.el7.x86_64
>>> python-cephfs-10.2.1-0.el7.x86_64
>>> ceph-osd-10.2.1-0.el7.x86_64
>>>
>>> Mounted like so:
>>> XX.XX.XX.XX:/ftp/cron /var/spool/cron ceph
>>> _netdev,relatime,name=ftp01,secretfile=/etc/ceph/ftp01.secret 0 0
>>> And:
>>> XX.XX.XX.XX:/ftp/cron /var/spool/cron ceph
>>> _netdev,relatime,name=ftp02,secretfile=/etc/ceph/ftp02.secret 0 0
>>>
>>> This filesystem has 234GB worth of data on it, and I created another
>>> subdirectory and mounted it, NFS style.
>>>
>>> Here were the steps to reproduce:
>>>
>>> First, I created a file (I was mounting /var/spool/cron on two systems)
>>> on ftp01:
>>> (crond is not running right now on either system to keep the variables
>>> down)
>>>
>>> [root@ftp01 cron]# cp /tmp/root .
>>>
>>> Shows up on both fine:
>>> [root@ftp01 cron]# ls -la
>>> total 2
>>> drwx------ 1 root root 0 Jun 15 15:50 .
>>> drwxr-xr-x. 10 root root 104 May 19 09:34 ..
>>> -rw------- 1 root root 2043 Jun 15 15:50 root
>>> [root@ftp01 cron]# md5sum root
>>> 0636c8deaeadfea7b9ddaa29652b43ae root
>>>
>>> [root@ftp02 cron]# ls -la
>>> total 2
>>> drwx------ 1 root root 2043 Jun 15 15:50 .
>>> drwxr-xr-x. 10 root root 104 May 19 09:34 ..
>>> -rw------- 1 root root 2043 Jun 15 15:50 root
>>> [root@ftp02 cron]# md5sum root
>>> 0636c8deaeadfea7b9ddaa29652b43ae root
>>>
>>> Now, I vim the file on one of them:
>>> [root@ftp01 cron]# vim root
>>> [root@ftp01 cron]# ls -la
>>> total 2
>>> drwx------ 1 root root 0 Jun 15 15:51 .
>>> drwxr-xr-x. 10 root root 104 May 19 09:34 ..
>>> -rw------- 1 root root 2044 Jun 15 15:50 root
>>> [root@ftp01 cron]# md5sum root
>>> 7a0c346bbd2b61c5fe990bb277c00917 root
>>>
>>> [root@ftp02 cron]# md5sum root
>>> 7a0c346bbd2b61c5fe990bb277c00917 root
>>>
>>> So far so good, right? Then, a few seconds later:
>>>
>>> [root@ftp02 cron]# ls -la
>>> total 0
>>> drwx------ 1 root root 0 Jun 15 15:51 .
>>> drwxr-xr-x. 10 root root 104 May 19 09:34 ..
>>> -rw------- 1 root root 0 Jun 15 15:50 root
>>> [root@ftp02 cron]# cat root
>>> [root@ftp02 cron]# md5sum root
>>> d41d8cd98f00b204e9800998ecf8427e root
>>>
>>> And on ftp01:
>>>
>>> [root@ftp01 cron]# ls -la
>>> total 2
>>> drwx------ 1 root root 0 Jun 15 15:51 .
>>> drwxr-xr-x. 10 root root 104 May 19 09:34 ..
>>> -rw------- 1 root root 2044 Jun 15 15:50 root
>>> [root@ftp01 cron]# md5sum root
>>> 7a0c346bbd2b61c5fe990bb277c00917 root
>>>
>>> I later create a 'root2' on ftp02 and cause a similar issue. The end
>>> results are two non-matching files:
>>>
>>> [root@ftp01 cron]# ls -la
>>> total 2
>>> drwx------ 1 root root 0 Jun 15 15:53 .
>>> drwxr-xr-x. 10 root root 104 May 19 09:34 ..
>>> -rw------- 1 root root 2044 Jun 15 15:50 root
>>> -rw-r--r-- 1 root root 0 Jun 15 15:53 root2
>>>
>>> [root@ftp02 cron]# ls -la
>>> total 2
>>> drwx------ 1 root root 0 Jun 15 15:53 .
>>> drwxr-xr-x. 10 root root 104 May 19 09:34 ..
>>> -rw------- 1 root root 0 Jun 15 15:50 root
>>> -rw-r--r-- 1 root root 1503 Jun 15 15:53 root2
>>>
>>> We were able to reproduce this on two other systems with the same cephfs
>>> filesystem. I have also seen cases where the file would just blank out on
>>> both as well.
>>>
>>> We could not reproduce it with our dev/test cluster running the development
>>> ceph version:
>>>
>>> ceph-10.2.2-1.g502540f.el7.x86_64
>>
>> Strange. In that cluster, was the same 3.x kernel in use? There
>> aren't a whole lot of changes on the server side in v10.2.2 that I
>> could imagine affecting this case.
>>
>> The best thing to do right now is to try using ceph-fuse in your
>> production environment, to check that it is not exhibiting the same
>> behaviour as the old kernel client. Once you confirm that, I would
>> recommend upgrading your kernel to the most recent 4.x that you are
>> comfortable with, and confirm that that also does not exhibit the bad
>> behaviour.
>>
>> John
>>
>>> Is this a known bug with the current production Jewel release? If so, will
>>> it be patched in the next release?
>>>
>>> Thank you very much,
>>>
>>> Jason Gress
>
> "This message and any attachments may contain confidential information. If you
> have received this message in error, any use or distribution is prohibited.
> Please notify us by reply e-mail if you have mistakenly received this message,
> and immediately and permanently delete it and any attachments. Thank you."

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com