Hi Gregory and Xiubo, we have a smoking gun. The error shows up when using python's shutil.copy function. It affects newer versions of python3. Here some test results (quoted e-mail from our user): > I now have a minimal example that reproduces the error: > > echo test > myfile.txt > ml Python/3.10.4-GCCcore-11.3.0-bare > python -c "import shutil; shutil.copy('myfile.txt','myfile2.txt')" > ls -l | grep myfile > srun --partition fatq --time 01:00:00 --nodes 1 --exclusive --pty bash > ls -l | grep myfile > > The problem occurs when I copy a file via python (Python/3.10.4-GCCcore-11.3.0-bare) > The problems does not occur when running the script with the default python > > [user1@host1 ~]$ echo test > myfile.txt > [user1@host1 ~]$ ml Python/3.10.4-GCCcore-11.3.0-bare > [user1@host1 ~]$ python -c "import shutil; shutil.copy('myfile.txt','myfile2.txt')" > [user1@host1 ~]$ ls -l | grep myfile > -rw-rw---- 1 user1 user1 5 Nov 3 07:23 myfile2.txt > -rw-rw---- 1 user1 user1 5 Nov 3 07:23 myfile.txt > [user1@host1 ~]$ srun --partition fatq --time 01:00:00 --nodes 1 --exclusive --pty bash > [user1@host3 ~]$ ls -l | grep myfile > -rw-rw---- 1 user1 user1 0 Nov 3 07:23 myfile2.txt > -rw-rw---- 1 user1 user1 5 Nov 3 07:23 myfile.txt > Just tried with some other python versions > > Ok > Python/3.6.4-intel-2018a > Python/3.7.4-GCCcore-8.3.0 > > Fail > Python/3.8.2-GCCcore-9.3.0 > Python/3.9.5-GCCcore-10.3.0-bare > Python/3.9.5-GCCcore-10.3.0 > Python/3.10.4-GCCcore-11.3.0-bare > Python/3.10.4-GCCcore-11.3.0 > Python/3.10.8-GCCcore-12.2.0-bare These are easybuild python modules using different gcc versions to build. The default version of python referred to is Python 2.7.5. Is this a known problem with python3 and is there a patch we can apply? I wonder how python manages to break the file system so consistently. Thanks and best regards, ================= Frank Schilder AIT Risø Campus Bygning 109, rum S14 ________________________________________ From: Frank Schilder <frans@xxxxxx> Sent: Thursday, November 2, 2023 12:12 PM To: Gregory Farnum; Xiubo Li Cc: ceph-users@xxxxxxx Subject: Re: ceph fs (meta) data inconsistent Hi all, the problem re-appeared in the following way. After moving the problematic folder out and copying it back, all files showed the correct sizes. Today, we observe that the issue is back in the copy that was fine yesterday: [user1@host11 h2lib]$ ls -l total 37198 -rw-rw---- 1 user1 user1 11641 Nov 2 11:32 dll_wrapper.py [user2@host2 h2lib]# ls -l total 44 -rw-rw----. 1 user1 user1 0 Nov 2 11:28 dll_wrapper.py It is correct in the snapshot though: [user2@host2 h2lib]# cd .snap [user2@host2 .snap]# ls _2023-11-02_000611+0100_daily_1 [user2@host2 .snap]# cd _2023-11-02_000611+0100_daily_1/ [user2@host2 _2023-11-02_000611+0100_daily_1]# ls -l total 37188 -rw-rw----. 1 user1 user1 11641 Nov 1 13:30 dll_wrapper.py It seems related to the path, bot the inode number. Could the re-appearance of the 0 length have been triggered by taking the snapshot? We plan to reboot later today the server where the file was written. Until hen we can do diagnostics while the issue is visible. Please let us know what information we can provide. Best regards, ================= Frank Schilder AIT Risø Campus Bygning 109, rum S14 ________________________________________ From: Gregory Farnum <gfarnum@xxxxxxxxxx> Sent: Wednesday, November 1, 2023 4:57 PM To: Frank Schilder; Xiubo Li Cc: ceph-users@xxxxxxx Subject: Re: ceph fs (meta) data inconsistent We have seen issues like this a few times and they have all been kernel client bugs with CephFS’ internal “capability” file locking protocol. I’m not aware of any extant bugs like this in our code base, but kernel patches can take a long and winding path before they end up on deployed systems. Most likely, if you were to restart some combination of the client which wrote the file and the client(s) reading it, the size would propagate correctly. As long as you’ve synced the data, it’s definitely present in the cluster. Adding Xiubo, who has worked on these and may have other comments. -Greg On Wed, Nov 1, 2023 at 7:16 AM Frank Schilder <frans@xxxxxx<mailto:frans@xxxxxx>> wrote: Dear fellow cephers, today we observed a somewhat worrisome inconsistency on our ceph fs. A file created on one host showed up as 0 length on all other hosts: [user1@host1 h2lib]$ ls -lh total 37M -rw-rw---- 1 user1 user1 12K Nov 1 11:59 dll_wrapper.py [user2@host2 h2lib]# ls -l total 34 -rw-rw----. 1 user1 user1 0 Nov 1 11:59 dll_wrapper.py [user1@host1 h2lib]$ cp dll_wrapper.py dll_wrapper.py.test [user1@host1 h2lib]$ ls -l total 37199 -rw-rw---- 1 user1 user1 11641 Nov 1 11:59 dll_wrapper.py -rw-rw---- 1 user1 user1 11641 Nov 1 13:10 dll_wrapper.py.test [user2@host2 h2lib]# ls -l total 45 -rw-rw----. 1 user1 user1 0 Nov 1 11:59 dll_wrapper.py -rw-rw----. 1 user1 user1 11641 Nov 1 13:10 dll_wrapper.py.test Executing a sync on all these hosts did not help. However, deleting the problematic file and replacing it with a copy seemed to work around the issue. We saw this with ceph kclients of different versions, it seems to be on the MDS side. How can this happen and how dangerous is it? ceph fs status (showing ceph version): # ceph fs status con-fs2 - 1662 clients ======= RANK STATE MDS ACTIVITY DNS INOS 0 active ceph-15 Reqs: 14 /s 2307k 2278k 1 active ceph-11 Reqs: 159 /s 4208k 4203k 2 active ceph-17 Reqs: 3 /s 4533k 4501k 3 active ceph-24 Reqs: 3 /s 4593k 4300k 4 active ceph-14 Reqs: 1 /s 4228k 4226k 5 active ceph-13 Reqs: 5 /s 1994k 1782k 6 active ceph-16 Reqs: 8 /s 5022k 4841k 7 active ceph-23 Reqs: 9 /s 4140k 4116k POOL TYPE USED AVAIL con-fs2-meta1 metadata 2177G 7085G con-fs2-meta2 data 0 7085G con-fs2-data data 1242T 4233T con-fs2-data-ec-ssd data 706G 22.1T con-fs2-data2 data 3409T 3848T STANDBY MDS ceph-10 ceph-08 ceph-09 ceph-12 MDS version: ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable) There is no health issue: # ceph status cluster: id: abc health: HEALTH_WARN 3 pgs not deep-scrubbed in time services: mon: 5 daemons, quorum ceph-01,ceph-02,ceph-03,ceph-25,ceph-26 (age 9w) mgr: ceph-25(active, since 7w), standbys: ceph-26, ceph-01, ceph-03, ceph-02 mds: con-fs2:8 4 up:standby 8 up:active osd: 1284 osds: 1279 up (since 2d), 1279 in (since 5d) task status: data: pools: 14 pools, 25065 pgs objects: 2.20G objects, 3.9 PiB usage: 4.9 PiB used, 8.2 PiB / 13 PiB avail pgs: 25039 active+clean 26 active+clean+scrubbing+deep io: client: 799 MiB/s rd, 55 MiB/s wr, 3.12k op/s rd, 1.82k op/s wr The inconsistency seems undiagnosed, I couldn't find anything interesting in the cluster log. What should I look for and where? I moved the folder to another location for diagnosis. Unfortunately, I don't have 2 clients any more showing different numbers, I see a 0 length now everywhere for the moved folder. I'm pretty sure though that the file still is non-zero length. Thanks for any pointers. ================= Frank Schilder AIT Risø Campus Bygning 109, rum S14 _______________________________________________ ceph-users mailing list -- ceph-users@xxxxxxx<mailto:ceph-users@xxxxxxx> To unsubscribe send an email to ceph-users-leave@xxxxxxx<mailto:ceph-users-leave@xxxxxxx> _______________________________________________ ceph-users mailing list -- ceph-users@xxxxxxx To unsubscribe send an email to ceph-users-leave@xxxxxxx _______________________________________________ ceph-users mailing list -- ceph-users@xxxxxxx To unsubscribe send an email to ceph-users-leave@xxxxxxx