On Sun, Jan 16, 2022 at 8:28 PM Frank Schilder <frans@xxxxxx> wrote:
>
> I seem to have a problem. I cannot dump the mds tree:
>
> [root@ceph-08 ~]# ceph daemon mds.ceph-08 dump tree '~mdsdir/stray0'
> root inode is not in cache
> [root@ceph-08 ~]# ceph daemon mds.ceph-08 dump tree '~mds0/stray0'
> root inode is not in cache
> [root@ceph-08 ~]# ceph daemon mds.ceph-08 dump tree '~mds0' 0
> root inode is not in cache
> [root@ceph-08 ~]# ceph daemon mds.ceph-08 dump tree '~mdsdir' 0
> root inode is not in cache
>
> [root@ceph-08 ~]# ceph daemon mds.ceph-08 get subtrees | grep path
> "path": "",
> "path": "~mds0",
>
> Any idea what I can do?

This was fixed recently: https://github.com/ceph/ceph/pull/44313

> Thanks!
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Frank Schilder
> Sent: 16 January 2022 14:16:16
> To: 胡 玮文; Dan van der Ster
> Cc: ceph-users
> Subject: Re: [Warning Possible spam] Re: cephfs: [ERR] loaded dup inode
>
> That looks great! I think we suffer from the same issue. I will try it out. I assume running the script on a read-only mount will be enough?
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: 胡 玮文 <huww98@xxxxxxxxxxx>
> Sent: 14 January 2022 17:22:35
> To: Frank Schilder; Dan van der Ster
> Cc: ceph-users
> Subject: [Warning Possible spam] Re: cephfs: [ERR] loaded dup inode
>
> Hi Frank,
>
> I just studied the exact same issue, where conda generates a lot of strays, and I created a Python script [1] to trigger reintegration efficiently.
>
> The script uses the cephfs python binding and does not rely on the kernel ceph client, so it should also bypass your sssd.
> It works by reading the "stray_prior_path" extracted from the MDS and guessing all possible paths that the file might be linked to, following the logic of conda.
>
> If you still want to use the shell, I have tested that `find /path/to/conda -printf '%n\n'` is enough to trigger the reintegration. But that is still too slow for us.
>
> Feel free to contact me for more info.
>
> Weiwen Hu
>
> [1]: https://gist.github.com/huww98/91cbff0782ad4f6673dcffccce731c05
>
> From: Frank Schilder<mailto:frans@xxxxxx>
> Sent: 14 January 2022 20:04
> To: Dan van der Ster<mailto:dan@xxxxxxxxxxxxxx>
> Cc: ceph-users<mailto:ceph-users@xxxxxxx>
> Subject: Re: cephfs: [ERR] loaded dup inode
>
> Hi Dan,
>
> thanks a lot! I will try this. We have lots of users using lots of hard-links (for example, python anaconda packages create thousands of them).
>
> Is there a command that forces "reintegration" without having to stat the file? "ls -lR" will stat the files, and this is very slow as we use sssd with AD for user IDs. What operation is required to trigger a re-integration? I could probably run a find with suitable arguments.
>
> Thanks a lot for any hints.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Dan van der Ster <dan@xxxxxxxxxxxxxx>
> Sent: 14 January 2022 12:30:51
> To: Frank Schilder
> Cc: ceph-users
> Subject: Re: Re: cephfs: [ERR] loaded dup inode
>
> Hi Frank,
>
> We had this long ago, related to a user generating lots of hard links.
> Snapshots will have a similar effect.
> (In these cases, if a user deletes the original file, the file goes
> into stray until it is "reintegrated".)
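
For anyone who wants to script that reintegration trick without a kernel mount: below is a minimal, untested sketch of the idea Weiwen describes above (not his actual script, see the gist) using the cephfs python binding. A plain stat/lookup of a surviving hard link is what lets the MDS reintegrate the stray, which is also why `ls -lR` and `find -printf '%n'` work. The config path and the candidate path list are placeholders you would fill in from your own "stray_prior_path" guesses.

#!/usr/bin/env python3
# Untested sketch: stat candidate paths through libcephfs so the MDS looks up
# the surviving hard links and can reintegrate the corresponding strays.
# Requires the python3-cephfs bindings. CONF and CANDIDATES are placeholders.

import cephfs

CONF = '/etc/ceph/ceph.conf'            # assumption: default config location
CANDIDATES = [                          # paths you guess still link the inodes
    b'/home/someuser/miniconda3/pkgs/python-dateutil-2.8.0/INSTALLER',
]

fs = cephfs.LibCephFS(conffile=CONF)
fs.mount()
try:
    for path in CANDIDATES:
        try:
            fs.stat(path)               # the path lookup triggers reintegration
        except cephfs.Error:
            pass                        # wrong guess, just move on
finally:
    fs.unmount()
    fs.shutdown()
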
>
> If you can find the dir where they're working, `ls -lR` will force
> those to reintegrate (you will see because the num strays will drop
> back down).
> You might have to ls -lR in a snap directory, or in the current tree
> -- you have to browse around and experiment.
>
> pacific does this re-integration automatically.
>
> -- dan
>
> On Fri, Jan 14, 2022 at 12:24 PM Frank Schilder <frans@xxxxxx> wrote:
> >
> > Hi Venky,
> >
> > thanks for your reply. I think the first type of message was a race condition: a user was running rm and find on the same folder at the same time. The second type of message (duplicate inode in stray) might point to a somewhat more severe issue. For a while now I have observed that ".mds_cache.num_strays" is really large and, on average, constantly increasing:
> >
> > # ssh ceph-08 'ceph daemon mds.$(hostname -s) perf dump | jq .mds_cache.num_strays'
> > 1081531
> >
> > This is by no means justified by people deleting files. Our snapshots rotate completely every 3 days and the stray buckets should get purged regularly. I have 2 questions:
> >
> > 1) Would a "cephfs-data-scan scan_links" detect and potentially resolve this problem (orphaned inodes in the stray buckets)?
> > 2) For a file system of our size, how long would a "cephfs-data-scan scan_links" run approximately (I need to estimate downtime)? I think I can execute up to 35-40 workers. The fs size is:
> >
> > ceph.dir.rbytes="2078289930815425"
> > ceph.dir.rentries="278320382"
> >
> > Thanks for your help!
> >
> > Best regards,
> > =================
> > Frank Schilder
> > AIT Risø Campus
> > Bygning 109, rum S14
> >
> > ________________________________________
> > From: Venky Shankar <vshankar@xxxxxxxxxx>
> > Sent: 12 January 2022 12:24
> > To: Frank Schilder
> > Cc: ceph-users
> > Subject: Re: cephfs: [ERR] loaded dup inode
> >
> > On Tue, Jan 11, 2022 at 6:07 PM Frank Schilder <frans@xxxxxx> wrote:
> > >
> > > Hi all,
> > >
> > > I found a bunch of error messages like the ones below in our ceph log (2 different types). How bad is this, and should I do something?
> > >
> > > Ceph version is 13.2.10 (564bdc4ae87418a232fc901524470e1a0f76d641) mimic (stable).
> > >
> > > 2022-01-11 11:49:47.687010 [ERR] loaded dup inode 0x10011bac31c [4f8,head] v1046724308 at ~mds0/stray1/10011bac31c, but inode 0x10011bac31c.head v1046760378 already exists at [...]/miniconda3/envs/ffpy_gwa3/lib/python3.6/site-packages/python_dateutil-2.8.0.dist-info/INSTALLER
> > >
> > > 2022-01-11 11:49:47.682346 [ERR] loaded dup inode 0x10011bac7fc [4f8,head] v1046725418 at ~mds0/stray1/10011bac7fc, but inode 0x10011bac7fc.head v1046760674 already exists at ~mds0/stray2/10011bac7fc
> >
> > I've seen this earlier. Not sure how we end up with an inode in two
> > stray directories, but it doesn't look serious.
> >
> > You could try stopping all MDSs and run `cephfs-data-scan scan_links`
> > (courtesy Zheng) to see if the errors go away.
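
If you want to watch the effect while the `ls -lR` or find pass runs, here is a rough, untested helper that simply loops over Frank's perf-dump one-liner from above and prints mds_cache.num_strays. It assumes passwordless ssh to the MDS hosts and an admin socket named after the short hostname; MDS_HOSTS is a placeholder.

#!/usr/bin/env python3
# Untested sketch: poll mds_cache.num_strays (the same counter as the jq
# one-liner above) so you can watch it fall while reintegration runs.

import json
import subprocess
import time

MDS_HOSTS = ['ceph-08']      # placeholder: hosts running active MDS daemons

def num_strays(host):
    # Same command Frank used, run over ssh; the admin socket name is assumed
    # to be mds.<short hostname>.
    out = subprocess.check_output(
        ['ssh', host, 'ceph daemon mds.$(hostname -s) perf dump'])
    return json.loads(out)['mds_cache']['num_strays']

while True:
    print('  '.join('%s: %d' % (h, num_strays(h)) for h in MDS_HOSTS))
    time.sleep(30)
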
> > >
> > > Best regards,
> > > =================
> > > Frank Schilder
> > > AIT Risø Campus
> > > Bygning 109, rum S14
> >
> > --
> > Cheers,
> > Venky

--
Cheers,
Venky
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx