Dear Dan and Patrick,

the find didn't return anything. With this and the info below, am I right to assume that these were temporary working directories that got caught in a snapshot (we use rolling snapshots)?

I would really appreciate any ideas on how to find out the original file system path of these large directories. I would like to advise the user(s) that we have a special high-performance file system for temporary data.

I can't find any indication of performance problems with the meta-data pool. After re-deploying the OSDs with four times the OSD count, the meta-data pool seems to perform very well. The find ran over a 1.3PB file system in under 18 hours.

However, running this find on the root got me caught in another problem: https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/HKEBXXRMX5WA5Y6JFM34WFPMWTCMPFCG/#EMHNSHZIPFZZ5QYS6B4VW3LUGL6HDTOP

Apparently, the meta-data performance is now so high that a single client can crash an MDS daemon and even take the MDS cluster with it.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Frank Schilder
Sent: 30 August 2021 16:18:02
To: ceph-users
Cc: Dan van der Ster; Patrick Donnelly
Subject: Re: LARGE_OMAP_OBJECTS: any proper action possible?

Dear Dan and Patrick,

I suspect that I'm looking at large directories in the snapshots that no longer exist on the file system. Hence, the omap objects are not fragmented as explained in the tracker issue. Here is the info you asked me to pull out:

> find /cephfs -type d -inum 1099738108263

The find hasn't returned yet. Would be great to find which user is doing that. Unfortunately, I don't believe the directory still exists.

> rados -p cephfs_metadata listomapkeys 1000d7fd167.02800000

I did this on a different object:

    # rados listomapkeys --pool=con-fs2-meta1 1000eec35f5.01000000 | wc -l
    216000

This matches the log message.
I guess these keys are file/dir names? Then yes, it's a huge directory.

> Please try the resolutions suggested in: https://tracker.ceph.com/issues/45333

If I understand correctly, the INODE.00000000 objects contain the path information:

    [root@gnosis ~]# rados listxattr --pool=con-fs2-meta1 1000eec35f5.01000000
    [root@gnosis ~]# rados listxattr --pool=con-fs2-meta1 1000eec35f5.00000000
    layout
    parent

Decoding the meta info in the parent attribute gives:

    [root@gnosis ~]# rados getxattr --pool=con-fs2-meta1 1000eec35f5.00000000 parent | ceph-dencoder type inode_backtrace_t import - decode dump_json
    {
        "ino": 1099761989109,
        "ancestors": [
            {
                "dirino": 1552,
                "dname": "1000eec35f5",
                "version": 882614706
            },
            {
                "dirino": 257,
                "dname": "stray6",
                "version": 563853824
            }
        ],
        "pool": 12,
        "old_pools": []
    }

This smells a lot like a deleted directory in a snapshot that was moved to one of the stray object buckets. The result is essentially the same for all large omap objects except for the stray number. Is it possible to figure out the original location in the file system path?

I guess I have to increase the warning threshold or live with the warning message, neither of which I would prefer. It would be great if you could help me find the original path so I can identify the user and advise him/her on how to organise his/her files.

Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Patrick Donnelly <pdonnell@xxxxxxxxxx>
Sent: 27 August 2021 19:16:16
To: Frank Schilder
Cc: ceph-users
Subject: Re: LARGE_OMAP_OBJECTS: any proper action possible?

Hi Frank,

On Wed, Aug 25, 2021 at 6:27 AM Frank Schilder <frans@xxxxxx> wrote:
>
> Hi all,
>
> I have the notorious "LARGE_OMAP_OBJECTS: 4 large omap objects" warning and am again wondering if there is any proper action one can take except "wait it out and deep-scrub (numerous ceph-users threads)" or "ignore (https://docs.ceph.com/en/latest/rados/operations/health-checks/#large-omap-objects)". Only for RGWs is a proper action described, but mine come from MDSes. Is there any way to ask an MDS to clean up or split the objects?
>
> The disks with the meta-data pool can easily deal with objects of this size. My question is more along the lines: If I can't do anything anyway, why the warning? If there is a warning, I would assume that one can do something proper to prevent large omap objects from being born by an MDS. What is it?

Please try the resolutions suggested in: https://tracker.ceph.com/issues/45333

--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
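
Editor's note: the thread switches between decimal inode numbers (as passed to `find -inum` and reported in the `ino` field of the backtrace) and the hex prefix of the metadata-pool object names. The numbers quoted above are consistent with a plain decimal/hex conversion: 1099738108263 is 0x1000d7fd167, and 1099761989109 is 0x1000eec35f5. The following is a minimal sketch of that conversion, not an official Ceph tool. The function names `ino_to_obj` and `obj_to_ino` are hypothetical, and the use of the `.00000000` object as the one carrying the `parent` attribute follows Frank's reading above rather than any documented guarantee.

```shell
#!/usr/bin/env bash
# Sketch: convert between CephFS inode numbers (decimal, as used by
# `find -inum`) and the hex prefix of metadata-pool object names.
# Function names are hypothetical; they are not part of any Ceph CLI.

# Decimal inode number -> object name with the .00000000 suffix
# (assumed here, per the thread, to be the object holding the backtrace).
ino_to_obj() {
    printf '%x.00000000\n' "$1"
}

# Object name (with any ".<suffix>") -> decimal inode number.
obj_to_ino() {
    local hex="${1%%.*}"        # drop everything from the first dot
    printf '%d\n' "0x${hex}"
}

ino_to_obj 1099738108263            # inode from the find command above
obj_to_ino 1000eec35f5.01000000     # large omap object from the thread
```

The object name printed by `ino_to_obj` could then be passed to the `rados getxattr ... parent | ceph-dencoder ...` pipeline shown in the thread, with the pool name adjusted to the local cluster.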