Re: LARGE_OMAP_OBJECTS: any proper action possible?

Can I steal this thread for just one question 😊?

I have 11 large omap objects in my Octopus cluster, all related to the data log, like this:

/var/log/ceph/ceph.log-20210822.gz:2021-08-21T09:06:20.605200+0700 osd.11 (osd.11) 1876 : cluster [WRN] Large omap object found. Object: 22:b040fc05:::data_log.31:head PG: 22.a03f020d (22.d) Key count: 436895 Size (bytes): 91129945

As far as I understand, the data log is used for multisite replication. I do use replication, but not for the buckets that show up when I list the omap values.

What else can be in the data log? Is it safe to delete, or what is the safest way to clean it up? Auto-trimming didn't do much, and neither did OSD/PG scrub or deep scrub.
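
For reference, this is roughly what I have been checking so far; just a sketch, and the exact option names may differ between releases, so please double-check against radosgw-admin help on your version:

  radosgw-admin sync status
  radosgw-admin datalog status
  radosgw-admin datalog list --max-entries=10

My assumption is that if sync status shows everything caught up, trimming the shards (rather than deleting the RADOS objects) would be the safe way, but that is exactly what I would like confirmed.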

Thank you.


-----Original Message-----
From: Frank Schilder [mailto:frans@xxxxxx] 
Sent: Tuesday, August 31, 2021 9:27 PM
To: Dan van der Ster <dan@xxxxxxxxxxxxxx>
Cc: Patrick Donnelly <pdonnell@xxxxxxxxxx>; ceph-users <ceph-users@xxxxxxx>
Subject:  Re: LARGE_OMAP_OBJECTS: any proper action possible?


Hi Dan,

unfortunately, the file/directory names were generated the way one would for temporary files, so they give no clue about their location. I would need to catch such a file while it still exists. Of course, I could run a find on the snapshot ...

Just kidding. The large omap count is going down already; the first 4 have probably been purged from the snapshots.

Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Dan van der Ster <dan@xxxxxxxxxxxxxx>
Sent: 31 August 2021 15:44:41
To: Frank Schilder
Cc: Patrick Donnelly; ceph-users
Subject: Re:  LARGE_OMAP_OBJECTS: any proper action possible?

Hi,

I don't know how to find a full path from a dir object.
But perhaps you can make an educated guess based on what you see in:

rados listomapkeys --pool=con-fs2-meta1 1000eec35f5.01000000 | head -n 100

Those should be the directory entries. (s/_head//)
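
For example, something along these lines to strip the suffix (just a sketch; it assumes the keys end in _head, which holds for the head entries of the directory fragment):

  rados listomapkeys --pool=con-fs2-meta1 1000eec35f5.01000000 | sed 's/_head$//' | head -n 100

The resulting names are the file/subdirectory entries of that fragment, which may be enough to guess whose data it is.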

-- Dan

On Tue, Aug 31, 2021 at 2:31 PM Frank Schilder <frans@xxxxxx> wrote:
>
> Dear Dan and Patrick,
>
> The find didn't return anything. With this and the info below, am I right to assume that these were temporary working directories that got caught in a snapshot (we use rolling snapshots)?
>
> I would really appreciate any ideas on how to find out the original file system path of these large directories. I would like to advise the user(s) that we have a special high-performance file system for temporary data.
>
> I can't find any indication of performance problems with the metadata pool. After re-deploying the OSDs and quadrupling the OSD count, the metadata pool seems to perform very well; the find ran over a 1.3 PB file system in under 18 hours.
>
> However, running this find on the root got me caught in another problem:
> https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/HKEBXXRMX5WA5Y6JFM34WFPMWTCMPFCG/#EMHNSHZIPFZZ5QYS6B4VW3LUGL6HDTOP
>
> Apparently, the metadata performance is now so high that a single client can crash an MDS daemon and even take down the MDS cluster with it.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Frank Schilder
> Sent: 30 August 2021 16:18:02
> To: ceph-users
> Cc: Dan van der Ster; Patrick Donnelly
> Subject: Re:  LARGE_OMAP_OBJECTS: any proper action possible?
>
> Dear Dan and Patrick,
>
> I have the suspicion that I'm looking at large directories in the snapshots that no longer exist on the file system. Hence, the omap objects are not fragmented as explained in the tracker issue. Here is the info you asked me to pull out:
>
> > find /cephfs -type d -inum 1099738108263
>
> The find hasn't returned yet. It would be great to find out which user is doing this. Unfortunately, I don't believe the directory still exists.
>
> > rados -p cephfs_metadata listomapkeys 1000d7fd167.02800000
>
> I did this on a different object:
>
> # rados listomapkeys --pool=con-fs2-meta1 1000eec35f5.01000000 | wc -l
> 216000
>
> This matches the log message. I guess these keys are file/dir names? Then yes, it's a huge directory.
>
> > Please try the resolutions suggested in: 
> > https://tracker.ceph.com/issues/45333
>
> If I understand correctly, the INODE.00000000 objects contain the path information:
>
> [root@gnosis ~]# rados listxattr --pool=con-fs2-meta1 1000eec35f5.01000000
> [root@gnosis ~]# rados listxattr --pool=con-fs2-meta1 1000eec35f5.00000000
> layout
> parent
>
> Decoding the meta info in the parent attribute gives:
>
> [root@gnosis ~]# rados getxattr --pool=con-fs2-meta1 1000eec35f5.00000000 parent | ceph-dencoder type inode_backtrace_t import - decode dump_json
> {
>     "ino": 1099761989109,
>     "ancestors": [
>         {
>             "dirino": 1552,
>             "dname": "1000eec35f5",
>             "version": 882614706
>         },
>         {
>             "dirino": 257,
>             "dname": "stray6",
>             "version": 563853824
>         }
>     ],
>     "pool": 12,
>     "old_pools": []
> }
>
> This smells a lot like a deleted directory in a snapshot, moved to one of the stray directories. The result is essentially the same for all large omap objects except for the stray number. Is it possible to figure out the original location in the file system path?
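>
> One thing I could still try, although it is expensive: since we keep rolling snapshots, the deleted directory should still be reachable underneath the corresponding .snap directory, so a find by inode number over one snapshot might recover the path. A rough sketch only (assuming the snapshots are taken at the mount point /cephfs and <snapname> is a placeholder for one of the rolling snapshots; the inode number is the decimal "ino" from the backtrace above):
>
> find /cephfs/.snap/<snapname> -type d -inum 1099761989109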
>
> I guess I have to increase the warning threshold or live with the warning message, neither of which is preferred. It would be great if you could help me find the original path so I can identify the user and advise him/her on how to organise his/her files.
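>
> In case raising the threshold turns out to be the only option: my understanding is that the relevant settings are osd_deep_scrub_large_omap_object_key_threshold and osd_deep_scrub_large_omap_object_value_sum_threshold, so something like the following should raise the key-count limit (300000 is only an example value, not a recommendation):
>
> ceph config set osd osd_deep_scrub_large_omap_object_key_threshold 300000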
>
> Thanks and best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Patrick Donnelly <pdonnell@xxxxxxxxxx>
> Sent: 27 August 2021 19:16:16
> To: Frank Schilder
> Cc: ceph-users
> Subject: Re:  LARGE_OMAP_OBJECTS: any proper action possible?
>
> Hi Frank,
>
> On Wed, Aug 25, 2021 at 6:27 AM Frank Schilder <frans@xxxxxx> wrote:
> >
> > Hi all,
> >
> > I have the notorious "LARGE_OMAP_OBJECTS: 4 large omap objects" warning and am again wondering if there is any proper action one can take except "wait it out and deep-scrub (numerous ceph-users threads)" or "ignore (https://docs.ceph.com/en/latest/rados/operations/health-checks/#large-omap-objects)". Only for RGWs is a proper action described, but mine come from MDSes. Is there any way to ask an MDS to clean up or split the objects?
> >
> > The disks with the meta-data pool can easily deal with objects of this size. My question is more along the lines of: if I can't do anything anyway, why the warning? If there is a warning, I would assume that one can do something proper to prevent large omap objects from being created by an MDS. What is it?
>
> Please try the resolutions suggested in: 
> https://tracker.ceph.com/issues/45333
>
> --
> Patrick Donnelly, Ph.D.
> He / Him / His
> Principal Software Engineer
> Red Hat Sunnyvale, CA
> GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



