Thanks for pointing that out. I must have missed it when I searched
earlier. I look forward to upgrading when 14.2.8 comes out to see
whether that addresses the issue.
Seth
On 2/27/20 11:37 PM, Brad Hubbard wrote:
Check the thread titled "Frequest LARGE_OMAP_OBJECTS in
cephfs metadata pool" from a few days ago.
On Fri, Feb 28, 2020 at 9:03 AM Seth Galitzer <sgsax@xxxxxxx> wrote:
I do not have a large Ceph cluster: only 4 nodes plus a mon/mgr, with 48
OSDs. I have one data pool and one metadata pool with a total of about
140TB of usable storage. I have maybe 30 or so clients. The rest of my
systems connect via a host that is a Ceph client and then reshares
through Samba and nfs-ganesha. I'm not using rgw anywhere. I'm running
the latest stable release of Nautilus (14.2.7) and have had it in
production since August 2019. All Ceph nodes and the smb/nfs host are
running CentOS 7 with the latest patches. Other clients are a mix of
Debian and Ubuntu.
For the last several weeks, I have been getting the warning "Large omap
object found" off and on. I've been resolving it by gradually increasing
the value of osd_deep_scrub_large_omap_object_key_threshold and then
running a deep scrub on the affected pg. I have now increased this
threshold to 1000000 and am wondering if I should keep doing this or if
there is another problem that needs to be addressed.
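For reference, the workaround described above could be sketched roughly as follows. The message does not say how the threshold was changed; this assumes the centralized config (`ceph config set`) available in Nautilus. The function name `rescrub_with_threshold` and its arguments are illustrative only.

```shell
# Sketch only: raise the large-omap key threshold, then deep-scrub one PG
# so the warning is re-evaluated on the next scrub of that PG.
# rescrub_with_threshold is an illustrative helper, not a Ceph command.
rescrub_with_threshold() {
    threshold="$1"   # new key-count threshold, e.g. 1000000
    pgid="$2"        # PG id from the warning text, e.g. 2.26
    ceph config set osd osd_deep_scrub_large_omap_object_key_threshold "$threshold" &&
    ceph pg deep-scrub "$pgid"
}
```

Example call: `rescrub_with_threshold 1000000 2.26`. This only silences the warning until the key count passes the new threshold; it does not address whatever is making the object grow.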
The affected pg has been different most times, but they are all on the
same osd and with the same mds object. Here's an excerpt from my current
set of logs to show what I'm seeing:
# zgrep -i "large omap object found" /var/log/ceph/ceph.log*
/var/log/ceph/ceph.log:2020-02-27 06:02:01.761641 osd.40 (osd.40) 1578 : cluster [WRN] Large omap object found. Object: 2:654134d2:::mds0_openfiles.0:head PG: 2.4b2c82a6 (2.26) Key count: 1048576 Size (bytes): 46403355
/var/log/ceph/ceph.log:2020-02-27 16:18:00.328869 osd.40 (osd.40) 1585 : cluster [WRN] Large omap object found. Object: 2:654134d2:::mds0_openfiles.0:head PG: 2.4b2c82a6 (2.26) Key count: 1048559 Size (bytes): 46407183
/var/log/ceph/ceph.log-20200227.gz:2020-02-26 19:56:24.972431 osd.40 (osd.40) 1450 : cluster [WRN] Large omap object found. Object: 2:c9647462:::mds0_openfiles.1:head PG: 2.462e2693 (2.13) Key count: 939236 Size (bytes): 40179994
/var/log/ceph/ceph.log-20200227.gz:2020-02-26 21:14:16.497161 osd.40 (osd.40) 1460 : cluster [WRN] Large omap object found. Object: 2:c9647462:::mds0_openfiles.1:head PG: 2.462e2693 (2.13) Key count: 939232 Size (bytes): 40179796
/var/log/ceph/ceph.log-20200227.gz:2020-02-26 21:15:06.399267 osd.40 (osd.40) 1464 : cluster [WRN] Large omap object found. Object: 2:c9647462:::mds0_openfiles.1:head PG: 2.462e2693 (2.13) Key count: 939231 Size (bytes): 40179756
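To make the trend easier to read, the warnings can be reduced to timestamp, object, and key count. This is a minimal sketch that assumes each warning is a single line in the log file, in the format shown above; `summarize_large_omap` is a hypothetical helper name.

```shell
# Sketch: read ceph.log-style lines on stdin and print
# "<date> <time> <object> <key count>" for each large-omap warning.
summarize_large_omap() {
    awk '/Large omap object found/ {
        obj = ""; keys = ""
        for (i = 1; i <= NF; i++) {
            # The object name is the field right after "Object:".
            if ($i == "Object:") obj = $(i + 1)
            # The key count is the field right after "Key count:".
            if ($i == "Key" && $(i + 1) == "count:") keys = $(i + 2)
        }
        print $1, $2, obj, keys
    }'
}
```

It could be fed with `zgrep -h -i "large omap object found" /var/log/ceph/ceph.log* | summarize_large_omap`; the `-h` suppresses the file-name prefix so the first two fields are the date and time.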
Unfortunately, older logs have already been rotated out, but if memory
serves correctly, they had similar messages. As you can see, the key
count continues to increase. Last week, I bumped the threshold to 750000
to clear the warning. Before that, I had bumped to 500000. It looks to
me like something isn't getting cleaned up like it's supposed to. I
haven't been using ceph long enough to figure out what that might be.
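One way to check whether the key count is still growing, without waiting for the next deep scrub, might be to count the omap keys on the object directly with `rados listomapkeys`. A rough sketch follows; the pool name is an assumption (the metadata pool's name is not given in this thread), and `count_omap_keys` is an illustrative helper name.

```shell
# Sketch: print the current number of omap keys on one object.
# count_omap_keys is an illustrative helper, not a Ceph command.
count_omap_keys() {
    pool="$1"     # metadata pool name (placeholder; substitute your own)
    object="$2"   # object name from the warning, e.g. mds0_openfiles.0
    # listomapkeys prints one key per line; awk counts the lines.
    rados -p "$pool" listomapkeys "$object" | awk 'END { print NR }'
}
```

For example, `count_omap_keys cephfs_metadata mds0_openfiles.0`, where `cephfs_metadata` is only a placeholder pool name. Running it a few times over a day would show whether the count keeps climbing.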
Do I continue to bump the key threshold and not worry about the
warnings, or is there something going on that needs to be corrected? At
what point is the threshold too high? If the problem is due to a
specific client not closing files, is it possible to identify that
client and attempt to reset it?
Any advice is welcome. I'm happy to provide additional data if needed.
Thanks.
Seth
--
Seth Galitzer
Systems Coordinator
Computer Science Department
Kansas State University
http://www.cs.ksu.edu/~sgsax
sgsax@xxxxxxx
785-532-7790
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx