Is there any way I can unset pglog_hardlimit in the osdmap? I've seen the
release note saying this flag cannot be unset, but I don't understand why:
as far as I can tell, the only difference when the flag is set is that PG
logs are trimmed in aggressive mode, so I don't see what could break if I
unset it. The way I want to unset it is to decompile the osdmap, remove the
flag, recompile it, and inject it back into Ceph.

On Mon, Jul 19, 2021 at 12:04 AM Seena Fallah <seenafallah@xxxxxxxxx> wrote:

> I don't think it's a pool-based config; in my cluster it's set in the
> osdmap-level flags. The pool I tested in the higher-latency cluster, which
> itself showed much lower latency, had 18 PGs, while the higher-latency
> pool has 8212 PGs. The higher-latency cluster has this flag set; the
> lower-latency one doesn't.
>
> On Sun, Jul 18, 2021 at 11:57 PM Brett Niver <bniver@xxxxxxxxxx> wrote:
>
>> Seena,
>>
>> Which pool has the hardlimit flag set, the lower latency one, or the
>> higher?
>> Brett
>>
>>
>> On Sun, Jul 18, 2021 at 12:17 PM Seena Fallah <seenafallah@xxxxxxxxx>
>> wrote:
>>
>>> I've checked my logs and see there is PG log trimming on each op, and
>>> it's in aggressive mode. I checked the osdmap flags and see the
>>> pglog_hardlimit flag is set there, but the other cluster doesn't have
>>> it. Should I tune any config related to this flag in v12.2.13?
>>> I've also seen this PR (https://github.com/ceph/ceph/pull/20394),
>>> which is not backported to Luminous. Could it help?
>>>
>>> On Sun, Jul 18, 2021 at 12:09 AM Seena Fallah <seenafallah@xxxxxxxxx>
>>> wrote:
>>>
>>> > I've trimmed the PG log on all OSDs and, whoops(!), latency dropped
>>> > from 100ms to 20ms! But based on the other cluster I think it should
>>> > drop to around 7ms. Is there anything related to the PG log, or
>>> > anything else, that could help me continue debugging?
>>> >
>>> > On Thu, Jul 15, 2021 at 3:13 PM Seena Fallah <seenafallah@xxxxxxxxx>
>>> > wrote:
>>> >
>>> >> Hi,
>>> >>
>>> >> I'm facing something strange in Ceph (v12.2.13, FileStore). I have
>>> >> two clusters with the same config (kernel, network, disks, ...).
>>> >> One of them has 3ms write latency, the other has 100ms. Physical
>>> >> disk write latency is under 1ms on both.
>>> >> In the cluster with 100ms write latency, when I create another pool
>>> >> with the same config (crush rule, replica count, ...) and test the
>>> >> latency, it behaves like my other cluster. So it seems the problem
>>> >> is in one of my pools!
>>> >> The pool has 8212 PGs, and each PG is around 12GB with 844 objects.
>>> >> Also, I have many removed_snaps in this pool and I don't know
>>> >> whether that impacts performance.
>>> >>
>>> >> Do you have any idea what is wrong with my pool? Is there any way
>>> >> to debug this problem?
>>> >>
>>> >> Thanks.
>>> >>
>>> >
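
For reference, this is how I'm inspecting the flag, both live and from a
saved copy of the map (/tmp/osdmap.bin is just an example path):

    ceph osd dump | grep flags          # live osdmap flags, e.g.
                                        # "flags sortbitwise,...,pglog_hardlimit"
    ceph osd getmap -o /tmp/osdmap.bin  # save the current osdmap to a file
    osdmaptool --print /tmp/osdmap.bin  # offline dump; the header includes
                                        # the same flags line

One caveat with my decompile/recompile idea: unlike the CRUSH map
(crushtool plus "ceph osd setcrushmap"), I'm not aware of any supported
tool that edits a full osdmap and injects it back into the cluster, so
this may not even be possible with stock tooling.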
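
On the tuning side, these are the PG-log knobs I'd look at on v12.2.13.
This is only a sketch: osd.0 and the value 3000 are examples, and as far
as I know the Luminous defaults are osd_min_pg_log_entries=1500 and
osd_max_pg_log_entries=10000:

    # inspect the current values via the admin socket of one OSD
    ceph daemon osd.0 config get osd_min_pg_log_entries
    ceph daemon osd.0 config get osd_max_pg_log_entries

    # inject a lower cap at runtime; Luminous has no centralized config
    # database, so it's injectargs now plus ceph.conf for persistence
    ceph tell osd.* injectargs '--osd_max_pg_log_entries 3000'

If a lower cap helps, the same value would go under [osd] in ceph.conf so
it survives restarts.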
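
The offline trim I ran was roughly the following, one PG at a time with
the OSD stopped. The OSD id 12, the pgid 7.1a, and the paths are
placeholders, and --op trim-pg-log may not exist in every Luminous build
of ceph-objectstore-tool, so check your version first:

    systemctl stop ceph-osd@12
    # --journal-path is passed here because these are FileStore OSDs
    ceph-objectstore-tool \
        --data-path /var/lib/ceph/osd/ceph-12 \
        --journal-path /var/lib/ceph/osd/ceph-12/journal \
        --pgid 7.1a --op trim-pg-log
    systemctl start ceph-osd@12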
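
As for the removed_snaps suspicion: the interval set is printed on each
pool's line, so its size is easy to eyeball ("mypool" is a placeholder
name). My understanding is that a very long interval set grows the osdmap
itself, which makes map handling more expensive on every OSD:

    ceph osd pool ls detail           # each pool line ends with its
                                      # removed_snaps intervals
    ceph osd dump | grep "'mypool'"   # the same line for a single pool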