Re: Pool Latency

I don't think it's a pool-level config; in my cluster it's set as an
osdmap-level flag. The pool I tested in the higher-latency cluster that shows
much lower latency has 18 PGs, while the higher-latency pool has 8212 PGs.
The higher-latency cluster has this flag set; the lower-latency one doesn't.
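
For reference, this is roughly how the flag and the per-pool PG counts can be
checked from the command line. A minimal sketch only: it assumes the ceph CLI
and an admin keyring are available on the node, and that the JSON field names
("flags", "pools", "pool_name", "pg_num") match what this Ceph version emits.

#!/usr/bin/env python3
# Sketch: check whether pglog_hardlimit is set at the osdmap level,
# and list pg_num per pool so the 18-PG and 8212-PG pools can be compared.
import json
import subprocess

def ceph_json(*args):
    """Run a ceph CLI subcommand and parse its JSON output."""
    out = subprocess.check_output(["ceph", *args, "--format", "json"])
    return json.loads(out)

osd_dump = ceph_json("osd", "dump")

# "flags" is a comma-separated string in the osd dump output.
flags = osd_dump.get("flags", "")
print("pglog_hardlimit set:", "pglog_hardlimit" in flags)

for pool in osd_dump.get("pools", []):
    print(pool["pool_name"], "pg_num:", pool["pg_num"])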

On Sun, Jul 18, 2021 at 11:57 PM Brett Niver <bniver@xxxxxxxxxx> wrote:

> Seena,
>
> Which pool has the hardlimit flag set, the lower latency one, or the
> higher?
> Brett
>
>
> On Sun, Jul 18, 2021 at 12:17 PM Seena Fallah <seenafallah@xxxxxxxxx>
> wrote:
>
>> I've checked my logs and see there is pg log trimming on each op, and it's
>> in aggressive mode. I checked the osdmap flags and there is a
>> pglog_hardlimit flag set, which the other cluster doesn't have.
>> Should I tune any config related to this flag in v12.2.13?
>> I've also seen this PR (https://github.com/ceph/ceph/pull/20394), which is
>> not backported to Luminous. Could this help?
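
To see how long the pg logs actually are and which trim settings are in
effect, something like the sketch below can help. The JSON field names
("pg_stats", "log_size", "ondisk_log_size") are what I'd expect from
`ceph pg dump` on Luminous and may need adjusting; the `ceph daemon` calls
have to run on the node hosting that OSD.

#!/usr/bin/env python3
# Sketch: inspect pg log sizes and the pg-log trim options on one OSD.
import json
import subprocess

pg_dump = json.loads(subprocess.check_output(
    ["ceph", "pg", "dump", "--format", "json"]))

# pg_stats sits at the top level on Luminous; later releases nest it
# under "pg_map", so handle both.
pg_stats = pg_dump.get("pg_stats") or pg_dump.get("pg_map", {}).get("pg_stats", [])

for pg in pg_stats[:10]:
    print(pg["pgid"], "log_size:", pg.get("log_size"),
          "ondisk_log_size:", pg.get("ondisk_log_size"))

# Trim behaviour is driven by these OSD options (run on the OSD's host).
for opt in ("osd_min_pg_log_entries", "osd_max_pg_log_entries", "osd_pg_log_trim_min"):
    val = subprocess.check_output(["ceph", "daemon", "osd.0", "config", "get", opt])
    print(val.decode().strip())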
>>
>> On Sun, Jul 18, 2021 at 12:09 AM Seena Fallah <seenafallah@xxxxxxxxx>
>> wrote:
>>
>> > I've trimmed the pg log on all OSDs and, whoops (!), latency dropped from
>> > 100ms to 20ms! But based on the other cluster I think it should come down
>> > to around 7ms.
>> > Is there anything related to the pg log, or anything else, that can help
>> > me continue debugging?
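
For readers following along: one way to shrink pg logs cluster-wide on
Luminous is to lower the pg-log length limits on every OSD so trimming
happens on subsequent writes. This is a sketch with illustrative values, not
a recommendation, and not necessarily how it was done here; a shorter pg log
trades log-based recovery for backfill after longer outages.

#!/usr/bin/env python3
# Sketch: lower pg-log length limits on all OSDs via injectargs.
# The value 500 is arbitrary and only for illustration.
import subprocess

ARGS = "--osd_min_pg_log_entries 500 --osd_max_pg_log_entries 500"

# injectargs changes the running value only; persist it in ceph.conf as well
# if it should survive OSD restarts.
subprocess.check_call(["ceph", "tell", "osd.*", "injectargs", ARGS])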
>> >
>> > On Thu, Jul 15, 2021 at 3:13 PM Seena Fallah <seenafallah@xxxxxxxxx>
>> > wrote:
>> >
>> >> Hi,
>> >>
>> >> I'm facing something strange in Ceph (v12.2.13, filestore). I have two
>> >> clusters with the same config (kernel, network, disks, ...). One of them
>> >> has 3ms latency and the other has 100ms latency. Physical disk write
>> >> latency on both is less than 1ms.
>> >> In the cluster with 100ms write latency, when I create another pool with
>> >> the same configs (crush rule, replica, ...) and test the latency, it
>> >> behaves like my other cluster. So it seems there is a problem in one of
>> >> my pools!
>> >> The pool has 8212 PGs and each PG is around 12GB with 844 objects. Also,
>> >> I have many removed_snaps in this pool and I don't know whether that
>> >> impacts performance.
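
A direct way to compare write latency between the slow pool and a freshly
created test pool is timed writes from python-rados. A minimal sketch: the
pool names are placeholders, and it assumes python-rados is installed and
/etc/ceph/ceph.conf plus an admin keyring are readable.

#!/usr/bin/env python3
# Sketch: median per-write latency against two pools, for comparison.
import time
import rados

def median_write_latency_ms(pool, iterations=100, size=4096):
    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()
    ioctx = cluster.open_ioctx(pool)
    payload = b"x" * size
    samples = []
    try:
        for i in range(iterations):
            start = time.monotonic()
            ioctx.write_full("latency-test-%d" % i, payload)
            samples.append((time.monotonic() - start) * 1000.0)
        # Clean up the test objects afterwards.
        for i in range(iterations):
            ioctx.remove_object("latency-test-%d" % i)
    finally:
        ioctx.close()
        cluster.shutdown()
    samples.sort()
    return samples[len(samples) // 2]

for pool in ("slow-pool", "fast-test-pool"):  # placeholder names
    print(pool, "median write latency: %.1f ms" % median_write_latency_ms(pool))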
>> >>
>> >> Do you have any idea what is wrong with my pool? Is there any way to
>> >> debug this problem?
>> >>
>> >> Thanks.
>> >>
>> >
>>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


