I've checked my logs and see that PG log trimming is happening on every
op, and it's in aggressive mode. I checked the osdmap flags and the
pglog_hardlimit flag is set, but the other cluster doesn't have it.
Should I tune any config related to this flag in v12.2.13? I've seen
this PR (https://github.com/ceph/ceph/pull/20394), which hasn't been
backported to Luminous. Could this help?

On Sun, Jul 18, 2021 at 12:09 AM Seena Fallah <seenafallah@xxxxxxxxx> wrote:

> I've trimmed the PG log on all OSDs and, whoops (!), latency dropped
> from 100ms to 20ms! But based on the other cluster I think it should
> come down to around 7ms. Is there anything related to the PG log, or
> anything else, that could help me continue debugging?
>
> On Thu, Jul 15, 2021 at 3:13 PM Seena Fallah <seenafallah@xxxxxxxxx>
> wrote:
>
>> Hi,
>>
>> I'm facing something strange in Ceph (v12.2.13, FileStore). I have two
>> clusters with the same config (kernel, network, disks, ...). One of
>> them has 3ms latency, the other has 100ms latency. On both, physical
>> disk write latency is less than 1ms.
>> In the cluster with 100ms write latency, when I create another pool
>> with the same configs (crush rule, replica, ...) and test the latency,
>> it behaves like my other cluster. So it seems there is a problem in
>> one of my pools!
>> The pool has 8212 PGs and each PG is around 12GB with 844 objects.
>> Also, I have many removed_snaps in this pool and I don't know whether
>> that impacts performance or not.
>>
>> Do you have any idea what is wrong with my pool? Is there any way to
>> debug this problem?
>>
>> Thanks.
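
For reference, a minimal sketch of the checks described above (osdmap
flag, pg_log trim settings, per-PG log sizes). The osd.0 id is only an
example, and the last command assumes jq is installed; the JSON layout
of ceph pg dump differs between releases, so the filter may need
adjusting for your version:

# Is the pglog_hardlimit flag set on the osdmap?
ceph osd dump | grep pglog_hardlimit

# Current pg_log trimming settings on an OSD (run on the OSD host;
# osd.0 is just an example id):
ceph daemon osd.0 config show | grep -E 'pg_log'

# PGs with the largest logs (pgid and log_size, top 20):
ceph pg dump --format json 2>/dev/null \
  | jq -r '(.pg_stats // .pg_map.pg_stats)[] | "\(.pgid) \(.log_size)"' \
  | sort -k2 -rn | head -20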