Re: Migration Nautilus to Pacific : Very high latencies (EC profile)

In our case it appears that file deletes have a very high impact on OSD
operations. It was not an especially large delete either: roughly 20 TB on a
1 PB utilized filesystem (large files as well).

We are trying to tune down cephfs delayed deletes via:
    "mds_max_purge_ops": "512",
    "mds_max_purge_ops_per_pg": "0.100000",

with some success, but we are still experimenting with how to reduce the
throughput impact from OSD slow ops.
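
A minimal sketch of how these can be applied at runtime (standard "ceph
config" commands; the values are simply what we are experimenting with, not
a recommendation):

    ceph config set mds mds_max_purge_ops 512
    ceph config set mds mds_max_purge_ops_per_pg 0.1
    # confirm what the MDS daemons actually picked up
    ceph config get mds mds_max_purge_ops
    ceph config get mds mds_max_purge_ops_per_pg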

Respectfully,

*Wes Dillingham*
wes@xxxxxxxxxxxxxxxxx
LinkedIn <http://www.linkedin.com/in/wesleydillingham>


On Mon, May 16, 2022 at 9:49 AM Wesley Dillingham <wes@xxxxxxxxxxxxxxxxx>
wrote:

> We have a newly built Pacific (16.2.7) cluster running 8+3 EC (jerasure),
> ~250 OSDs across 21 hosts, which has significantly lower IOPS than expected:
> only about 30 IOPS per spinning disk (with an appropriately sized SSD
> bluestore DB) at around ~100 PGs per OSD. About 100 CephFS (ceph-fuse
> 16.2.7) clients use the cluster. The cluster regularly reports slow ops from
> the OSDs, but the vast majority (90%-plus) of the OSDs are less than 50%
> utilized. There is plenty of CPU/RAM/network headroom on all cluster nodes.
> We have looked for hardware (disk/bond/network/MCE) issues across the
> cluster with no findings, and checked send and receive queues across the
> cluster to try to narrow in on an individual failing component, but found
> nothing there either. Slow ops are also spread evenly across the servers in
> the cluster. Does your cluster report any health warnings (slow ops etc.)
> alongside your reduced performance?
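>
> For what it's worth, a rough sketch of the kind of checks that can be used
> to chase this down (the osd id and host are placeholders, adjust as needed):
>
>     # which OSDs the cluster is currently flagging
>     ceph health detail | grep -i slow
>     # recent long-running ops on a suspect OSD (run on its host)
>     ceph daemon osd.12 dump_historic_ops | less
>     # per-disk utilization on an OSD host
>     iostat -x 5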
>
> Respectfully,
>
> *Wes Dillingham*
> wes@xxxxxxxxxxxxxxxxx
> LinkedIn <http://www.linkedin.com/in/wesleydillingham>
>
>
> On Mon, May 16, 2022 at 2:00 AM Martin Verges <martin.verges@xxxxxxxx>
> wrote:
>
>> Hello,
>>
>> depending on your workload, drives, and OSD allocation size, 3+2 can be
>> way slower than 4+2. Maybe run a small benchmark and see whether you
>> observe a big difference. We ran some benchmarks like that and they showed
>> quite ugly results in some tests. In our findings, the best way to deploy
>> EC is with a power-of-two data chunk count, like 2+x, 4+x, 8+x, 16+x.
>> Especially if you deployed OSDs before the Ceph allocation-size change,
>> you can end up consuming considerably more space if you don't use a power
>> of two. With the 4k allocation size this has at least been greatly
>> improved for newly deployed OSDs.
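>>
>> As a hedged example (the profile and pool names here are placeholders,
>> not a recommendation), a 4+2 jerasure profile can be created and
>> inspected like this:
>>
>>     ceph osd erasure-code-profile set ec-4-2 k=4 m=2 \
>>         plugin=jerasure crush-failure-domain=host
>>     ceph osd erasure-code-profile get ec-4-2
>>     ceph osd pool create mypool-ec erasure ec-4-2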
>>
>> --
>> Martin Verges
>> Managing director
>>
>> Mobile: +49 174 9335695  | Chat: https://t.me/MartinVerges
>>
>> croit GmbH, Freseniusstr. 31h, 81247 Munich
>> CEO: Martin Verges - VAT-ID: DE310638492
>> Com. register: Amtsgericht Munich HRB 231263
>> Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
>>
>>
>> On Sun, 15 May 2022 at 20:30, stéphane chalansonnet <schalans@xxxxxxxxx>
>> wrote:
>>
>> > Hi,
>> >
>> > Thank you for your answer.
>> > This is not good news if you also notice a performance decrease on your
>> > side.
>> > No, as far as we know, you cannot downgrade to Octopus.
>> > Going forward seems to be the only way, so Quincy.
>> > We have a qualification cluster, so we can try it there (but it is a
>> > fully virtual configuration).
>> >
>> >
>> > We are using 4+2 and 3+2 profiles.
>> > Are you also using the same profiles on your cluster?
>> > Maybe replicated pools are not impacted?
>> >
>> > Currently, we are recreating the OSDs one by one;
>> > some parameters can only be set this way.
>> > The first storage node is almost rebuilt; we will see whether the
>> > latencies on it are lower than on the others ...
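>> >
>> > One example of such a create-time parameter (for illustration only, not
>> > necessarily the one meant above) is the BlueStore allocation size, which
>> > only applies to OSDs created after it is set:
>> >
>> >     ceph config set osd bluestore_min_alloc_size_hdd 4096
>> >     ceph config set osd bluestore_min_alloc_size_ssd 4096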
>> >
>> > Wait and see .....
>> >
>> > On Sun, 15 May 2022 at 10:16, Martin Verges <martin.verges@xxxxxxxx>
>> > wrote:
>> >
>> >> Hello,
>> >>
>> >> What exact EC profile do you use?
>> >>
>> >> I can confirm that our internal data shows a performance drop when
>> >> using Pacific. So far Octopus is faster and better than Pacific, but I
>> >> doubt you can roll back to it. We haven't rerun our benchmarks on
>> >> Quincy yet, but according to some presentations it should be faster
>> >> than Pacific. Maybe try jumping away from the Pacific release into the
>> >> unknown!
>> >>
>> >> --
>> >> Martin Verges
>> >> Managing director
>> >>
>> >> Mobile: +49 174 9335695  | Chat: https://t.me/MartinVerges
>> >>
>> >> croit GmbH, Freseniusstr. 31h, 81247 Munich
>> >> CEO: Martin Verges - VAT-ID: DE310638492
>> >> Com. register: Amtsgericht Munich HRB 231263
>> >> Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
>> >>
>> >>
>> >> On Sat, 14 May 2022 at 12:27, stéphane chalansonnet
>> >> <schalans@xxxxxxxxx> wrote:
>> >>
>> >>> Hello,
>> >>>
>> >>> After a successful upgrade from Nautilus to Pacific on CentOS 8.5, we
>> >>> observed high latencies on our cluster.
>> >>>
>> >>> We did not find much in the community about latencies after this
>> >>> migration.
>> >>>
>> >>> Our setup is:
>> >>> 6x storage nodes (256 GB RAM, 2 SSD OSDs + 5x 6 TB SATA HDDs)
>> >>> Erasure coding profile
>> >>> We have two EC pools:
>> >>> -> Pool1: full 6 TB SAS HDD drives
>> >>> -> Pool2: full SSD drives
>> >>>
>> >>> S3 object and RBD block workloads
>> >>>
>> >>> Our performance on Nautilus, before the upgrade, was acceptable.
>> >>> However, the next day, performance dropped by a factor of 3 or 4.
>> >>> A benchmark showed 15K IOPS on the flash pool; before the upgrade we
>> >>> had almost 80K IOPS.
>> >>> Also, the HDD pool is almost unusable (too much latency).
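>> >>>
>> >>> As a rough sketch, such a benchmark can be run with rados bench (the
>> >>> pool name, block size and thread count here are placeholders):
>> >>>
>> >>>     # 4 KiB write benchmark, keeping the objects for the read test
>> >>>     rados bench -p pool2-ssd 60 write -b 4096 -t 16 --no-cleanup
>> >>>     # random-read benchmark against the objects written above
>> >>>     rados bench -p pool2-ssd 60 rand -t 16
>> >>>     # remove the benchmark objects afterwards
>> >>>     rados -p pool2-ssd cleanup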
>> >>>
>> >>> We suspect there may be an impact from the erasure coding
>> >>> configuration on Pacific.
>> >>> Has anyone observed the same behaviour? Any tuning suggestions?
>> >>>
>> >>> Thank you for your help.
>> >>>
>> >>> ceph osd tree
>> >>> ID   CLASS  WEIGHT     TYPE NAME                 STATUS  REWEIGHT  PRI-AFF
>> >>>  -1         347.61304  root default
>> >>>  -3          56.71570      host cnp31tcephosd01
>> >>>   0    hdd    5.63399          osd.0                 up   1.00000  1.00000
>> >>>   1    hdd    5.63399          osd.1                 up   1.00000  1.00000
>> >>>   2    hdd    5.63399          osd.2                 up   1.00000  1.00000
>> >>>   3    hdd    5.63399          osd.3                 up   1.00000  1.00000
>> >>>   4    hdd    5.63399          osd.4                 up   1.00000  1.00000
>> >>>   5    hdd    5.63399          osd.5                 up   1.00000  1.00000
>> >>>   6    hdd    5.63399          osd.6                 up   1.00000  1.00000
>> >>>   7    hdd    5.63399          osd.7                 up   1.00000  1.00000
>> >>>  40    ssd    5.82190          osd.40                up   1.00000  1.00000
>> >>>  48    ssd    5.82190          osd.48                up   1.00000  1.00000
>> >>>  -5          56.71570      host cnp31tcephosd02
>> >>>   8    hdd    5.63399          osd.8                 up   1.00000  1.00000
>> >>>   9    hdd    5.63399          osd.9               down   1.00000  1.00000
>> >>>  10    hdd    5.63399          osd.10                up   1.00000  1.00000
>> >>>  11    hdd    5.63399          osd.11                up   1.00000  1.00000
>> >>>  12    hdd    5.63399          osd.12                up   1.00000  1.00000
>> >>>  13    hdd    5.63399          osd.13                up   1.00000  1.00000
>> >>>  14    hdd    5.63399          osd.14                up   1.00000  1.00000
>> >>>  15    hdd    5.63399          osd.15                up   1.00000  1.00000
>> >>>  49    ssd    5.82190          osd.49                up   1.00000  1.00000
>> >>>  50    ssd    5.82190          osd.50                up   1.00000  1.00000
>> >>>  -7          56.71570      host cnp31tcephosd03
>> >>>  16    hdd    5.63399          osd.16                up   1.00000  1.00000
>> >>>  17    hdd    5.63399          osd.17                up   1.00000  1.00000
>> >>>  18    hdd    5.63399          osd.18                up   1.00000  1.00000
>> >>>  19    hdd    5.63399          osd.19                up   1.00000  1.00000
>> >>>  20    hdd    5.63399          osd.20                up   1.00000  1.00000
>> >>>  21    hdd    5.63399          osd.21                up   1.00000  1.00000
>> >>>  22    hdd    5.63399          osd.22                up   1.00000  1.00000
>> >>>  23    hdd    5.63399          osd.23                up   1.00000  1.00000
>> >>>  51    ssd    5.82190          osd.51                up   1.00000  1.00000
>> >>>  52    ssd    5.82190          osd.52                up   1.00000  1.00000
>> >>>  -9          56.71570      host cnp31tcephosd04
>> >>>  24    hdd    5.63399          osd.24                up   1.00000  1.00000
>> >>>  25    hdd    5.63399          osd.25                up   1.00000  1.00000
>> >>>  26    hdd    5.63399          osd.26                up   1.00000  1.00000
>> >>>  27    hdd    5.63399          osd.27                up   1.00000  1.00000
>> >>>  28    hdd    5.63399          osd.28                up   1.00000  1.00000
>> >>>  29    hdd    5.63399          osd.29                up   1.00000  1.00000
>> >>>  30    hdd    5.63399          osd.30                up   1.00000  1.00000
>> >>>  31    hdd    5.63399          osd.31                up   1.00000  1.00000
>> >>>  53    ssd    5.82190          osd.53                up   1.00000  1.00000
>> >>>  54    ssd    5.82190          osd.54                up   1.00000  1.00000
>> >>> -11          56.71570      host cnp31tcephosd05
>> >>>  32    hdd    5.63399          osd.32                up   1.00000  1.00000
>> >>>  33    hdd    5.63399          osd.33                up   1.00000  1.00000
>> >>>  34    hdd    5.63399          osd.34                up   1.00000  1.00000
>> >>>  35    hdd    5.63399          osd.35                up   1.00000  1.00000
>> >>>  36    hdd    5.63399          osd.36                up   1.00000  1.00000
>> >>>  37    hdd    5.63399          osd.37                up   1.00000  1.00000
>> >>>  38    hdd    5.63399          osd.38                up   1.00000  1.00000
>> >>>  39    hdd    5.63399          osd.39                up   1.00000  1.00000
>> >>>  55    ssd    5.82190          osd.55                up   1.00000  1.00000
>> >>>  56    ssd    5.82190          osd.56                up   1.00000  1.00000
>> >>> -13          64.03453      host cnp31tcephosd06
>> >>>  41    hdd    7.48439          osd.41                up   1.00000  1.00000
>> >>>  42    hdd    7.48439          osd.42                up   1.00000  1.00000
>> >>>  43    hdd    7.48439          osd.43                up   1.00000  1.00000
>> >>>  44    hdd    7.48439          osd.44                up   1.00000  1.00000
>> >>>  45    hdd    7.48439          osd.45                up   1.00000  1.00000
>> >>>  46    hdd    7.48439          osd.46                up   1.00000  1.00000
>> >>>  47    hdd    7.48439          osd.47                up   1.00000  1.00000
>> >>>  57    ssd    5.82190          osd.57                up   1.00000  1.00000
>> >>>  58    ssd    5.82190          osd.58                up   1.00000  1.00000
>> >>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



