You have an IOPS budget, i.e. how much I/O your spinners can deliver. Space
utilization doesn't affect it much. You can try disabling the write (not
read!) cache on your HDDs with sdparm (for example, sdparm -c WCE /dev/bla);
in my experience this allows HDDs to deliver 50-100% more write IOPS. If
there is lots of free RAM on the OSD nodes, you can play with the
osd_memory_target and bluestore_cache_size_hdd OSD options; be careful
though: depending on your workload, the performance impact may be
insignificant, but your OSDs may run out of memory. (Rough example commands
are sketched further down in this message.)

/z

On Sat, 4 Nov 2023 at 12:04, V A Prabha <prabhav@xxxxxxx> wrote:

> Now, in this situation, how can I stabilize my production setup, since you
> have mentioned the cluster is very busy?
> Is there any configuration parameter tuning that will help, or is the only
> option to reduce the applications running on the cluster?
> Even though I have 1.6 TB of free storage in each of my OSDs, that will
> not help with my IOPS issue, right?
> Please guide me.
>
> On November 2, 2023 at 12:47 PM Zakhar Kirpichenko <zakhar@xxxxxxxxx>
> wrote:
>
> > 1. The calculated IOPS is for the rw operation, right?
>
> Total drive IOPS, read or write. Depending on the exact drive models, it
> may be lower or higher than 200; I took the average for a smaller-sized
> 7.2k rpm SAS drive. Modern drives usually deliver lower read IOPS and
> higher write IOPS.
>
> > 2. Cluster is very busy? Is there any misconfiguration or missing
> > tuning parameter that makes the cluster busy?
>
> You have almost 3k IOPS and your OSDs report slow ops. I'd say the
> cluster is busy, as in loaded with I/O, perhaps more I/O than it can
> handle well.
>
> > 3. Nodes are not balanced? You mean the count of OSDs in each server
> > differs? But we have enabled autoscale and optimal distribution, and
> > you can see from the output of ceph osd df tree that the PG count
> > (45/OSD) and use% (65 to 67%) are even. Is that not significant?
>
> Yes, the OSD count differs. This means that the CPU and memory usage,
> network load and latency differ per node and may cause performance
> variations, depending on your workload.
>
> /Z
>
> On Thu, 2 Nov 2023 at 08:18, V A Prabha <prabhav@xxxxxxx> wrote:
>
> Thanks for your prompt reply. But my queries are:
> 1. The calculated IOPS is for the rw operation, right?
> 2. Cluster is very busy? Is there any misconfiguration or missing tuning
> parameter that makes the cluster busy?
> 3. Nodes are not balanced? You mean the count of OSDs in each server
> differs? But we have enabled autoscale and optimal distribution, and you
> can see from the output of ceph osd df tree that the PG count (45/OSD)
> and use% (65 to 67%) are even. Is that not significant?
> Correct me if my queries are irrelevant.
>
> On November 2, 2023 at 11:36 AM Zakhar Kirpichenko <zakhar@xxxxxxxxx>
> wrote:
>
> Sure: it's 36 OSDs at 200 IOPS each (tops, likely lower), I assume size=3
> replication so 1/3 of the total performance, and some 30%-ish OSD
> overhead.
>
> (36 x 200) * 1/3 * 0.7 = 1680. That's how many IOPS you can realistically
> expect from your cluster. You get more than that, but the cluster is very
> busy and the OSDs aren't coping.
>
> Also, your nodes are not balanced.
>
> /Z
>
> On Thu, 2 Nov 2023 at 07:33, V A Prabha <prabhav@xxxxxxx> wrote:
>
> Can you please elaborate on your findings and that statement?
>
> On November 2, 2023 at 9:40 AM Zakhar Kirpichenko <zakhar@xxxxxxxxx>
> wrote:
>
> I'm afraid you're simply hitting the I/O limits of your disks.
>
> /Z
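For reference, a rough sketch of the write-cache and memory-target changes
suggested at the top of this message. The device name /dev/sdX and the 6 GiB
value are placeholders only; test on one drive/OSD first and make sure the
target fits into the RAM actually free on each OSD node:

  # check whether the drive's volatile write cache (WCE) is enabled
  sdparm --get=WCE /dev/sdX

  # disable the write cache; add --save to keep the setting across power cycles
  sdparm --clear=WCE /dev/sdX

  # let each OSD use more memory for caching (example value: 6 GiB)
  ceph config set osd osd_memory_target 6442450944
  # (bluestore_cache_size_hdd can be set the same way if cache autotuning is
  # disabled; with autotuning on, osd_memory_target is the value that matters)

Watch per-OSD memory usage after such a change; as noted above, OSDs can run
out of memory if the target is set higher than the node can afford.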
> On Thu, 2 Nov 2023 at 03:40, V A Prabha <prabhav@xxxxxxx> wrote:
>
> Hi Eugen,
> Please find the details below.
>
> root@meghdootctr1:/var/log/ceph# ceph -s
>   cluster:
>     id:     c59da971-57d1-43bd-b2b7-865d392412a5
>     health: HEALTH_WARN
>             nodeep-scrub flag(s) set
>             544 pgs not deep-scrubbed in time
>
>   services:
>     mon: 3 daemons, quorum meghdootctr1,meghdootctr2,meghdootctr3 (age 5d)
>     mgr: meghdootctr1(active, since 5d), standbys: meghdootctr2, meghdootctr3
>     mds: 3 up:standby
>     osd: 36 osds: 36 up (since 34h), 36 in (since 34h)
>          flags nodeep-scrub
>
>   data:
>     pools:   2 pools, 544 pgs
>     objects: 10.14M objects, 39 TiB
>     usage:   116 TiB used, 63 TiB / 179 TiB avail
>     pgs:     544 active+clean
>
>   io:
>     client: 24 MiB/s rd, 16 MiB/s wr, 2.02k op/s rd, 907 op/s wr
>
> Ceph version:
>
> root@meghdootctr1:/var/log/ceph# ceph --version
> ceph version 14.2.16 (762032d6f509d5e7ee7dc008d80fe9c87086603c) nautilus
> (stable)
>
> Ceph df -h:
> https://pastebin.com/1ffucyJg
>
> Ceph OSD performance dump:
> https://pastebin.com/1R6YQksE
>
> ceph tell osd.XX bench (out of 36 OSDs, only 8 give a high IOPS value of
> 250+; of those, 4 OSDs are from HP 3PAR and 4 from DELL EMC. We use only
> 4 OSDs from HP 3PAR and they have worked fine without any latency or IOPS
> issues from the beginning, but the remaining 32 OSDs are from DELL EMC,
> of which only 4 are much better than the remaining 28):
>
> https://pastebin.com/CixaQmBi
>
> Please help me identify whether the issue is with the DELL EMC storage,
> Ceph configuration parameter tuning, or overload in the cloud setup.
>
> On November 1, 2023 at 9:48 PM Eugen Block <eblock@xxxxxx> wrote:
>
> > Hi,
> >
> > for starters please add more cluster details like 'ceph status', 'ceph
> > versions', 'ceph osd df tree'. Increasing the network to 10G was the
> > right thing to do, you don't get far with 1G with real cluster load.
> > How are the OSDs configured (HDD only, SSD only, or HDD with rocksdb
> > on SSD)? How is the disk utilization?
> >
> > Regards,
> > Eugen
> >
> > Quoting prabhav@xxxxxxx:
> >
> > > In a production setup, 36 OSDs (SAS disks) totalling 180 TB are
> > > allocated to a single Ceph cluster with 3 monitors and 3 managers.
> > > There were 830 volumes and VMs created in OpenStack with Ceph as the
> > > backend. On Sep 21, users reported slowness in accessing the VMs.
> > > Analysing the logs led us to problems with SAS, network congestion
> > > and Ceph configuration (as all default values were used). We updated
> > > the network from 1 Gbps to 10 Gbps for public and cluster
> > > networking. There was no change.
> > > The Ceph benchmark showed that 28 out of 36 OSDs reported very low
> > > IOPS of 30 to 50, while the remaining OSDs showed 300+ IOPS.
> > > We gradually started reducing the load on the Ceph cluster and the
> > > volume count is now 650. The slow operations have gradually reduced,
> > > but I am aware that this is not the solution.
> > > The Ceph configuration was updated with:
> > > osd_journal_size = 10 GB
> > > osd_max_backfills = 1
> > > osd_recovery_max_active = 1
> > > osd_recovery_op_priority = 1
> > > bluestore_cache_trim_max_skip_pinned = 10000
> > >
> > > After one month, we faced another issue: the mgr daemon stopped on
> > > all 3 quorum nodes and 16 OSDs went down. From the ceph-mon and
> > > ceph-mgr logs we could not determine the reason.
> > > Please guide me, as it's a production setup.
>
> Thanks & Regards,
> Ms V A Prabha
> Joint Director
> Centre for Development of Advanced Computing (C-DAC)
> Tidel Park, Taramani, Chennai - 600113
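To follow up on the per-OSD bench comparison and the disk-utilization
question quoted above, one simple approach is to repeat the OSD write bench
across all OSDs (preferably during a quieter period, since it adds load) and
watch the data disks while client load is on. A minimal sketch; the 128 MiB
total size and 4 MiB block size are example values only:

  # write-bench every OSD and print the results side by side for comparison
  for id in $(ceph osd ls); do
    echo "== osd.$id =="
    ceph tell osd.$id bench 134217728 4194304   # 128 MiB total, 4 MiB writes
  done

  # on each OSD node, watch utilization and latency of the data disks
  # (iostat is part of the sysstat package; look at %util and await)
  iostat -x 5

If the same set of DELL EMC-backed OSDs consistently stands out as slow in
both outputs, that would point toward the backend storage rather than Ceph
configuration or tuning.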
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx