You mean in the OSD logfiles?

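I don't think a plain pg dump breaks a slow request down per step, but the
per-op dumps from the OSD admin socket do show the time spent between the
individual events. This is what I am looking at on the affected OSDs; osd.12
is just an example ID, and the commands have to be run on the node that
hosts that OSD:

  # ops that are currently stuck, including their event history
  ceph daemon osd.12 dump_ops_in_flight

  # recently finished slow ops, with timestamps for each step
  ceph daemon osd.12 dump_historic_ops
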
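Regarding Konstantin's 'latency' hint further down: this is how I read it,
assuming the default (non-containerized) log location and again osd.12 as an
example:

  # look for latency warnings and line them up with RocksDB compactions
  grep -i 'latency' /var/log/ceph/ceph-osd.12.log
  grep -i 'compaction' /var/log/ceph/ceph-osd.12.log

  # trigger a manual compaction for comparison (this is what I already tried)
  ceph tell osd.12 compact
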
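And for the changes discussed further down in the thread, this is the rough
plan I have in mind but have not run yet. Pool name and device are
placeholders, so please double-check before copying anything:

  # raise the PG count of the data pool from 4096 to 8192
  # (on octopus, pgp_num should follow along automatically)
  ceph osd pool set <data-pool> pg_num 8192

  # balancer with upmap and max deviation 1, as Istvan suggested
  ceph balancer mode upmap
  ceph config set mgr mgr/balancer/upmap_max_deviation 1
  ceph balancer on

  # two OSDs per 16TB spinner would mean redeploying the disk, e.g.
  ceph-volume lvm batch --osds-per-device 2 /dev/sdX
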
On Wed, 23 Mar 2022 at 08:23, Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx> wrote:

> Can you see anything in the pg dump, like waiting for read or something
> like that? How much time does it spend in each step?
>
> Istvan Szabo
> Senior Infrastructure Engineer
> ---------------------------------------------------
> Agoda Services Co., Ltd.
> e: istvan.szabo@xxxxxxxxx
> ---------------------------------------------------
>
> *From:* Boris Behrens <bb@xxxxxxxxx>
> *Sent:* Wednesday, March 23, 2022 1:29 PM
> *To:* Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>
> *Cc:* ceph-users@xxxxxxx
> *Subject:* Re: Re: octopus (15.2.16) OSDs crash or don't answer
> heartbeats (and get marked as down)
>
> Good morning Istvan,
>
> those are rotating disks and we don't use EC. Splitting the 16TB disks
> into two 8TB partitions and running two OSDs per disk also sounds
> interesting, but would it solve the problem?
>
> I also thought about raising the PGs for the data pool from 4096 to 8192,
> but I am not sure whether that would solve the problem or make it worse.
>
> So far, nothing I have tried has worked.
>
> On Wed, 23 Mar 2022 at 05:10, Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx> wrote:
>
> Hi,
>
> I think you are having a similar issue to the one I had in the past.
>
> I have 1.6B objects on a cluster (average object size around 40k) and all
> my OSDs had spilled over.
>
> Also slow ops, OSDs wrongly marked down…
>
> My OSDs are 15.3TB SSDs, so my solution was to store block and block.db
> together on the SSDs, put 4 OSDs per SSD and go up to 100 PGs per OSD, so
> one disk holds roughly 400 PGs.
>
> I also turned on the balancer with upmap and max deviation 1.
>
> I'm using EC 4:2, let's see how long it lasts. My bottleneck is always
> the PG count: too few PGs for too many objects.
>
> On 2022. Mar 22., at 23:34, Boris Behrens <bb@xxxxxxxxx> wrote:
>
> The 180 PGs are because of the 16TB disks. 3/4 of our OSDs had cache SSDs
> (not NVMe though, and most of them share one SSD across 10 OSDs), but
> this problem only came in with octopus.
>
> We also thought this might be the db compaction, but it doesn't match up.
> It might happen while a compaction runs, but it also looks like it
> happens when there are other operations like table_file_deletion, and it
> happens on OSDs that have SSD-backed block.db devices (e.g. 5 OSDs share
> one SAMSUNG MZ7KM1T9HAJM-00005, and the IOPS/throughput on that SSD is
> not huge: around 100 r/s and 300 w/s while an OSD on it compacts, and
> around 50 MB/s r/w throughput).
>
> I also cannot reproduce it via "ceph tell osd.NN compact", so I am not
> 100% sure it is the compaction.
>
> What do you mean by "grep for the latency string"?
>
> Cheers
> Boris
>
> On Tue, 22 Mar 2022 at 15:53, Konstantin Shalygin <k0ste@xxxxxxxx> wrote:
>
> > 180 PGs per OSD is usually overhead, and 40k objects per PG is not
> > much, but I don't think this will work without a block.db on NVMe. I
> > think your wrong "out" marks happen at the time of a RocksDB
> > compaction. With the default log settings you can try to grep for
> > 'latency' strings.
> >
> > Also, https://tracker.ceph.com/issues/50297
> >
> > k
> > Sent from my iPhone
> >
> > On 22 Mar 2022, at 14:29, Boris Behrens <bb@xxxxxxxxx> wrote:
> >
> > * the 8TB disks hold around 80-90 PGs (the 16TB ones around 160-180)
> > * per PG we have around 40k objects (170M objects in 1.2 PiB of storage)

-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx