>> Today we had a big issue with slow ops on the NVMe drives that hold the
>> index pool.
>>
>> Why does the NVMe show as full when Ceph shows it as barely utilized?
>> Which one should I believe?
>>
>> When I check ceph osd df it shows about 10% usage on the OSDs (each 2 TB
>> NVMe drive carries 4 OSDs):

Why split each device into 4 very small OSDs? You're losing a lot of
capacity to overhead.

>> ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP    META     AVAIL    %USE   VAR   PGS  STATUS
>> 195 nvme   0.43660  1.00000   447 GiB  47 GiB   161 MiB  46 GiB  656 MiB  400 GiB  10.47  0.21   64  up
>> 252 nvme   0.43660  1.00000   447 GiB  46 GiB   161 MiB  45 GiB  845 MiB  401 GiB  10.35  0.21   64  up
>> 253 nvme   0.43660  1.00000   447 GiB  46 GiB   229 MiB  45 GiB  662 MiB  401 GiB  10.26  0.21   66  up
>> 254 nvme   0.43660  1.00000   447 GiB  46 GiB   161 MiB  44 GiB  1.3 GiB  401 GiB  10.26  0.21   65  up
>> 255 nvme   0.43660  1.00000   447 GiB  47 GiB   161 MiB  46 GiB  1.2 GiB  400 GiB  10.58  0.21   64  up
>> 288 nvme   0.43660  1.00000   447 GiB  46 GiB   161 MiB  44 GiB  1.2 GiB  401 GiB  10.25  0.21   64  up
>> 289 nvme   0.43660  1.00000   447 GiB  46 GiB   161 MiB  45 GiB  641 MiB  401 GiB  10.33  0.21   64  up
>> 290 nvme   0.43660  1.00000   447 GiB  45 GiB   229 MiB  44 GiB  668 MiB  402 GiB  10.14  0.21   65  up
>>
>> However, nvme list says they are full:
>>
>> Node          SN            Model         Namespace  Usage              Format       FW Rev
>> ------------  ------------  ------------  ---------  -----------------  -----------  ------
>> /dev/nvme0n1  90D0A00XTXTR  KCD6XLUL1T92  1          1.92 TB / 1.92 TB  512 B + 0 B  GPK6
>> /dev/nvme1n1  60P0A003TXTR  KCD6XLUL1T92  1          1.92 TB / 1.92 TB  512 B + 0 B  GPK6

That command isn't telling you what you think it is. It has no awareness of
actual data; it's looking at NVMe namespaces.

>> On another node the test went like this:
>>
>> * If none of the disks are full, there are no slow ops.
>> * If one disk is full and the other is not, there are slow ops, but not too many.
>> * If both disks are full, there are a lot of slow ops.
>>
>> The full disks are very heavily utilized during recovery, and they hold
>> back operations from the other NVMes.
>>
>> Why are the OSDs not equally utilized in terms of space, even though the
>> PG counts across the cluster are the same (+/-1)?
>>
>> Thank you
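
For reference, a minimal way to compare the two views, assuming the ceph CLI
and nvme-cli are available on the OSD host:

    # Ceph's view: what each OSD has actually written
    # (RAW USE / DATA / OMAP / META / AVAIL per OSD, grouped by host)
    ceph osd df tree

    # Per-pool usage, including the index pool
    ceph df detail

    # Device view: the Usage column here is derived from the namespace
    # fields the drive reports (nuse/nsze), not from stored data; many
    # drives simply report the namespace as fully used
    nvme list

So 1.92 TB / 1.92 TB in nvme list only means the single namespace spans the
drive's full capacity; it does not mean the drive is full of data.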