Hello Igor,

On 30.04.20 at 15:52, Igor Fedotov wrote:
> 1) reset perf counters for the specific OSD
>
> 2) run bench
>
> 3) dump perf counters.

This is OSD 0:
# ceph tell osd.0 bench -f plain 12288000 4096
bench: wrote 12 MiB in blocks of 4 KiB in 6.70482 sec at 1.7 MiB/sec 447 IOPS

https://pastebin.com/raw/hbKcU07g

This is OSD 38:
# ceph tell osd.38 bench -f plain 12288000 4096
bench: wrote 12 MiB in blocks of 4 KiB in 2.01763 sec at 5.8 MiB/sec 1.49k IOPS

https://pastebin.com/raw/Tx2ckVm1

> Collecting the disks' (both main and db) activity with iostat would be nice
> too. But please either increase the benchmark duration or reduce the iostat
> probe period to 0.1 or 0.05 seconds.

This gives me:
# ceph tell osd.38 bench -f plain 122880000 4096
Error EINVAL: 'count' values greater than 12288000 for a block size of 4 KiB, assuming 100 IOPS, for 30 seconds, can cause ill effects on osd. Please adjust 'osd_bench_small_size_max_iops' with a higher value if you wish to use a higher 'count'.
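I guess I have to raise osd_bench_small_size_max_iops before a longer run is accepted; untested here, but something like this should do it (2000 is only an example value, pick whatever roughly matches the expected IOPS):

# ceph tell osd.38 injectargs '--osd_bench_small_size_max_iops 2000'

or persistently via the config database:

# ceph config set osd osd_bench_small_size_max_iops 2000

and then repeat the bench with the larger count so the run is long enough for the iostat sampling.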
Stefan

> Thanks,
>
> Igor
>
> On 4/28/2020 8:42 PM, Stefan Priebe - Profihost AG wrote:
>> Hi Igor,
>>
>> but the performance issue is still present even on the recreated OSD.
>>
>> # ceph tell osd.38 bench -f plain 12288000 4096
>> bench: wrote 12 MiB in blocks of 4 KiB in 1.63389 sec at 7.2 MiB/sec 1.84k IOPS
>>
>> vs.
>>
>> # ceph tell osd.10 bench -f plain 12288000 4096
>> bench: wrote 12 MiB in blocks of 4 KiB in 10.7454 sec at 1.1 MiB/sec 279 IOPS
>>
>> both backed by the same SAMSUNG SSD as block.db.
>>
>> Greets,
>> Stefan
>>
>> On 28.04.20 at 19:12, Stefan Priebe - Profihost AG wrote:
>>> Hi Igor,
>>> On 27.04.20 at 15:03, Igor Fedotov wrote:
>>>> Just left a comment at https://tracker.ceph.com/issues/44509
>>>>
>>>> Generally bdev-new-db performs no migration; RocksDB might eventually do
>>>> that, but there is no guarantee it moves everything.
>>>>
>>>> One should use bluefs-bdev-migrate to do the actual migration.
>>>>
>>>> And I think that's the root cause for the above ticket.
>>> Perfect - this removed all spillover in seconds.
>>>
>>> Greets,
>>> Stefan
>>>
>>>> Thanks,
>>>>
>>>> Igor
>>>>
>>>> On 4/24/2020 2:37 PM, Stefan Priebe - Profihost AG wrote:
>>>>> No, not a standalone WAL. I wanted to ask whether bdev-new-db migrated
>>>>> DB and WAL from HDD to SSD.
>>>>>
>>>>> Stefan
>>>>>
>>>>>> On 24.04.2020 at 13:01, Igor Fedotov <ifedotov@xxxxxxx> wrote:
>>>>>>
>>>>>> Unless you have 3 different types of disks behind the OSD (e.g. HDD, SSD,
>>>>>> NVMe), a standalone WAL makes no sense.
>>>>>>
>>>>>> On 4/24/2020 1:58 PM, Stefan Priebe - Profihost AG wrote:
>>>>>>> Is the WAL device missing? Do I need to run *bluefs-bdev-new-db and
>>>>>>> WAL*?
>>>>>>>
>>>>>>> Greets,
>>>>>>> Stefan
>>>>>>>
>>>>>>>> On 24.04.2020 at 11:32, Stefan Priebe - Profihost AG
>>>>>>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>>>>>
>>>>>>>> Hi Igor,
>>>>>>>>
>>>>>>>> there must be a difference. I purged osd.0 and recreated it.
>>>>>>>>
>>>>>>>> Now it gives:
>>>>>>>> ceph tell osd.0 bench
>>>>>>>> {
>>>>>>>>     "bytes_written": 1073741824,
>>>>>>>>     "blocksize": 4194304,
>>>>>>>>     "elapsed_sec": 8.1554735639999993,
>>>>>>>>     "bytes_per_sec": 131659040.46819863,
>>>>>>>>     "iops": 31.389961354303033
>>>>>>>> }
>>>>>>>>
>>>>>>>> What's wrong with adding a block.db device later?
>>>>>>>>
>>>>>>>> Stefan
>>>>>>>>
>>>>>>>> On 23.04.20 at 20:34, Stefan Priebe - Profihost AG wrote:
>>>>>>>>> Hi,
>>>>>>>>> if the OSDs are idle the difference is even worse:
>>>>>>>>> # ceph tell osd.0 bench
>>>>>>>>> {
>>>>>>>>>     "bytes_written": 1073741824,
>>>>>>>>>     "blocksize": 4194304,
>>>>>>>>>     "elapsed_sec": 15.396707875000001,
>>>>>>>>>     "bytes_per_sec": 69738403.346825853,
>>>>>>>>>     "iops": 16.626931034761871
>>>>>>>>> }
>>>>>>>>> # ceph tell osd.38 bench
>>>>>>>>> {
>>>>>>>>>     "bytes_written": 1073741824,
>>>>>>>>>     "blocksize": 4194304,
>>>>>>>>>     "elapsed_sec": 6.8903985170000004,
>>>>>>>>>     "bytes_per_sec": 155831599.77624846,
>>>>>>>>>     "iops": 37.153148597776521
>>>>>>>>> }
>>>>>>>>> Stefan
>>>>>>>>> On 23.04.20 at 14:39, Stefan Priebe - Profihost AG wrote:
>>>>>>>>>> Hi,
>>>>>>>>>> On 23.04.20 at 14:06, Igor Fedotov wrote:
>>>>>>>>>>> I don't recall any additional tuning to be applied to the new DB
>>>>>>>>>>> volume. And I assume the hardware is pretty much the same...
>>>>>>>>>>>
>>>>>>>>>>> Do you still have any significant amount of data spilled over
>>>>>>>>>>> for these updated OSDs? If not, I don't have any valid
>>>>>>>>>>> explanation for the phenomenon.
>>>>>>>>>> just the 64k from here:
>>>>>>>>>> https://tracker.ceph.com/issues/44509
>>>>>>>>>>
>>>>>>>>>>> You might want to try "ceph osd bench" to compare OSDs under
>>>>>>>>>>> pretty much the same load. Any difference observed?
>>>>>>>>>> Servers are the same HW. OSD bench is:
>>>>>>>>>> # ceph tell osd.0 bench
>>>>>>>>>> {
>>>>>>>>>>     "bytes_written": 1073741824,
>>>>>>>>>>     "blocksize": 4194304,
>>>>>>>>>>     "elapsed_sec": 16.091414781000001,
>>>>>>>>>>     "bytes_per_sec": 66727620.822242722,
>>>>>>>>>>     "iops": 15.909104543266945
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> # ceph tell osd.36 bench
>>>>>>>>>> {
>>>>>>>>>>     "bytes_written": 1073741824,
>>>>>>>>>>     "blocksize": 4194304,
>>>>>>>>>>     "elapsed_sec": 10.023828538,
>>>>>>>>>>     "bytes_per_sec": 107118933.6419194,
>>>>>>>>>>     "iops": 25.539143953780986
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> OSD 0 is a Toshiba MG07SCA12TA SAS 12G
>>>>>>>>>> OSD 36 is a Seagate ST12000NM0008-2H SATA 6G
>>>>>>>>>>
>>>>>>>>>> The SSDs are all the same, like the rest of the HW, and both drives
>>>>>>>>>> should give the same performance according to their specs. The only
>>>>>>>>>> other difference is that OSD 36 was created directly with the
>>>>>>>>>> block.db device (Nautilus 14.2.7) while OSD 0 (14.2.8) was not.
>>>>>>>>>>
>>>>>>>>>> Stefan
>>>>>>>>>>
>>>>>>>>>>> On 4/23/2020 8:35 AM, Stefan Priebe - Profihost AG wrote:
>>>>>>>>>>>> Hello,
>>>>>>>>>>>>
>>>>>>>>>>>> is there anything else needed besides running:
>>>>>>>>>>>> ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-${OSD}
>>>>>>>>>>>> bluefs-bdev-new-db --dev-target /dev/vgroup/lvdb-1
>>>>>>>>>>>>
>>>>>>>>>>>> I did so some weeks ago and currently I'm seeing that all OSDs
>>>>>>>>>>>> originally deployed with --block-db show 10-20% I/O waits, while
>>>>>>>>>>>> all those that were converted using ceph-bluestore-tool show
>>>>>>>>>>>> 80-100% I/O waits.
>>>>>>>>>>>>
>>>>>>>>>>>> Also, is there some tuning available to use more of the SSD? The
>>>>>>>>>>>> SSD (block-db) is only saturated at 0-2%.
>>>>>>>>>>>>
>>>>>>>>>>>> Greets,
>>>>>>>>>>>> Stefan
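In case it helps anyone who finds this thread later: the migration Igor refers to further up is done with ceph-bluestore-tool's bluefs-bdev-migrate command. With the OSD stopped, it looks roughly like this (the OSD id and device paths below are only an example, adjust them to your setup):

# systemctl stop ceph-osd@38
# ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-38 bluefs-bdev-migrate \
    --devs-source /var/lib/ceph/osd/ceph-38/block \
    --dev-target /var/lib/ceph/osd/ceph-38/block.db
# systemctl start ceph-osd@38

Afterwards the BLUEFS_SPILLOVER warning should disappear from 'ceph health detail'.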
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx