I see... This is the bluefs info from one of my spinning disks:

    "bluefs": {
        "gift_bytes": 0,
        "reclaim_bytes": 0,
        "db_total_bytes": 30169620480,
        "db_used_bytes": 2517630976,
        "wal_total_bytes": 1073737728,
        "wal_used_bytes": 524288000,
        "slow_total_bytes": 400033841152,
        "slow_used_bytes": 3996123136,
        "num_files": 583,
        "log_bytes": 7798784,
        "log_compactions": 39,
        "logged_bytes": 786444288,
        "files_written_wal": 2,
        "files_written_sst": 48742,
        "bytes_written_wal": 2410376267722,
        "bytes_written_sst": 2620043565235
    },

On my solid-state disks I don't see anything in the slow_ entries:

    "bluefs": {
        "gift_bytes": 0,
        "reclaim_bytes": 0,
        "db_total_bytes": 153631064064,
        "db_used_bytes": 6822035456,
        "wal_total_bytes": 0,
        "wal_used_bytes": 0,
        "slow_total_bytes": 0,
        "slow_used_bytes": 0,
        "num_files": 250,
        "log_bytes": 16420864,
        "log_compactions": 511,
        "logged_bytes": 9285406720,
        "files_written_wal": 2,
        "files_written_sst": 79316,
        "bytes_written_wal": 4393750671932,
        "bytes_written_sst": 4626359292945
    },

In my understanding, on the solid-state OSDs the DB space is managed automatically on the main device, so about 150GB of the 3.84TB disk was reserved for it. My spinning disks show about 400GB of slow_total_bytes, while I have only dedicated 33GB of NVMe to each raw disk.

So I believe I will have to think about increasing my DB partitions.

Thank you for your feedback, Darren!

Joao Victor R Soares
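P.S. In case it is useful to anyone else following the thread, something along these lines should dump the interesting bluefs counters for every OSD on a node in one go. This is only a rough sketch, not tested here; it assumes the default admin socket path /var/run/ceph/ceph-osd.<id>.asok and that jq is installed:

    # walk every OSD admin socket on this node (default path; adjust if yours differs)
    for sock in /var/run/ceph/ceph-osd.*.asok; do
        id=$(basename "$sock" .asok)      # e.g. "ceph-osd.12"
        id=${id#ceph-osd.}                # -> "12"
        echo "== osd.$id =="
        # the "perf dump" command Darren mentions below, trimmed to the bluefs counters
        ceph daemon "osd.$id" perf dump |
            jq '.bluefs | {db_total_bytes, db_used_bytes, slow_total_bytes, slow_used_bytes}'
    done

Any OSD reporting a non-zero slow_used_bytes is already spilling onto its data disk.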
Darren Soothill wrote:
> Hi Joao,
>
> You can see how much RocksDB space has been used with this command: "ceph daemon osd.X perf
> dump", where X is an OSD id on the node you are running the command on.
>
> You are looking for this section in the output:
>
>     "bluefs": {
>         "gift_bytes": 0,
>         "reclaim_bytes": 0,
>         "db_total_bytes": 23966253056,
>         "db_used_bytes": 1714421760,
>         "wal_total_bytes": 0,
>         "wal_used_bytes": 0,
>         "slow_total_bytes": 0,
>         "slow_used_bytes": 0,
>         "num_files": 24,
>         "log_bytes": 552120320,
>         "log_compactions": 0,
>         "logged_bytes": 537051136,
>         "files_written_wal": 1,
>         "files_written_sst": 11,
>         "bytes_written_wal": 429315193,
>         "bytes_written_sst": 601384180,
>         "bytes_written_slow": 0,
>         "max_bytes_wal": 0,
>         "max_bytes_db": 1714421760,
>         "max_bytes_slow": 0
>     },
>
> If you have numbers in the slow_ entries then your RocksDB is spilling over onto the HDD.
>
> As to whether moving RocksDB and the WAL onto the HDD can cause a performance degradation,
> that depends on how busy your disks are. If your HDDs are working hard and you are now going
> to throw a lot more workload onto them, then performance will degrade, possibly substantially.
> I have seen performance impacts of up to 75% when things have started spilling over from
> NVMe to HDD. By that I mean I had a lovely flat line ingesting objects, and that line
> suddenly dropped by 75% once the RocksDB had filled up and spilled over onto the HDD.
>
>
> From: João Victor Rodrigues Soares <jvrs2683(a)gmail.com>
> Date: Wednesday, 25 September 2019 at 14:37
> To: "ceph-users(a)ceph.io" <ceph-users(a)ceph.io>
> Subject: Slow Write Issues
>
> Hello,
>
> In my company, we currently have the following infrastructure:
>
> - Ceph Luminous
> - OpenStack Pike
>
> We have a cluster of 3 OSD nodes, each with the following configuration:
>
> - 1 x Xeon(R) D-2146NT CPU @ 2.30GHz
> - 128GB RAM
> - 128GB root disk
> - 12 x 10TB SATA ST10000NM0146 (OSD)
> - 1 x Intel Optane P4800X SSD DC 375GB (block.db / block.wal)
> - Ubuntu 16.04
> - 2 x 10Gb network interfaces configured with LACP
>
> The compute nodes have:
>
> - 4 x 10Gb network interfaces with LACP
>
> We also have 4 monitor nodes with:
>
> - 4 x 10Gb LACP network interfaces
> - Approx. 90% CPU idle time, with 32GB / 256GB RAM available
>
> For each OSD disk we have created a 33GB partition for block.db and block.wal.
>
> We have recently been facing a number of performance issues. Virtual machines created in
> OpenStack are experiencing slow writes (approx. 50MB/s).
>
> Monitoring of the OSD nodes shows an average of 20% CPU iowait time and 70% CPU idle time.
> Memory consumption is around 30%.
> We have no latency issues (9ms average).
>
> My question is whether what is happening may have to do with the amount of disk dedicated
> to the DB/WAL. The Ceph documentation recommends that block.db be no smaller than 4% of the
> block device.
>
> In that case, for each disk in my environment, block.db could not be less than 400GB per OSD.
>
> Another question is whether configuring my OSDs to keep block.db / block.wal on the
> mechanical disks themselves could lead to a performance degradation.
>
> Regards,
> João Victor Rodrigues Soares

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
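For reference, a rough back-of-the-envelope version of the sizing arithmetic discussed in this thread. It is only a sketch, using bc and assuming decimal TB/GB as drive vendors count them:

    # 4% guideline applied to one 10 TB data disk
    echo '10000 * 4 / 100' | bc       # -> 400 GB of block.db suggested per OSD
    # what one 375 GB Optane leaves for each of the 12 OSDs on a node
    echo 'scale=1; 375 / 12' | bc     # -> ~31.2 GB per OSD, close to the ~33 GB partitions in use

So each OSD has well under a tenth of the block.db space the 4% guideline suggests, which is consistent with the spillover shown in the bluefs output at the top of the thread.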