Hey Mark :)

On 16 August 2017 21:43:34 CEST, Mark Nelson <mnelson@xxxxxxxxxx> wrote:
>Hi Mehmet!
>
>On 08/16/2017 11:12 AM, Mehmet wrote:
>> :( no suggestions or recommendations on this?
>>
>> On 14 August 2017 16:50:15 CEST, Mehmet <ceph@xxxxxxxxxx> wrote:
>>
>> Hi friends,
>>
>> my current hardware setup per OSD node is as follows:
>>
>> # 3 OSD nodes with
>> - 2x Intel(R) Xeon(R) CPU E5-2603 v3 @ 1.60GHz ==> 12 cores, no
>>   Hyper-Threading
>> - 64GB RAM
>> - 12x 4TB HGST 7K4000 SAS2 (6Gb/s) disks as OSDs
>> - 1x INTEL SSDPEDMD400G4 (Intel DC P3700 NVMe) as journaling device
>>   for 12 disks (20G journal size)
>> - 1x Samsung SSD 840/850 Pro only for the OS
>>
>> # and 1x OSD node with
>> - 1x Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz (10 cores, 20 threads)
>> - 64GB RAM
>> - 23x 2TB TOSHIBA MK2001TRKB SAS2 (6Gb/s) disks as OSDs
>> - 1x SEAGATE ST32000445SS SAS2 (6Gb/s) disk as OSD
>> - 1x INTEL SSDPEDMD400G4 (Intel DC P3700 NVMe) as journaling device
>>   for 24 disks (15G journal size)
>> - 1x Samsung SSD 850 Pro only for the OS
>
>The single P3700 for 23 spinning disks is pushing it. They have high
>write durability, but based on the model that is the 400GB version?

Yes, it is the 400GB version.

>If you are doing a lot of writes you might wear it out pretty fast and

Actually, the Intel isdct tool (EnduranceAnalyzer) says this one should
live for 40 years ^^ But that remains to be proven ;)

>it's a single point of failure for the entire node (if it dies you have
>a lot of data dying with it). Unbalanced setups like this are generally
>also trickier to get performing well.
>

Yes, that is true, and it could happen on all 4 of my nodes. Perhaps the
boss has to see what happens before I can get the money to optimise the
nodes...

>>
>> As you can see, I am using one NVMe device (Intel DC P3700, 400GB),
>> partitioned, for all spinning disks on each OSD node.
>>
>> When Luminous is available (as the next LTS) I plan to switch from
>> filestore to bluestore 😊
>>
>> As far as I have read, bluestore consists of
>> - "the device"
>> - "block-DB": a device that stores RocksDB metadata
>> - "block-WAL": a device that stores the RocksDB write-ahead journal
>>
>> Which setup would be useful in my case?
>> I would set up the disks via "ceph-deploy".
>
>So typically we recommend something like a 1-2GB WAL partition on the
>NVMe drive per OSD and use the remaining space for DB. If you run out
>of DB space, bluestore will start using the spinning disks to store KV
>data instead. I suspect this will still be the advice you will want to
>follow, though at some point having so many WAL and DB partitions on
>the NVMe may start becoming a bottleneck. Something like 63K sequential
>writes to heavily fragmented objects might be worth testing, but in
>most cases I suspect DB and WAL on NVMe is still going to be faster.
>

Thanks, that is what I expected. Another idea would be to replace one
spinning disk per node with an Intel SSD for WAL/DB... perhaps just for
the DBs? (A rough sketch of the NVMe layout is in the P.S. below.)

- Mehmet
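P.S.: To make Mark's suggestion concrete for myself, here is a rough,
untested sketch of how the NVMe on one of the 12-disk nodes could be
carved up (2GB WAL + 28GB DB per OSD, which roughly uses up the 400GB
P3700). The device names, the hostname and the ceph-deploy
--block-db/--block-wal options are assumptions on my part and would
have to be checked against the ceph-deploy version that actually ships
alongside Luminous:

#!/bin/bash
# Rough sketch only -- device names, hostname, and sizes are assumptions
# and must be adapted to the actual node. Double-check before running.
# (sgdisk/partprobe run on the OSD host itself; ceph-deploy is normally
# run from the admin node.)
#
# Layout for one 12-OSD node with a single 400GB P3700 (/dev/nvme0n1):
# 2GB WAL + 28GB DB per OSD => 12 x 30GiB = 360GiB, which leaves a bit
# of headroom on the ~373GiB usable capacity of the 400GB device.

NVME=/dev/nvme0n1            # assumption: the P3700
HOST=ceph-osd-node1          # assumption: hostname as known to ceph-deploy
DATA_DISKS=(/dev/sd{b..m})   # assumption: the 12 HGST spinners

WAL_SIZE=2G
DB_SIZE=28G

part=1
for i in "${!DATA_DISKS[@]}"; do
    # One DB and one WAL partition per OSD on the NVMe.
    sgdisk --new=${part}:0:+${DB_SIZE}  --change-name=${part}:"osd-${i}-db"  "$NVME"
    db_part=$part; part=$((part + 1))
    sgdisk --new=${part}:0:+${WAL_SIZE} --change-name=${part}:"osd-${i}-wal" "$NVME"
    wal_part=$part; part=$((part + 1))
    partprobe "$NVME"   # make sure the kernel sees the new partitions

    # ceph-deploy 2.x accepts --block-db/--block-wal for bluestore OSDs;
    # older 1.5.x releases use a different syntax, so check your version.
    ceph-deploy osd create \
        --bluestore \
        --data "${DATA_DISKS[$i]}" \
        --block-db  "${NVME}p${db_part}" \
        --block-wal "${NVME}p${wal_part}" \
        "$HOST"
done

On the 24-disk node the same split would leave only about 13GB of DB
per OSD (24 x 2GB WAL already takes ~48GB of the ~373GiB), which is
where Mark's point about the single NVMe becoming a bottleneck and a
single point of failure weighs even more.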
>>
>> Thanks in advance for your suggestions!
>> - Mehmet