Hello,

On Fri, 1 May 2015 15:45:41 +0200 Piotr Wachowicz wrote:

> > yes SSD-Journal helps a lot (if you use the right SSDs)
> >
>
> What SSDs to avoid for journaling from your experience? Why?
>
Read the rather countless SSD threads on this ML, use the archive and
your google foo.
Like the _current_ thread:
"Possible improvements for a slow write speed (excluding independent SSD
journals)"

In short anything w/o "power caps" and the resulting speed degradation
when it comes to DSYNC writes.

> >
> > > We're seeing very disappointing Ceph performance. We have 10GigE
> > > interconnect (as a shared public/internal network).
> > Which kind of CPU do you use for the OSD-hosts?
> >
>
>
> Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz
>
> FYI, we are hosting VMs on our OSD nodes, but the VMs use very small
> amounts of CPUs and RAM
>
Small write IOPS with Ceph cost a LOT of CPU cycles, but I suppose you're
not limited by your CPU, yet. You may be once you have SSD journals.

>
> > > We're wondering whether it makes sense to buy SSDs and put journals
> > > on them. But we're looking for a way to verify that this will
> > > actually help BEFORE we splash cash on SSDs.
> > I can recommend the Intel DC S3700 SSD for journaling! In the beginning
> > I started with different much cheaper models, but this was the wrong
> > decision.
> >
>
> What, apart from the price, made the difference? sustained read/write
> bandwidth? IOPS?
>
See the thread mentioned above.

> We're considering this one (PCI-e SSD). What do you think?
> http://www.plextor-digital.com/index.php/en/M6e-BK/m6e-bk.html
> PX-128M6e-BK
>
I think... Gamer consumer toy.
And a website that doesn't give you actual endurance information, other
than a flimsy 5 year "warranty".

Depending on how many OSDs (how many are in your storage nodes?) you plan
to put behind a journal SSD and the amount of writes you expect, there
will be better, known to work options. Like the Intel DC models.

>
> Also, we're thinking about sharing one SSD between two OSDs. Any reason
> why this would be a bad idea?
>
Failure domain. How many storage nodes and OSDs do you have?
A 1:2 ratio is rather conservative actually, a lot of people are happy
with 1:3 to 1:5. But that means your cluster must be able to survive the
loss of 5 OSDs if that SSD fails and the resulting rebuild storm,
something that small budget clusters don't tend to be capable of.

Other than that, sufficient write bandwidth (about 70MB/s per SATA HDD).
IOPS (see thread above) are unlikely to be the limiting factor with SSD
journals.

>
> > > We're using Ceph for OpenStack storage (kvm). Enabling RBD cache
> > > didn't really help all that much.
> > The read speed can be optimized with an bigger read ahead cache inside
> > the VM, like:
> > echo 4096 > /sys/block/vda/queue/read_ahead_kb
>
Yup, that can help a lot with reads, but I think Nick has your problem
nailed.

Christian

> >
> Thanks, we will try that.

--
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
http://www.gol.com/
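
For reference, the kind of DSYNC test meant above can be run with fio along
these lines (a sketch only; /dev/sdX is a placeholder for the SSD being
evaluated, and the run writes straight to the device, destroying whatever is
on it):

  # single-threaded 4k writes with O_DIRECT and O_DSYNC, the access pattern
  # that matters for a filestore journal
  fio --name=journal-test --filename=/dev/sdX --direct=1 --sync=1 \
      --rw=write --bs=4k --numjobs=1 --iodepth=1 \
      --runtime=60 --time_based --group_reporting

Drives without power-loss protection tend to collapse to a few hundred 4k
sync write IOPS in this test, while the Intel DC models stay orders of
magnitude above that, which is exactly the difference that shows up as
journal performance.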
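
To put rough numbers on the bandwidth point: with the ~70MB/s per SATA HDD
figure above, a 1:2 ratio asks the journal SSD for roughly 140MB/s of
sustained writes, 1:3 for about 210MB/s and 1:5 for about 350MB/s, so the
higher ratios only pay off with an SSD (or PCIe card) that can actually
sustain that kind of rate under sync writes.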
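
And since the read_ahead_kb setting quoted above does not survive a reboot,
one way to make it stick inside the VMs is a udev rule, roughly like this
(a sketch; the vd[a-z] match assumes virtio disks named vdX):

  # write a rule that raises read-ahead to 4MB for all virtio block devices
  echo 'SUBSYSTEM=="block", KERNEL=="vd[a-z]", ACTION=="add|change", ATTR{queue/read_ahead_kb}="4096"' > /etc/udev/rules.d/99-read-ahead.rules
  # apply it without rebooting
  udevadm trigger --subsystem-match=block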