OK - it seems my Android email client (the native Samsung one) messed up
"In-Reply-To", which confuses some MUAs. Apologies for that (& this).

/M

On Tue, Oct 20, 2015 at 09:45:25PM +0200, Martin Millnert wrote:
> The thing that worries me with your next-gen design (actually your current
> design as well) is SSD wear. If you use an Intel SSD at 10 DWPD, that's
> 12TB/day per 64TB total. I guess it's use-case dependent, and perhaps a
> 1:4 write:read ratio is quite high in terms of writes as it is.
>
> You're also limiting your throughput to the PCIe bandwidth of the NVMe
> device (regardless of NVRAM/SSD). Compared to a traditional interface that
> may of course be OK in relative terms. NVRAM vs. SSD here is simply a choice
> between wear (NVRAM as journal, at minimum) and cache-hit probability (size).
>
> Interesting thought experiment for me anyway; thanks for sharing, Wido.
>
> /M
>
>
> -------- Original message --------
> From: Wido den Hollander <wido@xxxxxxxx>
> Date: 20/10/2015 16:00 (GMT+01:00)
> To: ceph-users <ceph-users@xxxxxxxx>
> Subject: Ceph OSDs with bcache experience
>
> Hi,
>
> In the "newstore direction" thread on ceph-devel I wrote that I'm using
> bcache in production, and Mark Nelson asked me to share some details.
>
> Bcache is now running in two clusters that I manage, but I'll keep this
> information to one of them (the one at PCextreme behind CloudStack).
>
> This cluster has been running for over 2 years now:
>
> epoch 284353
> fsid 0d56dd8f-7ae0-4447-b51b-f8b818749307
> created 2013-09-23 11:06:11.819520
> modified 2015-10-20 15:27:48.734213
>
> The system consists of 39 hosts:
>
> 2U SuperMicro chassis:
> * 80GB Intel SSD for OS
> * 240GB Intel S3700 SSD for journaling + bcache
> * 6x 3TB disks
>
> This isn't the newest hardware. The next batch of hardware will have more
> disks per chassis, but this is it for now.
>
> All systems were installed with Ubuntu 12.04, but they are all running
> 14.04 with bcache now.
>
> The Intel S3700 SSD is partitioned with a GPT label:
> - 5GB journal for each OSD
> - 200GB partition for bcache
>
> root@ceph11:~# df -h|grep osd
> /dev/bcache0    2.8T  1.1T  1.8T  38% /var/lib/ceph/osd/ceph-60
> /dev/bcache1    2.8T  1.2T  1.7T  41% /var/lib/ceph/osd/ceph-61
> /dev/bcache2    2.8T  930G  1.9T  34% /var/lib/ceph/osd/ceph-62
> /dev/bcache3    2.8T  970G  1.8T  35% /var/lib/ceph/osd/ceph-63
> /dev/bcache4    2.8T  814G  2.0T  30% /var/lib/ceph/osd/ceph-64
> /dev/bcache5    2.8T  915G  1.9T  33% /var/lib/ceph/osd/ceph-65
> root@ceph11:~#
>
> root@ceph11:~# lsb_release -a
> No LSB modules are available.
> Distributor ID: Ubuntu
> Description:    Ubuntu 14.04.3 LTS
> Release:        14.04
> Codename:       trusty
> root@ceph11:~# uname -r
> 3.19.0-30-generic
> root@ceph11:~#
>
> "apply_latency": {
>     "avgcount": 2985023,
>     "sum": 226219.891559000
> }
>
> What did we notice?
> - Fewer spikes on the disks
> - Lower commit latencies on the OSDs
> - Almost no 'slow requests' during backfills
> - Cache-hit ratio of about 60%
>
> Max backfills and recovery active are both set to 1 on all OSDs.
>
> For the next generation of hardware we are looking into 3U chassis with
> 16x 4TB SATA drives and a 1.2TB NVMe SSD for bcache, but we haven't tested
> those yet, so there is nothing to say about them.
>
> The current setup is 200GB of cache for 18TB of disks. The new setup will
> be 1200GB for 64TB; curious to see what that does.
>
> Our main conclusion, however, is that it smooths the I/O pattern towards
> the disks, and that gives an overall better response from the disks.
>
> Wido
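A few notes on the numbers in the quoted message. The wear figure Martin
mentions follows directly from the drive specs in the thread (the planned
1.2TB NVMe device at 10 DWPD, fronting 64TB of disks); a quick
back-of-the-envelope check:

  # rated write budget of the cache device (figures from the thread)
  $ echo "1.2 * 10" | bc              # TB of writes per day at 10 DWPD
  12.0
  $ echo "scale=3; 12 / 64" | bc      # fraction of the 64TB backing store per day
  .187

Whether that budget is enough depends on the cluster's actual write rate
plus bcache's own writeback traffic on top of it.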
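Wido doesn't show the exact commands behind the S3700 layout, so here is a
minimal sketch of one way to reproduce it, assuming /dev/sdb is the SSD,
/dev/sdc is one of the 3TB spinners and there are six OSDs per host (the
device names, partition numbers and labels below are placeholders, not
taken from the post):

  # GPT label with 6 x 5GB journal partitions plus a ~200GB bcache partition
  sgdisk --zap-all /dev/sdb
  for i in 1 2 3 4 5 6; do
      sgdisk -n ${i}:0:+5G -c ${i}:"osd-journal-${i}" /dev/sdb
  done
  sgdisk -n 7:0:+200G -c 7:"bcache" /dev/sdb

  # create a cache set on the SSD partition and a backing device on the spinner
  # (udev normally registers them; otherwise echo the devices into
  #  /sys/fs/bcache/register)
  make-bcache -C /dev/sdb7
  make-bcache -B /dev/sdc

  # attach the backing device to the cache set; the cset UUID comes from
  # 'bcache-super-show /dev/sdb7'
  echo <cset-uuid> > /sys/block/bcache0/bcache/attach

  # the OSD data filesystem then goes on /dev/bcache0, with its journal
  # pointed at one of the 5GB SSD partitions

Note that bcache defaults to writethrough mode; writeback is what actually
absorbs the write spikes, at the cost of more SSD wear.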
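The apply_latency snippet is a cumulative perf counter, so the long-run
average is simply sum / avgcount; with the numbers above that works out to
roughly 76 ms per op (assuming the sum is in seconds, as these counters are
reported). One way to pull the counter yourself, with osd.60 as an example:

  # query the OSD's admin socket and pick out the counter
  ceph daemon osd.60 perf dump | python -m json.tool | grep -A 3 '"apply_latency"'

  # long-run average apply latency = sum / avgcount
  echo "scale=6; 226219.891559 / 2985023" | bc    # ~0.0758 s, i.e. ~76 ms per op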
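The ~60% cache-hit ratio is something bcache itself exports through sysfs,
per device and over several time windows; for example (bcache0 used as a
placeholder device):

  # cumulative hit ratio since the cache was created
  cat /sys/block/bcache0/bcache/stats_total/cache_hit_ratio
  # hit ratio over the last day / hour
  cat /sys/block/bcache0/bcache/stats_day/cache_hit_ratio
  cat /sys/block/bcache0/bcache/stats_hour/cache_hit_ratio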
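For reference, "max backfills and recovery active set to 1" corresponds to
the osd_max_backfills and osd_recovery_max_active options; a sketch of
setting them at runtime and persisting them (the injectargs invocation is
mine, not quoted from the post):

  # apply to all running OSDs without a restart
  ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'

  # and persist in ceph.conf under [osd]:
  #   osd max backfills = 1
  #   osd recovery max active = 1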
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com