On Tue, 6 Jun 2017 10:25:38 +0800 TYLin wrote:

> > On Jun 5, 2017, at 6:47 PM, Christian Balzer <chibi@xxxxxxx> wrote:
> >
> > Personally I avoid odd numbered releases, but my needs for stability
> > and low update frequency seem to be far off the scale for "normal" Ceph
> > users.
> >
> > W/o precise numbers of files and the size of your SSDs (which type?) it is
> > hard to say, but you're likely to be better off just having all metadata
> > on an SSD pool instead of cache-tiering.
> > 800MB/s sounds about right for your network and cluster in general (no
> > telling for sure w/o SSD/HDD details of course).
> >
> > As I pointed out before and will try to explain again below, that speed
> > difference, while pretty daunting, isn't all that surprising.
> >
>
> SSD: Intel S3520 240GB

At a theoretical maximum speed of 300MB/s per drive this explains your
800MB/s (in conjunction with your network).

These SSDs have an endurance of about 1 DWPD; I'd be monitoring them
closely for wear-out.

Christian

> HDD: WDC WD5003ABYZ-011FA0 500GB
> fio: bs=4m iodepth=32
> dd: bs=4m
> The test file is 20GB.
>
> > No, not quite. Re-read what I wrote, there's a difference between RADOS
> > object creation and actual data (contents).
> >
> > The devs or other people with more code familiarity will correct me, but
> > essentially as I understand it this happens when a new RADOS object gets
> > created in conjunction with a cache-tier:
> >
> > 1. Client (cephfs, rbd, whatever) talks to the cache-tier and the
> > transaction causes a new object to be created.
> > Since the tier is an overlay of the actual backing storage, the object
> > (but not necessarily the current data in it) needs to exist on both.
> > 2. Object gets created on backing storage which involves creating the
> > file (at zero length), any needed directories above and the entry in the
> > OMAP leveldb. All on HDDs, all slow.
> > I'm pretty sure this needs to be done and finished before the object is
> > usable, no journals to speed this up.
> > 3. Cache-tier pseudo-promotes the new object (it is empty after all) and
> > starts accepting writes.
> >
> > This is leaving out any metadata stuff CephFS needs to do for new "blocks"
> > and files, which may also be more involved than overwrites.
> >
> > Christian
>
> You make it clear to me! thanks! Really appreciate your kind explanation.
>
> Thanks,
> Ting Yi Lin

-- 
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Rakuten Communications
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
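
The wear-out concern raised in this thread can be put into rough numbers. The sketch below is only a back-of-the-envelope estimate using the figures quoted above (240GB capacity, about 300MB/s per drive, about 1 DWPD). The function names and the 100 GB/day workload are hypothetical, chosen for illustration, and the arithmetic ignores write amplification and any journal writes landing on the same SSD.

```python
# Rough SSD wear arithmetic based on the figures quoted in this thread.
CAPACITY_GB = 240        # Intel S3520 240GB, from the thread
WRITE_MB_PER_S = 300     # theoretical per-drive maximum, from the thread
RATED_DWPD = 1           # approximate endurance rating, from the thread

SECONDS_PER_DAY = 24 * 60 * 60


def seconds_per_drive_write(capacity_gb, write_mb_per_s):
    """Time to write the full capacity once (decimal units, 1 GB = 1000 MB)."""
    return capacity_gb * 1000 / write_mb_per_s


def sustained_dwpd(capacity_gb, write_mb_per_s):
    """Drive writes per day if the drive were written at full speed nonstop."""
    return SECONDS_PER_DAY / seconds_per_drive_write(capacity_gb, write_mb_per_s)


def daily_budget_fraction(written_gb_per_day, capacity_gb, rated_dwpd):
    """Fraction of the rated daily endurance used by a daily write volume."""
    return written_gb_per_day / (capacity_gb * rated_dwpd)


if __name__ == "__main__":
    t = seconds_per_drive_write(CAPACITY_GB, WRITE_MB_PER_S)
    d = sustained_dwpd(CAPACITY_GB, WRITE_MB_PER_S)
    print(f"one full drive write at full speed: {t:.0f} s")
    print(f"nonstop full-speed writing: {d:.0f} drive writes per day")
    # Hypothetical workload, not taken from the thread: 100 GB written per day.
    frac = daily_budget_fraction(100, CAPACITY_GB, RATED_DWPD)
    print(f"100 GB/day uses {frac:.0%} of the rated daily budget")
```

At full speed a drive of this size is overwritten in well under an hour, which is why a cache-tier SSD rated at about 1 DWPD is worth watching. Actual wear depends on how much of the day the drives spend writing, so the SMART wear indicators remain the thing to monitor.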