Hi Mark,

You may have missed this tunable: newstore_sync_wal_apply. It defaults to true, but it is better to set it to false. When sync_wal_apply is true, the WAL apply is done synchronously (in kv_sync_thread) instead of in the WAL thread. See:

    if (g_conf->newstore_sync_wal_apply) {
      _wal_apply(txc);
    } else {
      wal_wq.queue(txc);
    }

Setting this to false helps a lot in my setup. Everything else looks good.

Also, could you put the WAL in a different partition on the same SSD as the DB? Then from "iostat -p" we can identify how much is written to the DB and how much to the WAL. I am always seeing zero in my setup.

Xiaoxi.

> -----Original Message-----
> From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Mark Nelson
> Sent: Wednesday, April 29, 2015 9:09 PM
> To: kernel neophyte
> Cc: ceph-devel
> Subject: Re: newstore performance update
>
> Hi,
>
> ceph.conf file attached. It's a little ugly because I've been playing with
> various parameters. You'll probably want to enable "debug newstore = 30" if
> you plan to do any debugging. Also, the code has been changing quickly, so
> performance may have changed if you haven't tested within the last week.
>
> Mark
>
> On 04/28/2015 09:59 PM, kernel neophyte wrote:
> > Hi Mark,
> >
> > I am trying to measure 4k RW performance on Newstore, and I am not
> > anywhere close to the numbers you are getting!
> >
> > Could you share your ceph.conf for these tests?
> >
> > -Neo
> >
> > On Tue, Apr 28, 2015 at 5:07 PM, Mark Nelson <mnelson@xxxxxxxxxx> wrote:
> >> Nothing official, though roughly from memory:
> >>
> >> ~1.7GB/s and something crazy like 100K IOPS for the SSD.
> >>
> >> ~150MB/s and ~125-150 IOPS for the spinning disk.
> >>
> >> Mark
> >>
> >> On 04/28/2015 07:00 PM, Venkateswara Rao Jujjuri wrote:
> >>> Thanks for sharing; the newstore numbers look a lot better.
> >>>
> >>> Wondering if we have any baseline numbers to put things into perspective,
> >>> like what it is on XFS or on librados?
> >>>
> >>> JV
> >>>
> >>> On Tue, Apr 28, 2015 at 4:25 PM, Mark Nelson <mnelson@xxxxxxxxxx> wrote:
> >>>> Hi Guys,
> >>>>
> >>>> Sage has been furiously working away at fixing bugs in newstore and
> >>>> improving performance. Specifically, we've been focused on write
> >>>> performance, as newstore was previously lagging behind filestore by
> >>>> quite a bit. A lot of work has gone into implementing libaio behind
> >>>> the scenes, and as a result performance on spinning disks with SSD
> >>>> WAL (and SSD-backed rocksdb) has improved pretty dramatically. It's
> >>>> now often beating filestore:
> >>>>
> >>>> http://nhm.ceph.com/newstore/newstore-5d96fe6-no_overlay.pdf
> >>>>
> >>>> On the other hand, sequential writes are slower than random writes
> >>>> when the OSD, DB, and WAL are all on the same device, be it a
> >>>> spinning disk or an SSD. In this situation newstore does better with
> >>>> random writes and sometimes beats filestore (such as in the
> >>>> everything-on-spinning-disk tests, and when IO sizes are small in
> >>>> the everything-on-SSD tests).
> >>>>
> >>>> Newstore is changing daily, so keep in mind that these results are
> >>>> almost assuredly going to change. An interesting area of
> >>>> investigation will be why sequential writes are slower than random
> >>>> writes, and whether or not we are being limited by rocksdb ingest
> >>>> speed, and how.
> >>>>
> >>>> I've also uploaded a quick perf call-graph I grabbed during the "all-SSD"
> >>>> 32KB sequential write test to see if rocksdb was starving one of
> >>>> the cores, but found something that looks quite a bit different:
> >>>>
> >>>> http://nhm.ceph.com/newstore/newstore-5d96fe6-no_overlay.pdf
> >>>>
> >>>> Mark
> >>>> --
> >>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel"
> >>>> in the body of a message to majordomo@xxxxxxxxxxxxxxx
> >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
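
P.S. To make the "iostat -p" accounting concrete once DB and WAL sit on separate partitions: a minimal sketch of tallying sectors written per partition from /proc/diskstats (which is what iostat reads). The partition layout (DB on sdb1, WAL on sdb2) and the sample numbers below are hypothetical placeholders, not measurements from this thread.

```python
# Sketch: extract per-partition write volume from /proc/diskstats-format text.
# In the /proc/diskstats layout, field 3 (0-indexed: parts[2]) is the device
# name and the 10th stat field (parts[9]) is "sectors written".

def sectors_written(diskstats_text, partition):
    """Return sectors written for `partition`, or None if not found."""
    for line in diskstats_text.splitlines():
        parts = line.split()
        if len(parts) >= 10 and parts[2] == partition:
            return int(parts[9])
    return None

# Synthetic sample in /proc/diskstats format (hypothetical numbers):
# DB on sdb1, WAL on sdb2.
sample = """\
   8       17 sdb1 120 0 960 40 5000 0 400000 900 0 500 940
   8       18 sdb2 10 0 80 5 9000 0 720000 1200 0 600 1205
"""

# Sectors are 512 bytes each.
db_mb = sectors_written(sample, "sdb1") * 512 / 1e6
wal_mb = sectors_written(sample, "sdb2") * 512 / 1e6
print("DB writes:  %.1f MB" % db_mb)
print("WAL writes: %.1f MB" % wal_mb)
```

On a live system you would read open("/proc/diskstats").read() twice, a few seconds apart, and diff the counters; a WAL partition that never moves would confirm the zero-WAL-writes symptom described above.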