Hi Mark,

I was seeing 50%... Oh, right: I run with newstore_aio = false, so maybe AIO already exploits that parallelism. It's interesting: we have two ways to parallelize the I/Os here:

1. Sync I/O (likely using DIO if the request is aligned) spread over multiple WAL threads (newstore_aio = false, newstore_sync_wal_apply = false, newstore_wal_threads = N).
2. Async I/O issued by kv_sync_thread (newstore_aio = true, newstore_sync_wal_apply = true; newstore_wal_threads does not matter in this mode).

Do we have any prior knowledge about which approach is better on a given kind of device? I suspect AIO will be better on HDD, while sync I/O with multiple threads will be better on SSD.
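Just to be concrete, here is a rough ceph.conf sketch of the two modes (illustrative only: the thread count is a placeholder, and please double-check the option names against the current newstore branch):

    [osd]
        # Mode 1: sync I/O (DIO when aligned), fanned out over N WAL threads
        newstore_aio = false
        newstore_sync_wal_apply = false
        newstore_wal_threads = 8          # the "N" above; 8 is a placeholder, tune per device

        # Mode 2: async I/O submitted from kv_sync_thread
        # (use these instead of the Mode 1 settings above)
        #newstore_aio = true
        #newstore_sync_wal_apply = true
        # newstore_wal_threads has no effect in this mode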
Xiaoxi

> -----Original Message-----
> From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Mark Nelson
> Sent: Thursday, April 30, 2015 3:06 AM
> To: Chen, Xiaoxi; kernel neophyte
> Cc: ceph-devel
> Subject: Re: newstore performance update
>
> Hi Xiaoxi,
>
> I just tried setting newstore_sync_wal_apply to false, but it seemed to make
> very little difference for me. How much improvement were you seeing with it?
>
> Mark
>
> On 04/29/2015 10:55 AM, Chen, Xiaoxi wrote:
> > Hi Mark,
> >     You may have missed this tunable: newstore_sync_wal_apply. It defaults to
> > true, but it is better to set it to false. If sync_wal_apply is true, the WAL apply
> > is done synchronously (in kv_sync_thread) instead of in the WAL threads. See:
> >
> >     if (g_conf->newstore_sync_wal_apply) {
> >       _wal_apply(txc);
> >     } else {
> >       wal_wq.queue(txc);
> >     }
> >
> > Setting this to false helps a lot in my setup. Everything else looks good.
> >
> > Also, could you put the WAL in a different partition on the same SSD as the DB?
> > Then, from iostat -p, we can see how much is written to the DB and how much to
> > the WAL. I am always seeing zero in my setup.
> >
> >     Xiaoxi.
> >
> >> -----Original Message-----
> >> From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Mark Nelson
> >> Sent: Wednesday, April 29, 2015 9:09 PM
> >> To: kernel neophyte
> >> Cc: ceph-devel
> >> Subject: Re: newstore performance update
> >>
> >> Hi,
> >>
> >> ceph.conf file attached. It's a little ugly because I've been
> >> playing with various parameters. You'll probably want to enable
> >> debug newstore = 30 if you plan to do any debugging. Also, the code
> >> has been changing quickly, so performance may have changed if you
> >> haven't tested within the last week.
> >>
> >> Mark
> >>
> >> On 04/28/2015 09:59 PM, kernel neophyte wrote:
> >>> Hi Mark,
> >>>
> >>> I am trying to measure 4k RW performance on Newstore, and I am not
> >>> anywhere close to the numbers you are getting!
> >>>
> >>> Could you share your ceph.conf for these tests?
> >>>
> >>> -Neo
> >>>
> >>> On Tue, Apr 28, 2015 at 5:07 PM, Mark Nelson <mnelson@xxxxxxxxxx> wrote:
> >>>> Nothing official, though roughly from memory:
> >>>>
> >>>> ~1.7GB/s and something crazy like 100K IOPS for the SSD.
> >>>>
> >>>> ~150MB/s and ~125-150 IOPS for the spinning disk.
> >>>>
> >>>> Mark
> >>>>
> >>>> On 04/28/2015 07:00 PM, Venkateswara Rao Jujjuri wrote:
> >>>>> Thanks for sharing; the newstore numbers look a lot better.
> >>>>>
> >>>>> Wondering if we have any baseline numbers to put things into perspective,
> >>>>> like what it is on XFS or on librados?
> >>>>>
> >>>>> JV
> >>>>>
> >>>>> On Tue, Apr 28, 2015 at 4:25 PM, Mark Nelson <mnelson@xxxxxxxxxx> wrote:
> >>>>>> Hi Guys,
> >>>>>>
> >>>>>> Sage has been furiously working away at fixing bugs in newstore
> >>>>>> and improving performance. Specifically, we've been focused on
> >>>>>> write performance, as newstore was previously lagging filestore
> >>>>>> by quite a bit. A lot of work has gone into implementing libaio
> >>>>>> behind the scenes, and as a result performance on spinning disks
> >>>>>> with an SSD WAL (and SSD-backed rocksdb) has improved pretty
> >>>>>> dramatically. It's now often beating filestore:
> >>>>>>
> >>>>>> http://nhm.ceph.com/newstore/newstore-5d96fe6-no_overlay.pdf
> >>>>>>
> >>>>>> On the other hand, sequential writes are slower than random
> >>>>>> writes when the OSD, DB, and WAL are all on the same device, be it
> >>>>>> a spinning disk or an SSD. In this situation newstore does better
> >>>>>> with random writes and sometimes beats filestore (such as in the
> >>>>>> everything-on-spinning-disk tests, and when IO sizes are small in
> >>>>>> the everything-on-SSD tests).
> >>>>>>
> >>>>>> Newstore is changing daily, so keep in mind that these results are
> >>>>>> almost assuredly going to change. An interesting area of
> >>>>>> investigation will be why sequential writes are slower than
> >>>>>> random writes, and whether or not we are being limited by rocksdb
> >>>>>> ingest speed, and how.
> >>>>>>
> >>>>>> I've also uploaded a quick perf call graph I grabbed during the
> >>>>>> "all-SSD" 32KB sequential write test to see if rocksdb was starving
> >>>>>> one of the cores, but found something that looks quite a bit different:
> >>>>>>
> >>>>>> http://nhm.ceph.com/newstore/newstore-5d96fe6-no_overlay.pdf
> >>>>>>
> >>>>>> Mark