RE: newstore performance update


Hi Mark,
       You may have missed this tunable: newstore_sync_wal_apply, which defaults to true but is better set to false.
       If sync_wal_apply is true, the WAL is applied synchronously (in kv_sync_thread) instead of in the WAL thread. See
	if (g_conf->newstore_sync_wal_apply) {
	  _wal_apply(txc);
	} else {
	  wal_wq.queue(txc);
	}
        Setting this to false helps a lot in my setup. Everything else looks good.
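
For reference, this could look like the fragment below in ceph.conf. Only newstore_sync_wal_apply itself comes from the code above; the [osd] section placement is my assumption:

```ini
[osd]
# Queue WAL apply to the WAL thread pool instead of applying it
# synchronously in kv_sync_thread (default is true).
newstore_sync_wal_apply = false
```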

         Also, could you put the WAL in a different partition on the same SSD as the DB? Then, from iostat -p, we can identify how many writes go to the DB and how many go to the WAL. I am always seeing zero in my setup.
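
As a sketch of the kind of check I mean (the device names sdb1/sdb2 are hypothetical; on Linux, iostat -p reports the same per-partition counters that the kernel exposes in /proc/diskstats):

```shell
# Per-partition view with sysstat, e.g. DB on sdb1 and WAL on sdb2:
#   iostat -p sdb 1
# The same write counters can be read straight from /proc/diskstats,
# where field 3 is the device name and field 10 is sectors written:
awk '{ printf "%-10s %12d sectors written\n", $3, $10 }' /proc/diskstats
```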

												Xiaoxi.

> -----Original Message-----
> From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-
> owner@xxxxxxxxxxxxxxx] On Behalf Of Mark Nelson
> Sent: Wednesday, April 29, 2015 9:09 PM
> To: kernel neophyte
> Cc: ceph-devel
> Subject: Re: newstore performance update
> 
> Hi,
> 
> ceph.conf file attached.  It's a little ugly because I've been playing with
> various parameters.  You'll probably want to enable debug newstore = 30 if
> you plan to do any debugging.  Also, the code has been changing quickly so
> performance may have changed if you haven't tested within the last week.
> 
> Mark
> 
> On 04/28/2015 09:59 PM, kernel neophyte wrote:
> > Hi Mark,
> >
> > I am trying to measure 4k RW performance on Newstore, and I am not
> > anywhere close to the numbers you are getting!
> >
> > Could you share your ceph.conf for these test ?
> >
> > -Neo
> >
> > On Tue, Apr 28, 2015 at 5:07 PM, Mark Nelson <mnelson@xxxxxxxxxx>
> wrote:
> >> Nothing official, though roughly from memory:
> >>
> >> ~1.7GB/s and something crazy like 100K IOPS for the SSD.
> >>
> >> ~150MB/s and ~125-150 IOPS for the spinning disk.
> >>
> >> Mark
> >>
> >>
> >> On 04/28/2015 07:00 PM, Venkateswara Rao Jujjuri wrote:
> >>>
> >>> Thanks for sharing; newstore numbers look a lot better;
> >>>
> >>> Wondering if we have any baseline numbers to put things into
> >>> perspective, like what it is on XFS or on librados?
> >>>
> >>> JV
> >>>
> >>> On Tue, Apr 28, 2015 at 4:25 PM, Mark Nelson <mnelson@xxxxxxxxxx>
> wrote:
> >>>>
> >>>> Hi Guys,
> >>>>
> >>>> Sage has been furiously working away at fixing bugs in newstore and
> >>>> improving performance.  Specifically we've been focused on write
> >>>> performance as newstore was lagging filestore by quite a bit
> >>>> previously.  A lot of work has gone into implementing libaio behind
> >>>> the scenes and as a result performance on spinning disks with SSD
> >>>> WAL (and SSD backed rocksdb) has improved pretty dramatically. It's
> >>>> now often beating filestore:
> >>>>
> >>>> http://nhm.ceph.com/newstore/newstore-5d96fe6-no_overlay.pdf
> >>>>
> >>>> On the other hand, sequential writes are slower than random writes
> >>>> when the OSD, DB, and WAL are all on the same device, be it a
> >>>> spinning disk or SSD.
> >>>> In this situation newstore does better with random writes and
> >>>> sometimes beats filestore (such as in the everything-on-spinning
> >>>> disk tests, and when IO sizes are small in the everything-on-ssd
> >>>> tests).
> >>>>
> >>>> Newstore is changing daily so keep in mind that these results are
> >>>> almost assuredly going to change.  An interesting area of
> >>>> investigation will be why sequential writes are slower than random
> >>>> writes, and whether or not we are being limited by rocksdb ingest
> >>>> speed and how.
> >>>>
> >>>> I've also uploaded a quick perf call-graph I grabbed during the "all-SSD"
> >>>> 32KB sequential write test to see if rocksdb was starving one of
> >>>> the cores, but found something that looks quite a bit different:
> >>>>
> >>>> http://nhm.ceph.com/newstore/newstore-5d96fe6-no_overlay.pdf
> >>>>
> >>>> Mark
> >>>> --
> >>>> To unsubscribe from this list: send the line "unsubscribe
> >>>> ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx
> >>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>>
> >>>
> >>>
> >>>
> >