By default, "rados bench" creates a new object for each write. This is much slower than writing directly to a preconditioned RBD image (one whose objects have already been written).

-- 
Piotr Dałek
piotr.dalek@xxxxxxxxxxxx
https://ovhcloud.com/

-----Original Message-----
From: ceph-devel-owner@xxxxxxxxxxxxxxx <ceph-devel-owner@xxxxxxxxxxxxxxx> On Behalf Of Nick Fisk
Sent: Thursday, October 11, 2018 2:14 PM
To: 'Sage Weil' <sage@xxxxxxxxxxxx>
Cc: ceph-devel@xxxxxxxxxxxxxxx
Subject: RE: Bluestore deferred writes for new objects

> -----Original Message-----
> From: ceph-devel-owner@xxxxxxxxxxxxxxx
> [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Sage Weil
> Sent: 10 October 2018 23:37
> To: Nick Fisk <nick@xxxxxxxxxx>
> Cc: ceph-devel@xxxxxxxxxxxxxxx
> Subject: Re: Bluestore deferred writes for new objects
>
> On Wed, 10 Oct 2018, Nick Fisk wrote:
> > Following up from a discussion on the performance call last week.
> >
> > Is anybody able to confirm the behaviour of Bluestore deferred writes with new objects?
> >
> > From my testing it appears that new objects are always written directly
> > to the underlying block device and not buffered into flash, whereas existing objects <64KB are.
>
> I just did a quick test (on master) and it looks like the deferred
> writes are working as expected in that they *do* apply to new objects
> (as well as existing ones). Can you be a bit more specific about what
> you observed? (which version? what workload?)
>

This is on Mimic 13.2.2, 7.2k disks with SSD for DB. Three observations I have made:

1. RADOS bench doing QD=1 4k objects is a lot slower (about 10x) than writing with FIO (directio) QD=1 4 KB IOs to a fully thickened RBD.
2. RADOS bench seems to increment the bluestore_write_small_new counter, whereas the fio test increments bluestore_write_small_deferred. However, deferred_write_ops looks like it increases in both cases.
3. Compared to Filestore, the RADOS bench test is also slower (again about 10x).

> Thanks!
> sage
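
For anyone wanting to repeat the counter comparison from observation 2, here is a minimal sketch that diffs two OSD perf counter snapshots taken before and after a test run. It assumes the snapshots are the parsed JSON output of "ceph daemon osd.<id> perf dump" and that the three counters named in the thread sit under a "bluestore" section; that layout is an assumption and may differ between releases. The function name counter_deltas and the sample numbers are illustrative only, not measurements from this thread.

```python
import json

# Counter names are the ones mentioned in the thread.
# ASSUMPTION: they live under a "bluestore" section of the perf dump.
COUNTERS = (
    "bluestore_write_small_new",
    "bluestore_write_small_deferred",
    "deferred_write_ops",
)


def counter_deltas(before, after, section="bluestore", names=COUNTERS):
    """Return {name: after - before} for counters present in both snapshots.

    `before` and `after` are dicts parsed from perf dump JSON.
    Counters missing from either snapshot are skipped.
    """
    b = before.get(section, {})
    a = after.get(section, {})
    return {n: a[n] - b[n] for n in names if n in a and n in b}


if __name__ == "__main__":
    # Made-up sample snapshots, for illustration only.
    before = json.loads(
        '{"bluestore": {"bluestore_write_small_new": 100,'
        ' "bluestore_write_small_deferred": 40, "deferred_write_ops": 140}}'
    )
    after = json.loads(
        '{"bluestore": {"bluestore_write_small_new": 1100,'
        ' "bluestore_write_small_deferred": 40, "deferred_write_ops": 1140}}'
    )
    for name, delta in counter_deltas(before, after).items():
        print(f"{name}: +{delta}")
```

Taking one snapshot pair around the rados bench run and another around the fio run would show which of the counters moves in each case.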