On 10/11/2018 07:14 AM, Nick Fisk wrote:
-----Original Message-----
From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Sage Weil
Sent: 10 October 2018 23:37
To: Nick Fisk <nick@xxxxxxxxxx>
Cc: ceph-devel@xxxxxxxxxxxxxxx
Subject: Re: Bluestore deferred writes for new objects
On Wed, 10 Oct 2018, Nick Fisk wrote:
Following up from a discussion on the performance call last week.
Is anybody able to confirm the behaviour of Bluestore deferred writes with new objects?
From my testing it appears that new objects are always written directly
to the underlying block device and not buffered into flash, whereas writes to existing objects <64KB are.
I just did a quick test (on master) and it looks like the deferred writes are working as expected in that they *do* apply to new objects
(as well as existing ones). Can you be a bit more specific about what you observed? (which version? what workload?)
This is on Mimic 13.2.2, 7.2k disks with SSD for DB. Three observations I have made:
1. RADOS bench doing QD=1 4k objects is a lot slower (about 10x) than writing with FIO (directio) QD=1 4KB IOs to a fully thickened RBD.
2. RADOS bench seems to increment the bluestore_write_small_new counter, whereas the fio test increments bluestore_write_small_deferred, although deferred_write_ops looks like it increases in both cases.
3. Compared to Filestore the RADOS bench test is also slower (again about 10x).
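
One way to make the counter comparison in observation 2 repeatable is to snapshot the OSD perf counters before and after each run and diff them. The following is only a minimal sketch: the counter names are the ones mentioned above, but the assumption that they sit under a "bluestore" section of the perf dump JSON, and the helper name bluestore_counter_deltas, are illustrative rather than confirmed in this thread.

```python
import json

# Counter names as mentioned in this thread. Their placement under a
# "bluestore" section of the perf dump JSON is an assumption.
COUNTERS = (
    "bluestore_write_small_new",
    "bluestore_write_small_deferred",
    "deferred_write_ops",
)


def bluestore_counter_deltas(before, after, section="bluestore"):
    """Return how much each counter grew between two perf dump snapshots.

    `before` and `after` are dicts parsed from perf dump JSON. A counter
    missing from a snapshot is treated as 0.
    """
    deltas = {}
    for name in COUNTERS:
        start = before.get(section, {}).get(name, 0)
        end = after.get(section, {}).get(name, 0)
        deltas[name] = end - start
    return deltas


if __name__ == "__main__":
    # Invented numbers, for illustration only; not measurements.
    before = json.loads('{"bluestore": {"bluestore_write_small_new": 10}}')
    after = json.loads('{"bluestore": {"bluestore_write_small_new": 25}}')
    print(bluestore_counter_deltas(before, after))
```

The snapshots could be captured with something like `ceph daemon osd.<id> perf dump` immediately before and after the rados bench run and again around the fio run, so the two workloads can be compared on the same OSD.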
I've noticed slower write behavior when creating objects (RBD prefill
and rados) than overwrites to existing RBD objects. It wasn't anywhere
near 10x, but I wasn't focusing on the QD=1 use case either. I've got a
pile of things I need to work on, but I don't want to lose this one
because this is an important case to track down. I want to try to
replicate it in-house next week.
Nick, can you send me the rados bench and fio cmdline you are using? I
imagine so long as it's low QD and object creates vs RBD overwrites it
should be pretty obvious, but the exact invocations wouldn't hurt to have.
Mark