Hi Ilya,
On 2019-01-18 17:29, Ilya Dryomov wrote:
On Fri, Jan 18, 2019 at 3:56 PM Roman Penyaev <rpenyaev@xxxxxxx> wrote:
Hi all,
This is an attempt to split DISCARD and WRITE_ZEROES paths on krbd side
when REQ_NOUNMAP flag is set for a block layer request.
Hi Roman,
I'm working on splitting DISCARD and WRITE_ZEROES handling right now.
The idea is to punt on small and/or unaligned discard requests which
don't actually free up any space but translate into a RADOS zero op.
I'm not changing how WRITE_ZEROES is implemented though, so this is
orthogonal to your work -- just wanted to give a heads up.
Good to know, thanks for telling me.
Currently both REQ_OP_DISCARD and REQ_OP_WRITE_ZEROES block layer requests
fall down to CEPH_OSD_OP_ZERO request, which punches holes on osd side.
With a new CEPH_OSD_OP_FLAG_ZERO_NOUNMAP flag for CEPH_OSD_OP_ZERO request
osd can zero out blocks, instead of punching holes.
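
To make the proposed split concrete, here is a rough userspace model of
the mapping described above. It is only a sketch: the enum, the struct,
the function name and the flag's bit value are all made up for
illustration and are not the kernel's or Ceph's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the two block layer ops (not the kernel enums). */
enum blk_op { BLK_OP_DISCARD, BLK_OP_WRITE_ZEROES };

/* Placeholder bit: the real value of the proposed flag is not given here. */
#define SKETCH_OSD_OP_FLAG_ZERO_NOUNMAP (1u << 0)

/* Hypothetical model of a CEPH_OSD_OP_ZERO request: only its flags. */
struct osd_zero_req {
    uint32_t flags;
};

/*
 * Both ops still map to a zero request. Only WRITE_ZEROES with the
 * no-unmap hint set asks the OSD to write zeroes instead of punching
 * a hole; everything else keeps today's hole-punching behaviour.
 */
static struct osd_zero_req map_to_osd_zero(enum blk_op op, bool req_nounmap)
{
    struct osd_zero_req r = { .flags = 0 };

    if (op == BLK_OP_WRITE_ZEROES && req_nounmap)
        r.flags |= SKETCH_OSD_OP_FLAG_ZERO_NOUNMAP;
    return r;
}
```

In this model the hint is ignored for DISCARD, since REQ_NOUNMAP only has
a meaning for WRITE_ZEROES requests.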
REQ_NOUNMAP is just a hint, the block device is free to ignore it.
IIRC the only way to control it from userspace is through fallocate(2):
FALLOC_FL_PUNCH_HOLE can unmap, while FALLOC_FL_ZERO_RANGE is supposed
to not unmap. Given that fallocate(2) on block devices is fairly new,
I'm curious if you have an application that actually cares in mind?
No, no. This is an attempt to follow block layer semantics, nothing
more.
Indeed, the users of REQ_NOUNMAP are ioctl() and fallocate(), so the only
practical value which comes to mind is performance (preallocate zeroed
blocks and format any fs, etc) and possible secure-erase. After some
internal discussions about the performance of writing zeroes (instead of
a true DISCARD), this does not seem to bring any value, at least on
bluestore, but a secure wipe can make sense (for example using
blkdiscard --zeroout).
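
For reference, the two fallocate(2) modes discussed above can be
exercised from userspace roughly as follows. This is a minimal sketch:
the wrapper names are mine, and whether the device actually keeps the
blocks mapped is up to the driver, since REQ_NOUNMAP is only a hint.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <sys/types.h>

/*
 * Zero a range without changing the size. On a block device this is the
 * mode that is supposed to not unmap. Returns 0 or a negative errno.
 */
static int zero_range_keep_mapped(int fd, off_t offset, off_t len)
{
    if (fallocate(fd, FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE,
                  offset, len) < 0)
        return -errno;
    return 0;
}

/*
 * Punch a hole in a range. This mode is allowed to unmap, and must be
 * combined with FALLOC_FL_KEEP_SIZE. Returns 0 or a negative errno.
 */
static int punch_hole(int fd, off_t offset, off_t len)
{
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  offset, len) < 0)
        return -errno;
    return 0;
}
```

Both helpers expect an fd opened for writing, e.g. on the rbd block
device, and pass the kernel's error back as a negative errno.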
--
Roman