On Thu, Jan 24, 2019 at 5:45 PM David Disseldorp <ddiss@xxxxxxx> wrote:
>
> On Mon, 21 Jan 2019 08:58:22 -0500, Jason Dillaman wrote:
>
> > > Indeed, the users of REQ_NOUNMAP are ioctl() and fallocate(), so the
> > > only practical value which comes to mind is performance (preallocate
> > > zeroed blocks and format any fs, etc) and possible secure-erase.
> > > After some internal discussions about performance of writing zeroes
> > > (instead of true DISCARD) this does not seem to bring any value, at
> > > least on bluestore, but secure wipe can make sense (for example
> > > using blkdiscard --zeroout).
>
> Another possible use case could be space reservations for thin (over)
> provisioned storage. E.g. I don't have anything to write now, but want
> to make sure that the array won't reject writes to this region in
> future.

I think you might be conflating REQ_NOUNMAP with the UNMAP flag to
WRITE SAME (and the way zeroout is implemented for SCSI). As I
understand it, REQ_NOUNMAP means "try to avoid deallocating when
zeroing", definitely not "allocate if not yet allocated".

> > The zeroed writes would need to be smaller than the bluestore min
> > alloc size for that to work. Otherwise, bluestore will just allocate a
> > new blob extent, write zeroes to it, and pivot the object metadata to
> > point to the new allocation.
>
> I think we'll need some form of only-ack-when-the-previous-data-is-gone
> guarantee from bluestore in future, at least if we want to work towards
> supporting things like REQ_OP_SECURE_ERASE on the client side.

bluestore will also need to be taught to do BLKSECDISCARD instead of
regular BLKDISCARD, propagate it through the kv store, etc. We will
also need to know whether secure erase is supported by all underlying
devices so that we can set QUEUE_FLAG_SECERASE at "rbd map" time and
then deal with the fact that it might change over time when OSDs get
replaced, etc.

zeroout and secure erase are two very different things...

Thanks,

                Ilya
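
To make the REQ_NOUNMAP point concrete, here is a rough toy model in Python. It is only a sketch of one behaviour consistent with the semantics as I understand them, not how the kernel, SCSI targets or bluestore implement anything. ThinDevice and all of its methods are made-up names for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ThinDevice:
    """Hypothetical thin-provisioned device: only allocated blocks use space."""

    # Maps block number -> payload. A missing key means "not allocated".
    blocks: dict = field(default_factory=dict)

    def write(self, blk: int, data: bytes) -> None:
        # A regular write always allocates the block.
        self.blocks[blk] = data

    def read(self, blk: int) -> bytes:
        # Assumption: unallocated blocks read back as zeroes.
        return self.blocks.get(blk, b"\0")

    def allocated(self, blk: int) -> bool:
        return blk in self.blocks

    def discard(self, blk: int) -> None:
        # Plain discard: drop the allocation.
        self.blocks.pop(blk, None)

    def zeroout(self, blk: int, nounmap: bool = False) -> None:
        if blk not in self.blocks:
            # Already reads as zeroes. In this model "nounmap" does not
            # force an allocation, so it is no space reservation.
            return
        if nounmap:
            # Avoid deallocating: keep the block and overwrite it with zeroes.
            self.blocks[blk] = b"\0"
        else:
            # Without the hint the device is free to deallocate instead.
            self.blocks.pop(blk)
```

In this model, zeroout(blk, nounmap=True) on a written block leaves it allocated and zero-filled, while the same call on a never-written block leaves it unallocated, so a later write there could still fail on an overcommitted array. Note that none of this says anything about whether the old data is physically gone, which is the separate secure erase question.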