Re: [RFC 0/2] rbd: respect REQ_NOUNMAP by setting new nounmap flag for ZERO op

On Thu, Jan 24, 2019 at 5:45 PM David Disseldorp <ddiss@xxxxxxx> wrote:
>
> On Mon, 21 Jan 2019 08:58:22 -0500, Jason Dillaman wrote:
>
> > > Indeed, the users of REQ_NOUNMAP are ioctl() and fallocate(), so the
> > > only practical value which comes to mind is performance (preallocate
> > > zeroed blocks and format any fs, etc) and possible secure-erase.  After
> > > some internal discussions about performance of writing zeroes (instead
> > > of true DISCARD) this does not seem to bring any value, at least on
> > > bluestore, but secure wipe can make sense (for example using
> > > blkdiscard --zeroout).
>
> Another possible use case could be space reservations for thin (over)
> provisioned storage. E.g. I don't have anything to write now, but want
> to make sure that the array won't reject writes to this region in
> future.

I think you might be conflating REQ_NOUNMAP with the UNMAP flag of WRITE
SAME (and the way zeroout is implemented for SCSI).  As I understand it,
REQ_NOUNMAP means "try to avoid deallocating when zeroing", definitely
not "allocate if not yet allocated".
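
To make the distinction concrete, here is a toy model of that reading.
It is a sketch only, not kernel or rbd code: the ThinDevice class and
its methods are made up for illustration, and it assumes that
unallocated blocks read back as zeroes.

```python
from dataclasses import dataclass, field


@dataclass
class ThinDevice:
    """Hypothetical thin-provisioned device: only allocated blocks are stored."""
    allocated: dict = field(default_factory=dict)

    def write(self, block, data):
        # Writing real data allocates the block.
        self.allocated[block] = data

    def read(self, block):
        # Assumption of this model: unallocated blocks read as zero.
        return self.allocated.get(block, 0)

    def zeroout(self, start, count, nounmap):
        for block in range(start, start + count):
            if block not in self.allocated:
                # Already reads as zero. The nounmap hint does not
                # turn this into a space reservation.
                continue
            if nounmap:
                # Keep the allocation, overwrite it with zeroes.
                self.allocated[block] = 0
            else:
                # Without the hint the device may deallocate.
                del self.allocated[block]
```

In this model, zeroing blocks 0..3 with nounmap=True on a device where
only block 1 was written leaves exactly block 1 allocated; the other
three blocks are not reserved by the operation.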

>
> > The zeroed writes would need to be smaller than the bluestore min
> > alloc size for that to work. Otherwise, bluestore will just allocate a
> > new blob extent, write zeroes to it, and pivot the object metadata to
> > point to the new allocation.
>
> I think we'll need some form of only-ack-when-the-previous-data-is-gone
> guarantee from bluestore in future, at least if we want to work towards
> supporting things like REQ_OP_SECURE_ERASE on the client side.

bluestore will also need to be taught to do BLKSECDISCARD instead of
regular BLKDISCARD, propagate it through the kv store, etc.  We will
also need to know whether secure erase is supported by all underlying
devices so that we can set QUEUE_FLAG_SECERASE at "rbd map" time and
then deal with the fact that it might change over time when OSDs get
replaced, etc.
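
As a rough sketch of the "supported by all underlying devices" problem
(hypothetical names throughout; no such Ceph or rbd interface exists
today, and the per-OSD capability map is an assumption):

```python
def can_advertise_secure_erase(osd_supports_secure_erase):
    """Return True only if every OSD's backing device supports secure erase.

    osd_supports_secure_erase is a hypothetical mapping of OSD name to
    a boolean capability; an empty cluster never qualifies.
    """
    if not osd_supports_secure_erase:
        return False
    return all(osd_supports_secure_erase.values())


# Snapshot taken at "rbd map" time: every device qualifies.
cluster = {"osd.0": True, "osd.1": True}
flag_at_map_time = can_advertise_secure_erase(cluster)

# Later, osd.1 is replaced by a device without secure erase support.
cluster["osd.1"] = False
flag_after_replacement = can_advertise_secure_erase(cluster)
```

The value computed at map time goes stale as soon as a single OSD is
replaced, which is the part that would need handling on the client.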

zeroout and secure erase are two very different things...

Thanks,

                Ilya


