Re: rbd ssd pool for (windows) vms


 



On Mon, 6 May 2019 at 10:03, Marc Roos <M.Roos@xxxxxxxxxxxxxxxxx> wrote:
 
Yes, but those 'changes' can be relayed via the kernel rbd driver, can't they?
Besides, I don't think you can move an rbd block device that is in use to a
different pool anyway.


No, but you can move the whole pool, which takes all RBD images with it.
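
For reference, "moving the pool" in that sense usually means assigning it a different
CRUSH rule, so the data migrates while the RBD images stay in place logically. A rough
sketch (the rule and pool names are just examples, and it assumes the OSDs already
report the ssd device class):

    # create a replicated rule restricted to the ssd device class
    ceph osd crush rule create-replicated replicated_ssd default host ssd
    # point the pool at the new rule; ceph rebalances the data onto the SSD OSDs
    ceph osd pool set rbd crush_rule replicated_ssd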
 
On the manual page [0] nothing is mentioned about configuration settings
needed for rbd use, nor for ssd. The example also uses virtio/vda, while I
learned here that you should use virtio-scsi/sda.


There are differences here. One is "tell the guest to use virtio-scsi", which makes it
possible for TRIM from the guest to reclaim space on thin-provisioned RBD images,
and that is probably a good thing.
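
To make that concrete, a libvirt disk definition for an RBD image on virtio-scsi with
discard enabled could look roughly like this (hostnames, pool/image name and the auth
secret are placeholders, not taken from this thread):

    <controller type='scsi' model='virtio-scsi'/>
    <disk type='network' device='disk'>
      <!-- discard='unmap' lets guest TRIM/UNMAP reach the RBD layer -->
      <driver name='qemu' type='raw' cache='writeback' discard='unmap'/>
      <source protocol='rbd' name='rbd/vm-disk-1'>
        <host name='mon1.example.com' port='6789'/>
      </source>
      <auth username='libvirt'>
        <secret type='ceph' uuid='...'/>
      </auth>
      <target dev='sda' bus='scsi'/>
    </disk>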

That doesn't mean the guest's TRIM commands will pass through to the storage sectors of the OSD devices underneath the pool.

So in that regard you don't gain anything directly on the end devices by letting a guest know
whether it currently sits on SSDs or HDDs, because the guest will not be sending SSD
commands to the real device. Conversely, the TRIMs sent from a guest would allow re-thinning
on an HDD pool as well, since it is not a property of the underlying devices, but rather of the
ceph code and the pool/rbd settings, which are the same regardless.
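
You can see that in practice by trimming inside the guest and then checking image usage
on the ceph side; rbd du reports provisioned versus actually used space regardless of
whether the pool sits on HDDs or SSDs (the image name is just an example):

    # inside the guest
    fstrim -v /
    # on a ceph client/admin node
    rbd du rbd/vm-disk-1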

Also, if the guest makes other decisions based on whether there is HDD or SSD underneath,
those decisions can be wrong both ways, like "I was told it's hdd and therefore I assume
only X iops are possible", while the kvm librbd layer can cache tons of things for you,
and filestore OSD RAM caches could give you RAM-like write performance never seen
on normal HDDs (at the risk of data loss in the worst case).
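
The librbd cache mentioned above is configured on the client side; a [client] section in
ceph.conf could look something like this (the sizes are example values, not recommendations):

    [client]
    rbd cache = true
    rbd cache size = 67108864                 # 64 MB cache per image
    rbd cache max dirty = 50331648            # allow up to 48 MB of dirty writeback data
    rbd cache writethrough until flush = true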

On the other hand, being told as a guest that there is SSD or NVMe underneath
and deciding that 100k iops should be the norm from now on would be equally wrong in the
other direction if the ceph network between the guest and the OSDs prevents you from
doing more than 1k iops.

If you find that there is no reliable way to tell a guest where it is really stored,
that may actually be a conscious decision that is for the best. Let the guest issue as
much IO as it thinks it needs and get the results when they are ready.
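
If you want real numbers instead of guesses based on the advertised device type, measuring
from inside the guest is the honest approach, e.g. with fio (path and sizes are just examples):

    fio --name=randwrite --filename=/var/tmp/fio.test --size=1G \
        --ioengine=libaio --direct=1 --rw=randwrite --bs=4k \
        --iodepth=32 --runtime=30 --time_based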

(A nod to Xen, which likes to tell its guests they are on IDE drives, so guests never send out
more than one IO request at a time because IDE just doesn't have that concept, regardless
of how fancy a host you have with super-deep request queues and all... 8-/ )

-- 
May the most significant bit of your life be positive.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
