Re: Ceph with Cache pool - disk usage / cleanup

Hi,


On 09/29/2016 02:52 PM, Sascha Vogt wrote:
Hi,

On 29.09.2016 at 13:45, Burkhard Linke wrote:
On 09/29/2016 01:34 PM, Sascha Vogt wrote:
We have a huge number of short-lived VMs which are deleted before they
are even flushed to the backing pool. Might this be the reason that
Ceph doesn't handle this particular case well? E.g. when deleting an
object / RBD image which has not been flushed, the "deletion
mechanism" only deletes what's in the backing pool, and if there is
nothing there it skips deleting the marker files in the cache pool?
You should be able to validate this. Create a new rbd in the pool, map
it, write some data to it (few MB should be sufficient), note its rbd
prefix (rbd info <rbd>), and remove the rbd.

Then check whether objects with the prefix exist in the cache pool or
the backend pool. If such objects exist, try to flush/evict them manually
(rados cache-flush / cache-evict) and check whether the objects are still
present in the pools.
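
The check described above can be sketched roughly as follows. This is only an illustration: it assumes the output of "rbd info <rbd>" contains a "block_name_prefix:" line and that object names have been collected with "rados -p <pool> ls". The helper names and the sample data are hypothetical.

```python
def block_name_prefix(rbd_info_output: str) -> str:
    """Return the value of the block_name_prefix line from `rbd info` output."""
    for line in rbd_info_output.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key.strip() == "block_name_prefix":
            return value.strip()
    raise ValueError("no block_name_prefix line found")


def objects_with_prefix(object_names, prefix):
    """Keep only the object names that start with the given prefix."""
    return [name for name in object_names if name.startswith(prefix)]


# Hypothetical sample of `rbd info` output; the image name and id are made up.
sample_info = """rbd image 'testimg':
    size 64 MB in 16 objects
    order 22 (4096 kB objects)
    block_name_prefix: rbd_data.1234abcd
    format: 2
"""

# Hypothetical listing, as it might come from `rados -p <pool> ls`.
sample_listing = [
    "rbd_data.1234abcd.0000000000000000",
    "rbd_data.ffff0000.0000000000000001",
]

prefix = block_name_prefix(sample_info)
leftovers = objects_with_prefix(sample_listing, prefix)
print(prefix, leftovers)
```

Running the filter once against the cache pool listing and once against the backing pool listing, after the rbd has been removed, shows which pool still holds objects for the deleted image.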
It took a while, but a lot of the numbers we're seeing are starting to
make sense now.

After rbd rm, the objects are still present in the cache pool; stat on
any of those objects returns a "No such file or directory" error, and
no object exists in the backing pool.

That explains the values we're seeing in "ceph df detail", which are:

ephemeral-vms: 910880 objects
ssd: 109096429 objects <- constantly growing... only rarely dropping
number. The number drops when an Openstack

-> Counting each "missing" object as 4k, we end up with around 411
GB; with one replica that accounts for our roughly 800 GB of missing space :(
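
A rough version of this estimate, under assumptions that are not stated in the message: "missing" objects are taken to be the difference between the two object counts, "4k" is taken as 4096 bytes, and the data is assumed to be stored twice in total.

```python
ssd_objects = 109096429       # "ssd" pool, from ceph df detail above
backing_objects = 910880      # "ephemeral-vms" pool, from ceph df detail above
bytes_per_object = 4 * 1024   # assumption: "4k" means 4096 bytes

# Assumption: the "missing" objects are those counted in the cache pool
# but not in the backing pool.
missing_objects = ssd_objects - backing_objects

one_copy_gib = missing_objects * bytes_per_object / 2**30
two_copies_gib = 2 * one_copy_gib   # assumption: two copies in total

print(f"{missing_objects} objects")
print(f"{one_copy_gib:.0f} GiB per copy, {two_copies_gib:.0f} GiB for two copies")
```

With these assumptions the result is roughly 413 GiB per copy and about 825 GiB for two copies, which is in the same range as the figures quoted above; the object counts were changing while they were read, so an exact match is not expected.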

Good thing: Evicting an object for which stat returns the error removes it.
So I'm now listing all objects in the SSD pool and then trying to evict
those that return "No such file" when I run stat on them. Hopefully that
doesn't break anything.
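
The cleanup loop described above could look roughly like the sketch below. It assumes the rados subcommands "ls", "stat" and "cache-evict" behave as described in this thread and that a failed stat prints "No such file or directory". The function names and the dry_run switch are hypothetical and only there for illustration.

```python
import subprocess


def run_rados(pool, *args):
    """Run `rados -p <pool> <args...>` and return (returncode, stdout, stderr)."""
    proc = subprocess.run(
        ["rados", "-p", pool, *args], capture_output=True, text=True
    )
    return proc.returncode, proc.stdout, proc.stderr


def find_dead_objects(pool, run=run_rados):
    """List the pool and return the names whose stat fails with 'No such file'."""
    _, listing, _ = run(pool, "ls")
    dead = []
    for name in listing.splitlines():
        if not name:
            continue
        rc, out, err = run(pool, "stat", name)
        if rc != 0 and "No such file or directory" in (out + err):
            dead.append(name)
    return dead


def evict_objects(pool, names, run=run_rados, dry_run=True):
    """Evict the given objects; with dry_run=True only print what would be run."""
    evicted = []
    for name in names:
        if dry_run:
            print(f"would run: rados -p {pool} cache-evict {name}")
            continue
        rc, _, err = run(pool, "cache-evict", name)
        if rc == 0:
            evicted.append(name)
        else:
            print(f"evict failed for {name}: {err.strip()}")
    return evicted
```

A cautious first pass would be evict_objects("ssd", find_dead_objects("ssd")) with the default dry_run=True, reviewing the printed list before evicting anything. Starting one rados process per object is slow for a pool with this many objects, so this is a starting point rather than a tuned tool.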

Question: Do I need a flush before the evict? Just in case? Or what
happens if I call evict on an object which is technically not a dead
object and needs to be flushed first?
AFAIK evicting an object also flushes it to the backing storage, so evicting a live object should be ok. It will be promoted again at the next access (or whatever triggers promotion in the caching mechanism).
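
Since the answer above is only an "AFAIK", a more conservative ordering for objects that may still be live is to flush explicitly and evict only when the flush succeeded. The sketch below assumes the "rados cache-flush" and "rados cache-evict" subcommands mentioned earlier in this thread; the helper names are hypothetical.

```python
import subprocess


def rados(pool, *args):
    """Run `rados -p <pool> <args...>`; return True when it exits with status 0."""
    return subprocess.run(["rados", "-p", pool, *args]).returncode == 0


def flush_then_evict(pool, name, run=rados):
    """Flush the object first, and evict it only if the flush succeeded."""
    if not run(pool, "cache-flush", name):
        return False
    return run(pool, "cache-evict", name)
```

This does not rely on evict flushing implicitly. It is meant for live objects; the dead objects described earlier in the thread could already be removed with a plain evict.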

For the dead 0-byte files: Should I open a bug report?
Not sure whether this is a bug at all. The objects should be evicted and removed if the cache pool hits the max object thresholds.
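
The thresholds meant here are presumably the cache tier pool settings target_max_objects (or target_max_bytes) together with cache_target_dirty_ratio and cache_target_full_ratio, which can be set with "ceph osd pool set <pool> <key> <value>". As a rough sketch of how they combine, assuming the tiering agent starts flushing at the dirty ratio and evicting at the full ratio of the configured maximum (the numbers below are hypothetical, not taken from this cluster):

```python
def agent_thresholds(target_max_objects, dirty_ratio, full_ratio):
    """Return the object counts at which flushing and eviction would start."""
    return {
        "flush_above": round(target_max_objects * dirty_ratio),
        "evict_above": round(target_max_objects * full_ratio),
    }


# Hypothetical values for illustration only.
limits = agent_thresholds(target_max_objects=1_000_000, dirty_ratio=0.4, full_ratio=0.8)
print(limits)
```

If neither target_max_objects nor target_max_bytes is set on the cache pool, there is no maximum for these ratios to apply to, which would be worth checking on a pool whose object count keeps growing.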


Regards,
Burkhard
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com