Re: Removing cache tier for RBD pool

On Mon, Jan 8, 2018 at 6:08 AM, Jens-U. Mozdzen <jmozdzen@xxxxxx> wrote:
Hi *,

trying to remove a caching tier from a pool used for RBD / Openstack, we followed the procedure from http://docs.ceph.com/docs/master/rados/operations/cache-tiering/#removing-a-writeback-cache and ran into problems.

The cluster is currently running Ceph 12.2.2, the caching tier was created with an earlier release of Ceph.

First of all, setting the cache-mode to "forward" is reported to be unsafe, which is not mentioned in the documentation - if it's really meant to be used in this case, the need for "--yes-i-really-mean-it" should be documented.
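For reference, the documented sequence (with the override flag that Luminous now requires) looks roughly like this; "hot-storage" is the cache pool and "cold-storage" the backing pool, matching the names used later in this mail:

```shell
# Sketch of the documented writeback-cache removal procedure on 12.2:
ceph osd tier cache-mode hot-storage forward --yes-i-really-mean-it
rados -p hot-storage cache-flush-evict-all
ceph osd tier remove-overlay cold-storage
ceph osd tier remove cold-storage hot-storage
```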

Unfortunately, using "rados -p hot-storage cache-flush-evict-all" not only reported errors ("file not found") for many objects, but left us with quite a number of objects in the pool and new ones being created, despite the "forward" mode. Even after stopping all Openstack instances ("VMs"), we could also see that the remaining objects in the pool were still locked. Manually unlocking these via rados commands worked, but "cache-flush-evict-all" then still reported those "file not found" errors and 1070 objects remained in the pool, like before. We checked the remaining objects via "rados stat" both in the hot-storage and the cold-storage pool and could see that every hot-storage object had a counter-part in cold-storage with identical stat info. We also compared some of the objects (with size > 0) and found the hot-storage and cold-storage entities to be identical.
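The per-object comparison can be scripted; a rough sketch (not necessarily the exact commands used) that stats every leftover cache-pool object in both tiers:

```shell
# Sketch: stat each remaining hot-storage object in both tiers for comparison
rados -p hot-storage ls | while read -r obj; do
    echo "== $obj"
    rados -p hot-storage stat "$obj"
    rados -p cold-storage stat "$obj"
done
```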

We aborted that attempt, reverted the mode to "writeback" and restarted the Openstack cluster - everything was working fine again, of course still using the cache tier.

During a recent maintenance window, the Openstack cluster was shut down again and we re-tried the procedure. As there were no active users of the images pool, we skipped the step of forcing the cache mode to forward and immediately issued the "cache-flush-evict-all" command. Again 1070 objects remained in the hot-storage pool (and gave "file not found" errors), but unlike last time, none were locked.

Out of curiosity we then issued loops of "rados -p hot-storage cache-flush <obj-name>" and "rados -p hot-storage cache-evict <obj-name>" for all objects in the hot-storage pool and surprisingly not only received no error messages at all, but were left with an empty hot-storage pool! We then proceeded with the further steps from the docs and were able to successfully remove the cache tier.
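The per-object cycle described above can be sketched as a small loop (assuming the rados CLI and the pool name "hot-storage"; the error reporting is added for illustration):

```shell
# Sketch of the per-object flush/evict cycle; assumes the rados CLI and
# the cache pool name "hot-storage" used elsewhere in this thread.
flush_and_evict_all() {
    pool="$1"
    # Note: "rados ls" hides Ceph-internal objects (e.g. hitsets), so this
    # loop only touches user-visible objects.
    rados -p "$pool" ls | while read -r obj; do
        rados -p "$pool" cache-flush "$obj" || echo "flush failed: $obj" >&2
        rados -p "$pool" cache-evict "$obj" || echo "evict failed: $obj" >&2
    done
}

# Against a live cluster:
#   flush_and_evict_all hot-storage
```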

This leaves us with two questions:

1. Does setting the cache mode to "forward" lead to the above situation of remaining locks on hot-storage pool objects? Maybe the clients' unlock requests are forwarded to the cold-storage pool, leaving the hot-storage objects locked? If so, this should be documented, and it would seem impossible to cleanly remove a cache tier during live operation.

2. What is the significant difference between "rados cache-flush-evict-all" and separate "cache-flush" and "cache-evict" cycles? Or is it an implementation error that leads to those "file not found" errors with "cache-flush-evict-all", while the manual cycles succeed?

Thank you for any insight you might be able to share.

Regards,
Jens

I've removed a cache tier in environments a few times. The only locked files I ran into were the rbd_directory and rbd_header objects for each volume. The rbd_header object for each RBD volume is locked as long as the VM is running. Every time I've tried to remove a cache tier, I shut down all of the VMs before starting the procedure, and there wasn't any problem getting things flushed and evicted. So I can't really give any further insight into what might have happened, other than that it worked for me. I set the cache-mode to forward every time before flushing and evicting objects.
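A quick way to confirm whether a VM is still holding things before starting: list the image's locks and the watchers on its header object (the image name and the header-object id suffix below are hypothetical, and the "images" pool name follows earlier usage in this thread):

```shell
# Is a client still holding the image lock? ("myvolume" is a hypothetical image)
rbd -p images lock list myvolume
# Who is still watching the header object? (the id suffix is illustrative)
rados -p hot-storage listwatchers rbd_header.101a2ae8944a
```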

I don't think there is a significant technical difference between the cache-flush-evict-all command and doing separate cache-flush and cache-evict operations on individual objects. My understanding is that cache-flush-evict-all is just a shortcut for getting everything in the cache flushed and evicted. Did cache-flush-evict-all error on some objects where the separate operations succeeded? Your description doesn't say whether that was the case, though you do say you used both styles during your second attempt.

Objects being left in the hot-storage pool is something I've seen too, even after it looks like everything has been flushed. When I dug deeper, it looked like all of the objects left in the pool were the hitset objects that the cache tier uses for tracking how frequently objects are accessed. Those hitsets need to be persisted in case an OSD restarts or the PG is migrated to another OSD. The method used for that is simply storing each hitset as another object, but one that is internal to Ceph. Since they're internal, those objects are hidden from commands like "rados ls", yet they still get counted as objects in the pool by things like "ceph df". This can make things a little confusing. I verified this each time by looking in the OSD to see which objects were left in the PG; all of them started with "hit\uset".
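That difference in visibility can be seen in the counts (a sketch; the on-disk path assumes FileStore, which was still typical on 12.2, and the OSD id is illustrative):

```shell
# User-visible objects only (internal hitset objects are hidden):
rados -p hot-storage ls | wc -l
# Total object count, internal hitsets included:
ceph df detail | grep hot-storage
# On an OSD host with FileStore, the hitset objects show up on disk with
# escaped names starting with "hit\uset":
find /var/lib/ceph/osd/ceph-0/current -name 'hit\\uset*' 2>/dev/null
```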

On a different note, you say that your cluster is on 12.2 but the cache tiers were created on an earlier version. Which version was the cache tier created on, and how well did the upgrade process work? I'm curious, since the production clusters I have using a cache tier are still on 10.2 and I'm about to begin testing the upgrade to 12.2. Any information about that experience you can share would be helpful.

Mike
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
