Re: Removing cache tier for RBD pool

Hello Mike,

Quoting Mike Lovell <mike.lovell@xxxxxxxxxxxxx>:
On Mon, Jan 8, 2018 at 6:08 AM, Jens-U. Mozdzen <jmozdzen@xxxxxx> wrote:
Hi *,
[...]
1. Does setting the cache mode to "forward" lead to the above situation of
remaining locks on hot-storage pool objects? Maybe the clients' unlock
requests are forwarded to the cold-storage pool, leaving the hot-storage
objects locked? If so, this should be documented, as it would seem impossible
to cleanly remove a cache tier during live operations.

2. What is the significant difference between "rados
cache-flush-evict-all" and separate "cache-flush" and "cache-evict" cycles?
Or is it some implementation error that leads to those "file not found"
errors with "cache-flush-evict-all", while the manual cycles work
successfully?

Thank you for any insight you might be able to share.

Regards,
Jens


I've removed a cache tier in a few environments. The only locked
objects I ran into were the rbd_directory and rbd_header objects for each
volume; the rbd_header of an RBD volume is locked as long as the VM is
running. Every time I've tried to remove a cache tier, I shut down all of
the VMs before starting the procedure, and there wasn't any problem getting
things flushed and evicted. So I can't really give any further insight into
what might have happened, other than that it worked for me. I set the
cache-mode to forward every time before flushing and evicting objects.

While your report doesn't confirm the suspicion expressed in my question 1, it is at least another example where removing the cache tier worked *after stopping all instances*, rather than "live". If, on the other hand, this limitation is confirmed, it should be added to the docs.
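
For reference, the removal sequence described in the Ceph docs is roughly the following; "hot-storage" and "cold-storage" are placeholder names for the cache and backing pool:

    # stop caching new traffic, then flush and evict everything
    ceph osd tier cache-mode hot-storage forward --yes-i-really-mean-it
    rados -p hot-storage cache-flush-evict-all
    rados -p hot-storage ls    # should return no objects
    # detach the cache tier from the backing pool
    ceph osd tier remove-overlay cold-storage
    ceph osd tier remove cold-storage hot-storage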

Out of curiosity: did you have any other users of the pool? After stopping all VMs (and the image-related services on our OpenStack control nodes), nothing was accessing my pool anymore, so I saw no need to switch the caching tier to "forward" mode.
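
FWIW, whether an image header is still being watched or locked can be checked at the rbd level; the pool and image names here are only examples:

    rbd status cold-storage/myimage     # lists remaining watchers
    rbd lock ls cold-storage/myimage    # lists remaining advisory locks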

I don't think there really is a significant technical difference between
the cache-flush-evict-all command and doing separate cache-flush and
cache-evict operations on individual objects. My understanding is that
cache-flush-evict-all is just a shortcut for getting everything in the
cache flushed and evicted. Did cache-flush-evict-all error on some
objects where the separate operations succeeded? Your description doesn't
say whether that was the case, but you mention using both styles during
your second attempt.

It was actually the case that every run of "cache-flush-evict-all" reported errors for all remaining objects, while running the loop manually (issuing a flush for every object, then an evict for every remaining object) worked flawlessly. That's why my question 2 came up.
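
The manual loop was along these lines (the pool name is again a placeholder):

    # first flush every object still in the cache pool ...
    rados -p hot-storage ls | while read obj; do
        rados -p hot-storage cache-flush "$obj"
    done
    # ... then evict whatever remains
    rados -p hot-storage ls | while read obj; do
        rados -p hot-storage cache-evict "$obj"
    done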

The objects I saw seemed related to the images stored in the pool, not any "management data" (like the suggested hitset persistence).

On a different note, you say that your cluster is on 12.2, but the cache
tier was created on an earlier version. Which version was the cache tier
created on? How well did the upgrade process work? I am curious since the
production clusters I have that use a cache tier are still on 10.2, and I'm
about to begin testing the upgrade to 12.2. Any info on that experience you
can share would be helpful.

I *believe* the cache was created on 10.2, but I cannot recall for sure. I remember having similar problems in those earlier days with a previous instance of that caching tier, but many of the root causes were "on my side of the keyboard". The cache tier I recently tried to remove was created from scratch after those problems, and upgrading to the latest release via the recommended intermediate version steps was problem-free, at least as far as cache tiers are concerned ;)

Regards,
Jens

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


