Hi list,
we have a production Hammer cluster for our OpenStack cloud, and
recently a colleague added a cache tier consisting of 2 SSDs with
a pool size of 2. We're still experimenting with this topic.
Now we have some hardware maintenance to do and need to shut down
nodes, one at a time of course. To prevent data loss, we set the
cache-mode to "forward" and tried to flush/evict the cache pool so
that we could disable it. Most of the objects were evicted
successfully, but 39 objects are left and it's impossible to evict them.
We want to set up the cache pool from scratch, but I'm not sure how to
verify that we can delete the existing cache pool without data loss.
# rados -p images-cache ls
rbd_header.210f542ae8944a
volume-ce17068e-a36d-4d9b-9779-3af473aba033.rbd
rbd_header.50ec372eb141f2
931f9a1e-2022-4571-909e-6c3f5f8c3ae8_disk.rbd
rbd_header.59dd32ae8944a
...
There are only 3 types of objects in the cache-pool:
- rbd_header
- volume-XXX.rbd (obviously cinder related)
- XXX_disk (nova disks)
All rbd_header objects have a size of 0 if I run a "stat" command on
them, the rest have a size of 112. If I compare the objects with the
respective objects in the cold storage, size and mtime are identical:
Object rbd_header.1128db1b5d2111:
images-cache/rbd_header.1128db1b5d2111 mtime 2017-08-21 15:55:26.000000, size 0
images/rbd_header.1128db1b5d2111 mtime 2017-08-21 15:55:26.000000, size 0
Object volume-fd07dd66-8a82-431c-99cf-9bfc3076af30.rbd:
images-cache/volume-fd07dd66-8a82-431c-99cf-9bfc3076af30.rbd mtime 2017-08-21 15:55:26.000000, size 112
images/volume-fd07dd66-8a82-431c-99cf-9bfc3076af30.rbd mtime 2017-08-21 15:55:26.000000, size 112
Object 2dcb9d7d-3a4f-49a4-8792-b4b74f5b60e5_disk.rbd:
images-cache/2dcb9d7d-3a4f-49a4-8792-b4b74f5b60e5_disk.rbd mtime 2017-08-21 15:55:25.000000, size 112
images/2dcb9d7d-3a4f-49a4-8792-b4b74f5b60e5_disk.rbd mtime 2017-08-21 15:55:25.000000, size 112
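Since "stat" only shows size and mtime, a stronger check would be to
compare the object data and the omap values as well. This is a minimal
sketch of how that could look, not something we have run in production.
compare_pools is a hypothetical helper name; it only uses the rados
subcommands ls, get and listomapvals. One assumption I can't confirm:
that requests against "images" really reach the cold storage while the
overlay is still set, instead of being answered via the cache tier.

```shell
#!/bin/sh
# Sketch: compare every object left in the cache pool with the object of
# the same name in the backing pool (data and omap values).
# compare_pools is a hypothetical helper, not part of Ceph.
compare_pools() {
    cache_pool=$1
    base_pool=$2
    tmpdir=$(mktemp -d) || return 1
    rados -p "$cache_pool" ls | while IFS= read -r obj; do
        # </dev/null keeps rados from consuming the object list on stdin
        if rados -p "$cache_pool" get "$obj" "$tmpdir/c.data" </dev/null &&
           rados -p "$base_pool" get "$obj" "$tmpdir/b.data" </dev/null &&
           rados -p "$cache_pool" listomapvals "$obj" >"$tmpdir/c.omap" </dev/null &&
           rados -p "$base_pool" listomapvals "$obj" >"$tmpdir/b.omap" </dev/null &&
           cmp -s "$tmpdir/c.data" "$tmpdir/b.data" &&
           cmp -s "$tmpdir/c.omap" "$tmpdir/b.omap"; then
            echo "identical: $obj"
        else
            echo "DIFFERS or unreadable: $obj"
        fi
    done
    rm -rf "$tmpdir"
}
```

It would be called as "compare_pools images-cache images". Any line
starting with DIFFERS would need a closer look before deleting the pool.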
Some of them have an rbd_lock, some of them have a watcher, some don't
have any of that but they still can't be evicted:
# rados -p images-cache lock list rbd_header.2207c92ae8944a
{"objname":"rbd_header.2207c92ae8944a","locks":[]}
# rados -p images-cache listwatchers rbd_header.2207c92ae8944a
#
# rados -p images-cache cache-evict rbd_header.2207c92ae8944a
error from cache-evict rbd_header.2207c92ae8944a: (16) Device or resource busy
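To collect the lock and watcher state for all remaining objects in one
go, a loop over the commands above could be used. Again only a sketch;
survey_cache_objects is a hypothetical helper name, and the rados
subcommands are exactly the ones shown above.

```shell
#!/bin/sh
# Sketch: print locks and watchers for every object still in the pool.
# survey_cache_objects is a hypothetical helper, not part of Ceph.
survey_cache_objects() {
    pool=$1
    rados -p "$pool" ls | while IFS= read -r obj; do
        echo "== $obj"
        # </dev/null keeps rados from consuming the object list on stdin
        rados -p "$pool" lock list "$obj" </dev/null
        rados -p "$pool" listwatchers "$obj" </dev/null
    done
}
```

Usage would be "survey_cache_objects images-cache".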
Then I also tried to shut down an instance that uses some of the
volumes listed in the cache pool, but the objects didn't change at
all, and the total number was still 39. For the rbd_header objects I
don't even know how to identify their "owner". Is there a way?
Does anyone have a hint what else I could check? Or is it reasonable
to assume that the objects are really the same and that there would be
no data loss if we deleted that pool?
We appreciate any help!
Regards,
Eugen
--
Eugen Block voice : +49-40-559 51 75
NDE Netzdesign und -entwicklung AG fax : +49-40-559 51 77
Postfach 61 03 15
D-22423 Hamburg e-mail : eblock@xxxxxx
Vorsitzende des Aufsichtsrates: Angelika Mozdzen
Sitz und Registergericht: Hamburg, HRB 90934
Vorstand: Jens-U. Mozdzen
USt-IdNr. DE 814 013 983