> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Daznis
> Sent: 09 January 2017 12:54
> To: ceph-users <ceph-users@xxxxxxxxxxxxxx>
> Subject: Ceph cache tier removal.
>
> Hello,
>
> I'm running a preliminary test of cache tier removal on a live cluster
> before I try it on a production one. I'm trying to avoid downtime, but
> from what I've noticed it's either impossible or I'm doing something
> wrong. My cluster is running CentOS 7.2 and Ceph 0.94.9.
>
> Example 1:
> I'm setting the cache layer to forward:
> 1. ceph osd tier cache-mode test-cache forward
> Then flushing the cache:
> 1. rados -p test-cache cache-flush-evict-all
> Then I get stuck with some objects that can't be removed:
>
> rbd_header.29c3cdb2ae8944a
> failed to evict /rbd_header.29c3cdb2ae8944a: (16) Device or resource busy
> rbd_header.28c96316763845e
> failed to evict /rbd_header.28c96316763845e: (16) Device or resource busy
> error from cache-flush-evict-all: (1) Operation not permitted
>

These are probably the objects which have watchers attached. The current evict logic seems to be unable to evict these, hence the error. I'm not sure if anything can be done to work around this other than what you have tried, i.e. stopping the VM, which will remove the watcher.

> I found a workaround for this. You can bypass these errors by either:
> 1. running ceph osd tier remove-overlay test-pool
> 2. turning off the VMs that are using them.
>
> For the second option, I can boot the VMs normally after recreating a
> new overlay/cache tier. At this point everything is working fine, but
> I'm trying to avoid downtime, as it takes almost 8h to start everything
> and check that it is in optimal condition.
>
> Now for the first option. I can remove the overlay and flush the cache
> layer, and the VMs run fine with it removed. Issues start after I have
> re-added the cache layer to the cold pool and try to write/read from
> the disk.
> For no apparent reason the VMs just freeze, and you need to force
> stop/start all VMs to get them working again.

Which pool are the VMs being pointed at, base or cache? I'm wondering if it's something to do with the pool id changing?

> From what I have read about it, all objects should leave the cache tier,
> and you don't have to "force" removing the tier with objects.
>
> Now onto the questions:
>
> 1. Is it normal for a VPS to freeze while adding a cache layer/tier?
> 2. Do VMs need to be offline to remove the caching layer?
> 3. I have read somewhere that snapshots might interfere with cache
>    tier clean up. Is it true?
> 4. Are there some other ways to remove the caching tier on a live
>    system?
>
>
> Regards,
>
>
> Darius
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
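
For reference, here is a minimal dry-run sketch of the removal sequence discussed in this thread, in the order I understand the cache tiering documentation to give it. It only prints the commands and does not touch the cluster. The function name cache_tier_removal_plan is made up for illustration, and the pool names are the ones from your example, so treat it as a checklist under those assumptions rather than a tested procedure.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the cache tier removal commands instead of running them.
# cache_tier_removal_plan is a hypothetical helper name, not part of Ceph.

cache_tier_removal_plan() {
    local base_pool="$1"   # backing (cold) pool, e.g. test-pool
    local cache_pool="$2"  # cache (hot) pool, e.g. test-cache

    # 1. Switch the cache to forward mode so new and modified objects
    #    go to the backing pool.
    echo "ceph osd tier cache-mode ${cache_pool} forward"
    # 2. Flush dirty objects and evict clean ones. Objects with watchers
    #    (such as rbd_header.* of running VMs) may fail here with
    #    "(16) Device or resource busy".
    echo "rados -p ${cache_pool} cache-flush-evict-all"
    # 3. Remove the overlay so clients stop being directed to the cache pool.
    echo "ceph osd tier remove-overlay ${base_pool}"
    # 4. Detach the cache pool from the backing pool.
    echo "ceph osd tier remove ${base_pool} ${cache_pool}"
}

cache_tier_removal_plan test-pool test-cache
```

To check whether one of the stuck objects really has a watcher, something like `rados -p test-cache listwatchers rbd_header.29c3cdb2ae8944a` should show the client holding it.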