On 03/22/2013 12:09 PM, Oliver Francke wrote:
Hi Josh, all, I did not want to hijack the thread dealing with a crashing VM, but perhaps there are some common things. Today I installed a fresh cluster with mkcephfs, which went fine, imported a "master" Debian 6.0 image with "format 2", made a snapshot, protected it, and made some clones. I mounted the clones with qemu-nbd, fiddled a bit with IP/interfaces/hosts/net.rules…etc and cleanly unmounted them; the VM started, took 2 secs, and was up and running. Cool. Then I performed an ordinary shutdown and made a snapshot of this image. Started again, did some "apt-get update… install s/t…". Shutdown -> rbd rollback -> startup again -> login -> install s/t else… the filesystem showed "many" ext3 errors and fell into read-only mode: massive corruption.
This sounds like it might be a bug in rollback. Could you try cloning and snapshotting again, but export the image before booting, and after rolling back, and compare the md5sums? Running the rollback with: --debug-ms 1 --debug-rbd 20 --log-file rbd-rollback.log might help too. Does your ceph.conf where you ran the rollback have anything related to rbd_cache in it?
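The md5sum comparison suggested above could be scripted roughly as follows. This is a minimal sketch, assuming the image has already been exported to two local files, once before booting and once after the rollback (for example with `rbd export`). The file names and the helper names `md5_of_file` and `exports_match` are hypothetical and not part of any Ceph tooling.

```python
import hashlib


def md5_of_file(path, chunk_size=4 * 1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that a large exported image does not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def exports_match(before_path, after_path):
    """True if both exported image files have the same MD5 digest."""
    return md5_of_file(before_path) == md5_of_file(after_path)


if __name__ == "__main__":
    # Hypothetical file names for the two exports.
    before = "clone-before-boot.img"
    after = "clone-after-rollback.img"
    if exports_match(before, after):
        print("exports are identical")
    else:
        print("exports differ")
```

If the digests differ, the rollback did not restore the image to the state it had when the snapshot was taken, which would point at the rollback rather than at the guest.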
The qemu config had ":rbd_cache=false", if it matters. The above scenario is reproducible and, as I stated, no crash was detected. Perhaps it is in the same area as the crash thread; otherwise I will provide logfiles as needed.
It's unrelated, the other thread is an issue with the cache, which does not cause corruption but triggers a crash. Josh