Hey Yan,

On 11-09-13 15:12, Yan, Zheng wrote:
> On Wed, Sep 11, 2013 at 7:51 PM, Oliver Daudey <oliver@xxxxxxxxx> wrote:
>> Hey Gregory,
>>
>> I wiped and re-created the MDS-cluster I just mailed about, starting out
>> by making sure CephFS is not mounted anywhere, stopping all MDSs,
>> completely cleaning the "data" and "metadata"-pools using "rados
>> --pool=<pool> cleanup <prefix>", then creating a new cluster using `ceph
>> mds newfs 1 0 --yes-i-really-mean-it' and starting all MDSs again.
>> Directly afterwards, I saw this:
>>
>> # rados --pool=metadata ls
>> 1.00000000
>> 2.00000000
>> 200.00000000
>> 200.00000001
>> 600.00000000
>> 601.00000000
>> 602.00000000
>> 603.00000000
>> 605.00000000
>> 606.00000000
>> 608.00000000
>> 609.00000000
>> mds0_inotable
>> mds0_sessionmap
>>
>> Note the missing objects, right from the start. I was able to mount the
>> CephFS at this point, but after unmounting it and restarting the
>> MDS-cluster, it failed to come up, with the same symptoms as before. I
>> didn't place any files on CephFS at any point between newfs and failure.
>> Naturally, I tried initializing it again, but now, even after more than
>> 5 tries, the "mds*"-objects simply no longer show up in the
>> "metadata"-pool at all. In fact, it remains empty. I can mount CephFS
>> after the first start of the MDS-cluster after a newfs, but on restart,
>> it fails because of the missing objects. Am I doing anything wrong
>> while initializing the cluster, maybe? Is cleaning the pools and doing
>> the newfs enough? I did the same on the other cluster yesterday and it
>> seems to have all objects.
>>
>
> Thank you for the detailed information.
>
> The cause of the missing objects is that the MDS IDs for the old FS and
> the new FS are the same (the incarnations are the same). When the OSD
> receives MDS requests for the newly created FS, it silently drops them,
> because it thinks they are duplicates. You can get around the bug by
> creating new pools for the newfs.
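The dropping behaviour Yan describes can be sketched as a toy model. This is plain Python with invented names, not Ceph's actual implementation (real OSDs track recent request IDs per placement group); it only illustrates the mechanism: a write whose (rank, incarnation, tid) matches one already seen is treated as a resend and silently ignored.

```python
# Toy model of duplicate-request detection on an OSD -- NOT Ceph code.
# All names are invented for illustration.

class ToyOsd:
    def __init__(self):
        self.objects = {}         # object name -> contents
        self.seen_reqids = set()  # (mds_rank, incarnation, tid)

    def handle_write(self, mds_rank, incarnation, tid, name, data):
        reqid = (mds_rank, incarnation, tid)
        if reqid in self.seen_reqids:
            # Same rank, same incarnation, same tid: looks like a
            # duplicate of a request already applied, so drop it.
            return "dropped"
        self.seen_reqids.add(reqid)
        self.objects[name] = data
        return "applied"

osd = ToyOsd()

# mds.0 of the old FS (incarnation 1) writes its tables.
osd.handle_write(0, 1, 1, "mds0_inotable", "old fs")

# The pools are wiped: the objects disappear, but the OSD's record of
# request IDs does not.
osd.objects.clear()

# After "newfs", the new mds.0 comes up with the SAME incarnation and
# re-uses the same tids, so its writes collide with old request IDs.
print(osd.handle_write(0, 1, 1, "mds0_inotable", "new fs"))  # dropped
print("mds0_inotable" in osd.objects)                        # False

# With a fresh incarnation (or new pools, hence a clean request-ID
# history on those PGs), the same write goes through.
print(osd.handle_write(0, 2, 1, "mds0_inotable", "new fs"))  # applied
```

This also matches the symptom: the "missing" objects were never written at all, because every write from the re-incarnated MDS looked like a replay of an already-applied request.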
Thanks for this very useful info, I think this solves the mystery! Could I get around it any other way? I'd rather not have to re-create the pools and switch to new pool IDs every time I have to do this. Does the OSD store this info in its metadata, or might restarting the OSDs be enough?

I'm quite sure that I re-created MDS-clusters on the same pools many times without all the objects going missing. This was usually as part of tests, where I also restarted other cluster components, like OSDs. That could explain why only some objects went missing: if some OSDs had been restarted and processed the requests, while others dropped them, it would appear as if some, but not all, objects were missing. The problem then goes unnoticed until the active MDS in the MDS-cluster is restarted, at which point things fail to come up and the missing objects are finally noticed.

IMHO, this is a bug. Why would the OSD ignore these requests if the objects the MDS tries to write don't even exist at that time?


   Regards,

      Oliver

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com