On Sun, Feb 10, 2013 at 11:34 AM, Tobias Prousa <topro@xxxxxx> wrote:
> Hi Greg,
>
>> On Friday, February 8, 2013 at 6:24 AM, Tobias Prousa wrote:
>> > After some more digging and IRC discussion the issue with truncated
>> > files seems to be related to me having mounted cephfs on multiple clients
>> > concurrently, which in turn cephfs seems to not handle very well at the moment.
>> > As soon as there is only one client mounting that ceph filesystem, the
>> > problem is gone.
>> >
>> > There are said to be patches around which will improve that situation but
>> > those haven't even hit repositories, so be patient.
>> >
>> > Btw. that result is maybe as important as atm. staying away from
>> > multiple active MDSs, so maybe a little note on docs could help other brave
>> > testers to not experience the same issues.
>> Yikes! This sounds a bit different than the issues I've heard about so
>> far. Among other things, unless you have daemon restarts that you aren't
>> describing here, this sounds like it's a kernel client bug.
>> Can you
>> 1) Try using ceph-fuse and see if the problem manifests there?
>> If it does,
>> 2) Turn on MDS and OSD debugging and reproduce the problem in a new
>> directory?
>
> Well, I would first have to get a bit used to fuse as I never used it before. If I manage to reproduce it with fuse I'll give you the debugging logs. If I don't manage to reproduce it I still wouldn't be 100% sure it's a kernel client issue, as I still don't know what detail of my use-case triggers the issue and I would for sure have to modify the setup slightly (client side) to use fuse in the first place. I'll let you know asap.

If this is a big deal we can skip to the server-side logs. It's just that my guess is the kernel client isn't flushing all the data out of cache and onto the OSDs, not that data is being lost once it gets into RADOS or the MDS.
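
For the "reproduce the problem in a new directory" step, here is a minimal sketch of one way to check for truncated files across two clients. It is only an illustration: the function names, file names, and sizes are hypothetical, and nothing in it is Ceph-specific. It writes files of known size and checksum in a directory on one client, so that a second client can compare what it sees against the recorded values. The `fsync` flag is there because, if the guess about unflushed client cache is right, the problem may only show up when the writer does not force data out.

```python
import hashlib
import os


def write_samples(directory, count=4, size=1 << 20, fsync=False):
    """Write `count` files of `size` random bytes into `directory`.

    Returns a manifest mapping file name -> expected size and SHA-256.
    All names and defaults here are illustrative assumptions.
    """
    manifest = {}
    for i in range(count):
        name = f"sample_{i}.bin"
        data = os.urandom(size)
        with open(os.path.join(directory, name), "wb") as f:
            f.write(data)
            if fsync:
                # Force the data out of the client's cache before closing.
                f.flush()
                os.fsync(f.fileno())
        manifest[name] = {
            "size": size,
            "sha256": hashlib.sha256(data).hexdigest(),
        }
    return manifest


def verify_samples(directory, manifest):
    """Return a list of (name, reason) for files that differ from the manifest."""
    problems = []
    for name, expected in manifest.items():
        path = os.path.join(directory, name)
        if not os.path.exists(path):
            problems.append((name, "missing"))
            continue
        actual_size = os.path.getsize(path)
        if actual_size != expected["size"]:
            # A shorter file than expected is the truncation symptom.
            problems.append((name, f"size {actual_size} != {expected['size']}"))
            continue
        with open(path, "rb") as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        if digest != expected["sha256"]:
            problems.append((name, "checksum mismatch"))
    return problems
```

One way to use it: call `write_samples` on the first client against a fresh directory inside the cephfs mount, save the returned manifest (it is a plain dict, so `json.dump` works), then load it on the second client and call `verify_samples` on the same directory. Running it once with `fsync=False` and once with `fsync=True`, and once each with the kernel client and ceph-fuse, would help narrow down where the data goes missing.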

> On the other hand, could all this be caused by me having one x86-kernel client and one amd64-kernel client mounting that cephfs at the same time? On both clients it's the same linux-3.7.3 kernel from debian-experimental, just one i686, the other amd64. I believe I read something like that reported by someone else, somewhere.

I don't think so. There was a potential 32/64-bit bug recently (http://tracker.ceph.com/issues/3935), but it was a hang, not data loss of any kind.
-Greg