Okay, this has happened several more times. It always seems to be a small
file that should be read-only (perhaps read simultaneously) on many
different clients. The files are corrupted only through the CephFS
interface; the objects in the cachepool and the erasure-coded pool are
still correct. I am beginning to doubt these files are getting a
truncation request.

Twice now it has been different Perl files, once it was someone's .bashrc,
and once it was an input file for another application. Timestamps on the
files indicate that they haven't been modified in weeks.

Any other possibilities? Or any way to figure out what happened?

--
Adam

On Sun, Sep 27, 2015 at 10:44 PM, Adam Tygart <mozes@xxxxxxx> wrote:
> I've done some digging into cp and mv's semantics (from coreutils). If
> the inode already exists, the file will get truncated, then data will get
> copied in. This is definitely within the scope of the bug above.
>
> --
> Adam
>
> On Fri, Sep 25, 2015 at 8:08 PM, Adam Tygart <mozes@xxxxxxx> wrote:
>> It may have been, although the timestamp on the file was almost a
>> month ago. The typical workflow for this particular file is to copy an
>> updated version over top of it.
>>
>> i.e. 'cp qss kstat'
>>
>> I'm not sure if cp semantics would keep the same inode and simply
>> truncate/overwrite the contents, or if it would do an unlink and then
>> create a new file.
>> --
>> Adam
>>
>> On Fri, Sep 25, 2015 at 8:00 PM, Ivo Jimenez <ivo@xxxxxxxxxxx> wrote:
>>> Looks like you might be experiencing this bug:
>>>
>>> http://tracker.ceph.com/issues/12551
>>>
>>> The fix has been merged to master and I believe it'll be part of
>>> Infernalis. The original reproducer involved truncating/overwriting
>>> files. In your example, do you know if 'kstat' had been
>>> truncated/overwritten prior to generating the md5sums?
>>>
>>> On Fri, Sep 25, 2015 at 2:11 PM Adam Tygart <mozes@xxxxxxx> wrote:
>>>>
>>>> Hello all,
>>>>
>>>> I've run into some sort of bug with CephFS. Client reads of a
>>>> particular file return nothing but 40 KB of null bytes. Doing a
>>>> rados-level get of the inode returns the whole file, correctly.
>>>>
>>>> Tested via Linux 4.1, 4.2 kernel clients, and the 0.94.3 fuse client.
>>>>
>>>> Attached is a dynamic printk debug of the ceph module from the Linux
>>>> 4.2 client while cat'ing the file.
>>>>
>>>> My current thought is that there has to be a cache of the object
>>>> *somewhere* that a 'rados get' bypasses.
>>>>
>>>> Even on hosts that have *never* read the file before, it is returning
>>>> null bytes from the kernel and fuse mounts.
>>>>
>>>> Background:
>>>>
>>>> 24x CentOS 7.1 hosts serving up RBD and CephFS with Ceph 0.94.3.
>>>> CephFS is an EC k=8, m=4 pool with a size 3 writeback cache in front
>>>> of it.
>>>>
>>>> # rados -p cachepool get 10004096b95.00000000 /tmp/kstat-cache
>>>> # rados -p ec84pool get 10004096b95.00000000 /tmp/kstat-ec
>>>> # md5sum /tmp/kstat*
>>>> ddfbe886420a2cb860b46dc70f4f9a0d /tmp/kstat-cache
>>>> ddfbe886420a2cb860b46dc70f4f9a0d /tmp/kstat-ec
>>>> # file /tmp/kstat*
>>>> /tmp/kstat-cache: Perl script, ASCII text executable
>>>> /tmp/kstat-ec: Perl script, ASCII text executable
>>>>
>>>> # md5sum ~daveturner/bin/kstat
>>>> 1914e941c2ad5245a23e3e1d27cf8fde /homes/daveturner/bin/kstat
>>>> # file ~daveturner/bin/kstat
>>>> /homes/daveturner/bin/kstat: data
>>>>
>>>> Thoughts?
>>>>
>>>> Any more information you need?
>>>>
>>>> --
>>>> Adam
>>>> _______________________________________________
>>>> ceph-users mailing list
>>>> ceph-users@xxxxxxxxxxxxxx
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
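
For anyone wanting to check the cp behaviour discussed earlier in the
thread (existing inode kept, file truncated, data copied in), here is a
minimal sketch. It assumes a cp with default options, as in GNU coreutils,
and uses scratch files named 'src' and 'dst' in a temporary directory;
those names are placeholders, not files from the cluster above.

```shell
#!/bin/sh
# Sketch: does 'cp' onto an existing file reuse the destination inode
# (truncate + overwrite), or does it replace the file with a new inode?
# 'src' and 'dst' are hypothetical scratch files, created only for this test.
tmpdir=$(mktemp -d)
printf 'old contents\n' > "$tmpdir/dst"
printf 'new contents\n' > "$tmpdir/src"

# 'ls -i' prints the inode number as the first field.
inode_before=$(ls -i "$tmpdir/dst" | awk '{print $1}')
cp "$tmpdir/src" "$tmpdir/dst"
inode_after=$(ls -i "$tmpdir/dst" | awk '{print $1}')

if [ "$inode_before" = "$inode_after" ]; then
    echo "same inode: cp truncated and rewrote the existing file"
else
    echo "different inode: cp replaced the file"
fi

rm -rf "$tmpdir"
```

If the inode is unchanged, the 'cp qss kstat' workflow would involve a
truncate followed by a write, which is the pattern the tracker issue's
reproducer is described as exercising. Running the same check on a CephFS
mount instead of a local temporary directory would only need tmpdir to
point there.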