Zheng, this looks like a kernel client issue to me, or else something
funny is going on with the cap flushing and the timestamps (note how
the reading client's ctime is set to an even second, while the mtime
is ~.63 seconds later and matches what the writing client sees). Any
ideas?
-Greg

On Mon, Jan 12, 2015 at 12:19 PM, Lorieri <lorieri@xxxxxxxxx> wrote:
> Hi Gregory,
>
>
> $ uname -a
> Linux coreos2 3.17.7+ #2 SMP Tue Jan 6 08:22:04 UTC 2015 x86_64
> Intel(R) Xeon(R) CPU E5-4620 0 @ 2.20GHz GenuineIntel GNU/Linux
>
>
> Kernel Client, using `mount -t ceph ...`
>
>
> core@coreos2 /var/run/systemd/system $ modinfo ceph
> filename:       /lib/modules/3.17.7+/kernel/fs/ceph/ceph.ko
> license:        GPL
> description:    Ceph filesystem for Linux
> author:         Patience Warnick <patience@xxxxxxxxxxxx>
> author:         Yehuda Sadeh <yehuda@xxxxxxxxxxxxxxx>
> author:         Sage Weil <sage@xxxxxxxxxxxx>
> alias:          fs-ceph
> depends:        libceph
> intree:         Y
> vermagic:       3.17.7+ SMP mod_unload
> signer:         Magrathea: Glacier signing key
> sig_key:        D4:BB:DE:E9:C6:D8:FC:90:9F:23:59:B2:19:1B:B8:FA:57:A1:AF:D2
> sig_hashalgo:   sha256
>
> core@coreos2 /var/run/systemd/system $ modinfo libceph
> filename:       /lib/modules/3.17.7+/kernel/net/ceph/libceph.ko
> license:        GPL
> description:    Ceph filesystem for Linux
> author:         Patience Warnick <patience@xxxxxxxxxxxx>
> author:         Yehuda Sadeh <yehuda@xxxxxxxxxxxxxxx>
> author:         Sage Weil <sage@xxxxxxxxxxxx>
> depends:        libcrc32c
> intree:         Y
> vermagic:       3.17.7+ SMP mod_unload
> signer:         Magrathea: Glacier signing key
> sig_key:        D4:BB:DE:E9:C6:D8:FC:90:9F:23:59:B2:19:1B:B8:FA:57:A1:AF:D2
> sig_hashalgo:   sha256
>
>
>
> ceph is installed in Ubuntu containers (same kernel):
>
> $ dpkg -l |grep ceph
>
> ii  ceph            0.87-1trusty  amd64  distributed storage and file system
> ii  ceph-common     0.87-1trusty  amd64  common utilities to mount and interact with a ceph storage cluster
> ii  ceph-fs-common  0.87-1trusty  amd64  common utilities to mount and interact with a ceph file system
> ii  ceph-fuse       0.87-1trusty  amd64  FUSE-based client for the Ceph distributed file system
> ii  ceph-mds        0.87-1trusty  amd64  metadata server for the ceph distributed file system
> ii  libcephfs1      0.87-1trusty  amd64  Ceph distributed file system client library
> ii  python-ceph     0.87-1trusty  amd64  Python libraries for the Ceph distributed filesystem
>
>
>
> Reproducing the error:
>
> at machine 1:
> core@coreos1 /var/lib/deis/store/logs $ > test.log
> core@coreos1 /var/lib/deis/store/logs $ echo 1 > test.log
> core@coreos1 /var/lib/deis/store/logs $ stat test.log
>   File: 'test.log'
>   Size: 2    Blocks: 1    IO Block: 4194304    regular file
> Device: 0h/0d    Inode: 1099511629882    Links: 1
> Access: (0644/-rw-r--r--)  Uid: (  500/    core)   Gid: (  500/    core)
> Access: 2015-01-12 20:05:03.000000000 +0000
> Modify: 2015-01-12 20:06:09.637234229 +0000
> Change: 2015-01-12 20:06:09.637234229 +0000
>  Birth: -
>
> at machine 2:
> core@coreos2 /var/lib/deis/store/logs $ stat test.log
>   File: 'test.log'
>   Size: 2    Blocks: 1    IO Block: 4194304    regular file
> Device: 0h/0d    Inode: 1099511629882    Links: 1
> Access: (0644/-rw-r--r--)  Uid: (  500/    core)   Gid: (  500/    core)
> Access: 2015-01-12 20:05:03.000000000 +0000
> Modify: 2015-01-12 20:06:09.637234229 +0000
> Change: 2015-01-12 20:06:09.000000000 +0000
>  Birth: -
>
>
> The change time is not updated, so some tail libraries do not show new
> content until you force the change time to be updated, for example by
> running "touch" on the file.
> Some tools freeze and trigger other issues in the system.
>
>
> Tests, all on machine #2:
>
> FAILED -> https://github.com/ActiveState/tail
> FAILED -> /usr/bin/tail of a Google docker image running debian wheezy
> PASSED -> /usr/bin/tail of a ubuntu 14.04 docker image
> PASSED -> /usr/bin/tail of the coreos release 494.5.0
>
>
> On machine #1 (the same machine that is writing the file) all tests pass.
>
>
>
> On Mon, Jan 12, 2015 at 5:14 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>> What versions of all the Ceph pieces are you using? (Kernel
>> client/ceph-fuse, MDS, etc)
>>
>> Can you provide more details on exactly what the program is doing on
>> which nodes?
>> -Greg
>>
>> On Fri, Jan 9, 2015 at 5:15 PM, Lorieri <lorieri@xxxxxxxxx> wrote:
>>> the first 3 stat commands show blocks and size changing, but not the
>>> times; after a touch they change and tail works
>>>
>>> I saw some cephfs freezes related to it, it came back after touching the files
>>>
>>> coreos2 logs # stat deis-router.log
>>>   File: 'deis-router.log'
>>>   Size: 148564    Blocks: 291    IO Block: 4194304    regular file
>>> Device: 0h/0d    Inode: 1099511628780    Links: 1
>>> Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
>>> Access: 2015-01-10 01:13:00.100582619 +0000
>>> Modify: 2015-01-10 01:13:00.100582619 +0000
>>> Change: 2015-01-10 01:13:00.000000000 +0000
>>>  Birth: -
>>> coreos2 logs # stat deis-router.log
>>>   File: 'deis-router.log'
>>>   Size: 152633    Blocks: 299    IO Block: 4194304    regular file
>>> Device: 0h/0d    Inode: 1099511628780    Links: 1
>>> Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
>>> Access: 2015-01-10 01:13:00.100582619 +0000
>>> Modify: 2015-01-10 01:13:00.100582619 +0000
>>> Change: 2015-01-10 01:13:00.000000000 +0000
>>>  Birth: -
>>> coreos2 logs # stat deis-router.log
>>>   File: 'deis-router.log'
>>>   Size: 155763    Blocks: 305    IO Block: 4194304    regular file
>>> Device: 0h/0d    Inode: 1099511628780    Links: 1
>>> Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
>>> Access: 2015-01-10 01:13:00.100582619 +0000
>>> Modify: 2015-01-10 01:13:00.100582619 +0000
>>> Change: 2015-01-10 01:13:00.000000000 +0000
>>>  Birth: -
>>>
>>> coreos2 logs # touch deis-router.log
>>>
>>> coreos2 logs # stat deis-router.log
>>>   File: 'deis-router.log'
>>>   Size: 155763    Blocks: 305    IO Block: 4194304    regular file
>>> Device: 0h/0d    Inode: 1099511628780    Links: 1
>>> Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
>>> Access: 2015-01-10 01:13:46.961858103 +0000
>>> Modify: 2015-01-10 01:13:46.961858103 +0000
>>> Change: 2015-01-10 01:13:46.000000000 +0000
>>>  Birth: -
>>>
>>> On Fri, Jan 9, 2015 at 11:11 PM, Lorieri <lorieri@xxxxxxxxx> wrote:
>>>> Hi,
>>>>
>>>> I have a program that tails a file, and this file is created on
>>>> another machine
>>>>
>>>> some tail programs do not work because the modification time is not
>>>> updated on the remote machines
>>>>
>>>> I've found this old thread
>>>> http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/11001
>>>>
>>>> it mentions the problem and suggests ntp sync
>>>>
>>>> I tried to re-sync ntp and restart the ceph cluster, but the issue persists
>>>>
>>>> do you know if it is possible to avoid this behavior?
>>>>
>>>> thanks
>>>> -lorieri
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@xxxxxxxxxxxxxx
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
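
For anyone trying to confirm the same symptom on their own mount, here is a minimal sketch (not part of the original thread) that flags a file whose ctime sits on a whole second while its mtime, within that same second, still has a sub-second part. That is the pattern visible in the stat output from the reading client above. The function names `ctime_looks_truncated` and `check_path` are hypothetical helpers invented for this illustration, and the check is only a heuristic: it does not identify the cause, and a file can legitimately have a whole-second ctime.

```python
import os
import sys

NS_PER_SEC = 1_000_000_000


def ctime_looks_truncated(mtime_ns: int, ctime_ns: int) -> bool:
    """Heuristic: ctime is on a whole second, mtime is in the same second
    but carries a non-zero sub-second part."""
    same_second = mtime_ns // NS_PER_SEC == ctime_ns // NS_PER_SEC
    ctime_whole = ctime_ns % NS_PER_SEC == 0
    mtime_fractional = mtime_ns % NS_PER_SEC != 0
    return same_second and ctime_whole and mtime_fractional


def check_path(path: str) -> bool:
    """Apply the heuristic to a file's nanosecond timestamps from os.stat."""
    st = os.stat(path)
    return ctime_looks_truncated(st.st_mtime_ns, st.st_ctime_ns)


if __name__ == "__main__":
    # Usage: python check_ctime.py FILE [FILE ...]
    for path in sys.argv[1:]:
        status = "ctime looks truncated" if check_path(path) else "ok"
        print(f"{path}: {status}")
```

Run on both the writing and the reading client against the same file; with the timestamps reported in this thread, the writer would print "ok" and the reader would print "ctime looks truncated".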