Thanks Greg. Perhaps this is a motivation for us to switch to ceph-fuse from the kernel client - at least that way, we could easily upgrade for bug fixes without waiting for a new kernel.
Chris
On Wed, Jan 28, 2015 at 9:32 AM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
This is in our testing branch and should go to Linus the next time we
send him stuff for merge. Unfortunately there's nobody doing CephFS
kernel backports at this time so you'll need to wait for that to come
out or spin your own. :(
-Greg
On Tue, Jan 27, 2015 at 10:46 AM, Christopher Armstrong
<chris@xxxxxxxxxxxx> wrote:
> Hey folks,
>
> Any update on this fix getting merged? We suspect other crashes based on
> this bug.
>
> Thanks,
>
> Chris
>
> On Tue, Jan 13, 2015 at 7:09 AM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>
>> Awesome, thanks for the bug report and the fix, guys. :)
>> -Greg
>>
>> On Mon, Jan 12, 2015 at 11:18 PM, 严正 <zyan@xxxxxxxxxx> wrote:
>> > I tracked down the bug. Please try the attached patch
>> >
>> > Regards
>> > Yan, Zheng
>> >
>> >
>> >
>> >
>> >> On Jan 13, 2015, at 07:40, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>> >>
>> >> Zheng, this looks like a kernel client issue to me, or else something
>> >> funny is going on with the cap flushing and the timestamps (note how
>> >> the reading client's ctime is set to an even second, while the mtime
>> >> is ~.63 seconds later and matches what the writing client sees). Any
>> >> ideas?
>> >> -Greg
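The sub-second gap Greg describes can be checked directly from the stat output quoted further down for test.log on the reading client. The following is only a sketch: `parse_stat_time` is a hypothetical helper written for this illustration, and it assumes the timestamp layout that GNU stat prints in this thread.

```python
from datetime import datetime

NS_PER_SECOND = 1_000_000_000


def parse_stat_time(text):
    """Convert a GNU stat timestamp such as
    '2015-01-12 20:06:09.637234229 +0000' to integer nanoseconds since the epoch.
    """
    date_part, time_part, offset = text.split()
    seconds_text, _, fraction = time_part.partition(".")
    whole = datetime.strptime(
        f"{date_part} {seconds_text} {offset}", "%Y-%m-%d %H:%M:%S %z"
    )
    # Pad or trim the fraction to exactly nine digits (nanoseconds).
    nanos = int(fraction.ljust(9, "0")[:9]) if fraction else 0
    return int(whole.timestamp()) * NS_PER_SECOND + nanos


# Values copied from the stat output for test.log on the reading client.
mtime_ns = parse_stat_time("2015-01-12 20:06:09.637234229 +0000")
ctime_ns = parse_stat_time("2015-01-12 20:06:09.000000000 +0000")

gap_ns = mtime_ns - ctime_ns
ctime_is_whole_second = ctime_ns % NS_PER_SECOND == 0

print(f"mtime is {gap_ns / NS_PER_SECOND:.9f} s ahead of ctime")
print(f"ctime falls on a whole second: {ctime_is_whole_second}")
```

Integer nanoseconds are used instead of floats so that the nine-digit fraction from stat is not rounded.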
>> >>
>> >> On Mon, Jan 12, 2015 at 12:19 PM, Lorieri <lorieri@xxxxxxxxx> wrote:
>> >>> Hi Gregory,
>> >>>
>> >>>
>> >>> $ uname -a
>> >>> Linux coreos2 3.17.7+ #2 SMP Tue Jan 6 08:22:04 UTC 2015 x86_64
>> >>> Intel(R) Xeon(R) CPU E5-4620 0 @ 2.20GHz GenuineIntel GNU/Linux
>> >>>
>> >>>
>> >>> Kernel Client, using `mount -t ceph ...`
>> >>>
>> >>>
>> >>> core@coreos2 /var/run/systemd/system $ modinfo ceph
>> >>> filename: /lib/modules/3.17.7+/kernel/fs/ceph/ceph.ko
>> >>> license: GPL
>> >>> description: Ceph filesystem for Linux
>> >>> author: Patience Warnick <patience@xxxxxxxxxxxx>
>> >>> author: Yehuda Sadeh <yehuda@xxxxxxxxxxxxxxx>
>> >>> author: Sage Weil <sage@xxxxxxxxxxxx>
>> >>> alias: fs-ceph
>> >>> depends: libceph
>> >>> intree: Y
>> >>> vermagic: 3.17.7+ SMP mod_unload
>> >>> signer: Magrathea: Glacier signing key
>> >>> sig_key: D4:BB:DE:E9:C6:D8:FC:90:9F:23:59:B2:19:1B:B8:FA:57:A1:AF:D2
>> >>> sig_hashalgo: sha256
>> >>>
>> >>> core@coreos2 /var/run/systemd/system $ modinfo libceph
>> >>> filename: /lib/modules/3.17.7+/kernel/net/ceph/libceph.ko
>> >>> license: GPL
>> >>> description: Ceph filesystem for Linux
>> >>> author: Patience Warnick <patience@xxxxxxxxxxxx>
>> >>> author: Yehuda Sadeh <yehuda@xxxxxxxxxxxxxxx>
>> >>> author: Sage Weil <sage@xxxxxxxxxxxx>
>> >>> depends: libcrc32c
>> >>> intree: Y
>> >>> vermagic: 3.17.7+ SMP mod_unload
>> >>> signer: Magrathea: Glacier signing key
>> >>> sig_key: D4:BB:DE:E9:C6:D8:FC:90:9F:23:59:B2:19:1B:B8:FA:57:A1:AF:D2
>> >>> sig_hashalgo: sha256
>> >>>
>> >>>
>> >>>
>> >>> Ceph is installed in Ubuntu containers (same kernel):
>> >>>
>> >>> $ dpkg -l |grep ceph
>> >>>
>> >>> ii ceph           0.87-1trusty amd64 distributed storage and file system
>> >>> ii ceph-common    0.87-1trusty amd64 common utilities to mount and interact with a ceph storage cluster
>> >>> ii ceph-fs-common 0.87-1trusty amd64 common utilities to mount and interact with a ceph file system
>> >>> ii ceph-fuse      0.87-1trusty amd64 FUSE-based client for the Ceph distributed file system
>> >>> ii ceph-mds       0.87-1trusty amd64 metadata server for the ceph distributed file system
>> >>> ii libcephfs1     0.87-1trusty amd64 Ceph distributed file system client library
>> >>> ii python-ceph    0.87-1trusty amd64 Python libraries for the Ceph distributed filesystem
>> >>>
>> >>>
>> >>>
>> >>> Reproducing the error:
>> >>>
>> >>> On machine 1:
>> >>> core@coreos1 /var/lib/deis/store/logs $ > test.log
>> >>> core@coreos1 /var/lib/deis/store/logs $ echo 1 > test.log
>> >>> core@coreos1 /var/lib/deis/store/logs $ stat test.log
>> >>> File: 'test.log'
>> >>> Size: 2 Blocks: 1 IO Block: 4194304 regular file
>> >>> Device: 0h/0d Inode: 1099511629882 Links: 1
>> >>> Access: (0644/-rw-r--r--) Uid: ( 500/ core) Gid: ( 500/ core)
>> >>> Access: 2015-01-12 20:05:03.000000000 +0000
>> >>> Modify: 2015-01-12 20:06:09.637234229 +0000
>> >>> Change: 2015-01-12 20:06:09.637234229 +0000
>> >>> Birth: -
>> >>>
>> >>> On machine 2:
>> >>> core@coreos2 /var/lib/deis/store/logs $ stat test.log
>> >>> File: 'test.log'
>> >>> Size: 2 Blocks: 1 IO Block: 4194304 regular file
>> >>> Device: 0h/0d Inode: 1099511629882 Links: 1
>> >>> Access: (0644/-rw-r--r--) Uid: ( 500/ core) Gid: ( 500/ core)
>> >>> Access: 2015-01-12 20:05:03.000000000 +0000
>> >>> Modify: 2015-01-12 20:06:09.637234229 +0000
>> >>> Change: 2015-01-12 20:06:09.000000000 +0000
>> >>> Birth: -
>> >>>
>> >>>
>> >>> The change time is not updated, so some tail libraries do not show new
>> >>> content until the change time is forced to update, for example by
>> >>> running "touch" on the file.
>> >>> Some tools freeze and trigger other issues in the system.
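The symptom reported above can be tested on a live file with a short script. This is a minimal sketch, not what any of the tail tools listed in this thread actually do: `ctime_lags_mtime`, `file_signature`, and `has_changed` are hypothetical names, and it is only an assumption that the failing tools rely on the change time. The check is a heuristic, since mtime can also sit ahead of ctime on a healthy file whose mtime was set explicitly (for example with `touch -d`).

```python
import os


def ctime_lags_mtime(path):
    """Return True when the change time is older than the modify time.

    This matches the stat output reported for the reading client, where
    Change was truncated to a whole second while Modify kept its nanoseconds.
    """
    st = os.stat(path)
    return st.st_ctime_ns < st.st_mtime_ns


def file_signature(path):
    """Return (size, mtime in ns), ignoring the change time entirely."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def has_changed(previous_signature, path):
    """Report whether size or mtime differ from an earlier signature."""
    return file_signature(path) != previous_signature
```

A poller would call `file_signature` once, sleep, and then call `has_changed` with the saved value. Keying on size as well as mtime matters for the earlier deis-router.log report, where the size and block count grew while all three timestamps stayed the same.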
>> >>>
>> >>>
>> >>> Tests, all on machine #2:
>> >>>
>> >>> FAILED -> https://github.com/ActiveState/tail
>> >>> FAILED -> /usr/bin/tail of a Google Docker image running Debian wheezy
>> >>> PASSED -> /usr/bin/tail of an Ubuntu 14.04 Docker image
>> >>> PASSED -> /usr/bin/tail of the CoreOS release 494.5.0
>> >>>
>> >>>
>> >>> On machine #1 (the same machine that is writing the file), all tests
>> >>> pass.
>> >>>
>> >>>
>> >>>
>> >>> On Mon, Jan 12, 2015 at 5:14 PM, Gregory Farnum <greg@xxxxxxxxxxx>
>> >>> wrote:
>> >>>> What versions of all the Ceph pieces are you using? (Kernel
>> >>>> client/ceph-fuse, MDS, etc)
>> >>>>
>> >>>> Can you provide more details on exactly what the program is doing on
>> >>>> which nodes?
>> >>>> -Greg
>> >>>>
>> >>>> On Fri, Jan 9, 2015 at 5:15 PM, Lorieri <lorieri@xxxxxxxxx> wrote:
>> >>>>> The first 3 stat commands show the blocks and size changing, but not
>> >>>>> the times.
>> >>>>> After a touch, the times change and tail works.
>> >>>>>
>> >>>>> I saw some CephFS freezes related to this; it recovered after touching
>> >>>>> the files.
>> >>>>>
>> >>>>> coreos2 logs # stat deis-router.log
>> >>>>> File: 'deis-router.log'
>> >>>>> Size: 148564 Blocks: 291 IO Block: 4194304 regular file
>> >>>>> Device: 0h/0d Inode: 1099511628780 Links: 1
>> >>>>> Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
>> >>>>> Access: 2015-01-10 01:13:00.100582619 +0000
>> >>>>> Modify: 2015-01-10 01:13:00.100582619 +0000
>> >>>>> Change: 2015-01-10 01:13:00.000000000 +0000
>> >>>>> Birth: -
>> >>>>> coreos2 logs # stat deis-router.log
>> >>>>> File: 'deis-router.log'
>> >>>>> Size: 152633 Blocks: 299 IO Block: 4194304 regular file
>> >>>>> Device: 0h/0d Inode: 1099511628780 Links: 1
>> >>>>> Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
>> >>>>> Access: 2015-01-10 01:13:00.100582619 +0000
>> >>>>> Modify: 2015-01-10 01:13:00.100582619 +0000
>> >>>>> Change: 2015-01-10 01:13:00.000000000 +0000
>> >>>>> Birth: -
>> >>>>> coreos2 logs # stat deis-router.log
>> >>>>> File: 'deis-router.log'
>> >>>>> Size: 155763 Blocks: 305 IO Block: 4194304 regular file
>> >>>>> Device: 0h/0d Inode: 1099511628780 Links: 1
>> >>>>> Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
>> >>>>> Access: 2015-01-10 01:13:00.100582619 +0000
>> >>>>> Modify: 2015-01-10 01:13:00.100582619 +0000
>> >>>>> Change: 2015-01-10 01:13:00.000000000 +0000
>> >>>>> Birth: -
>> >>>>>
>> >>>>> coreos2 logs # touch deis-router.log
>> >>>>>
>> >>>>> coreos2 logs # stat deis-router.log
>> >>>>> File: 'deis-router.log'
>> >>>>> Size: 155763 Blocks: 305 IO Block: 4194304 regular file
>> >>>>> Device: 0h/0d Inode: 1099511628780 Links: 1
>> >>>>> Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
>> >>>>> Access: 2015-01-10 01:13:46.961858103 +0000
>> >>>>> Modify: 2015-01-10 01:13:46.961858103 +0000
>> >>>>> Change: 2015-01-10 01:13:46.000000000 +0000
>> >>>>> Birth: -
>> >>>>>
>> >>>>> On Fri, Jan 9, 2015 at 11:11 PM, Lorieri <lorieri@xxxxxxxxx> wrote:
>> >>>>>> Hi,
>> >>>>>>
>> >>>>>> I have a program that tails a file, and this file is created on
>> >>>>>> another machine.
>> >>>>>>
>> >>>>>> Some tail programs do not work because the modification time is
>> >>>>>> not updated on the remote machines.
>> >>>>>>
>> >>>>>> I found this old thread:
>> >>>>>> http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/11001
>> >>>>>>
>> >>>>>> It mentions the problem and suggests an NTP sync.
>> >>>>>>
>> >>>>>> I tried to re-sync NTP and restart the Ceph cluster, but the issue
>> >>>>>> persists.
>> >>>>>>
>> >>>>>> Do you know if it is possible to avoid this behavior?
>> >>>>>>
>> >>>>>> thanks
>> >>>>>> -lorieri
>> >>>>> _______________________________________________
>> >>>>> ceph-users mailing list
>> >>>>> ceph-users@xxxxxxxxxxxxxx
>> >>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>> >