Hey folks,
Any update on this fix getting merged? We suspect other crashes are caused by this bug.
Thanks,
Chris
On Tue, Jan 13, 2015 at 7:09 AM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
Awesome, thanks for the bug report and the fix, guys. :)
-Greg
>> On Jan 13, 2015, at 07:40, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
On Mon, Jan 12, 2015 at 11:18 PM, 严正 <zyan@xxxxxxxxxx> wrote:
> I tracked down the bug. Please try the attached patch.
>
> Regards
> Yan, Zheng
>
>
>
>
>>
>> Zheng, this looks like a kernel client issue to me, or else something
>> funny is going on with the cap flushing and the timestamps (note how
>> the reading client's ctime is set to an even second, while the mtime
>> is ~.63 seconds later and matches what the writing client sees). Any
>> ideas?
>> -Greg
>>
>> On Mon, Jan 12, 2015 at 12:19 PM, Lorieri <lorieri@xxxxxxxxx> wrote:
>>> Hi Gregory,
>>>
>>>
>>> $ uname -a
>>> Linux coreos2 3.17.7+ #2 SMP Tue Jan 6 08:22:04 UTC 2015 x86_64
>>> Intel(R) Xeon(R) CPU E5-4620 0 @ 2.20GHz GenuineIntel GNU/Linux
>>>
>>>
>>> Kernel Client, using `mount -t ceph ...`
>>>
>>>
>>> core@coreos2 /var/run/systemd/system $ modinfo ceph
>>> filename: /lib/modules/3.17.7+/kernel/fs/ceph/ceph.ko
>>> license: GPL
>>> description: Ceph filesystem for Linux
>>> author: Patience Warnick <patience@xxxxxxxxxxxx>
>>> author: Yehuda Sadeh <yehuda@xxxxxxxxxxxxxxx>
>>> author: Sage Weil <sage@xxxxxxxxxxxx>
>>> alias: fs-ceph
>>> depends: libceph
>>> intree: Y
>>> vermagic: 3.17.7+ SMP mod_unload
>>> signer: Magrathea: Glacier signing key
>>> sig_key: D4:BB:DE:E9:C6:D8:FC:90:9F:23:59:B2:19:1B:B8:FA:57:A1:AF:D2
>>> sig_hashalgo: sha256
>>>
>>> core@coreos2 /var/run/systemd/system $ modinfo libceph
>>> filename: /lib/modules/3.17.7+/kernel/net/ceph/libceph.ko
>>> license: GPL
>>> description: Ceph filesystem for Linux
>>> author: Patience Warnick <patience@xxxxxxxxxxxx>
>>> author: Yehuda Sadeh <yehuda@xxxxxxxxxxxxxxx>
>>> author: Sage Weil <sage@xxxxxxxxxxxx>
>>> depends: libcrc32c
>>> intree: Y
>>> vermagic: 3.17.7+ SMP mod_unload
>>> signer: Magrathea: Glacier signing key
>>> sig_key: D4:BB:DE:E9:C6:D8:FC:90:9F:23:59:B2:19:1B:B8:FA:57:A1:AF:D2
>>> sig_hashalgo: sha256
>>>
>>>
>>>
>>> Ceph is installed in Ubuntu containers (same kernel):
>>>
>>> $ dpkg -l |grep ceph
>>>
>>> ii  ceph            0.87-1trusty  amd64  distributed storage and file system
>>> ii  ceph-common     0.87-1trusty  amd64  common utilities to mount and interact with a ceph storage cluster
>>> ii  ceph-fs-common  0.87-1trusty  amd64  common utilities to mount and interact with a ceph file system
>>> ii  ceph-fuse       0.87-1trusty  amd64  FUSE-based client for the Ceph distributed file system
>>> ii  ceph-mds        0.87-1trusty  amd64  metadata server for the ceph distributed file system
>>> ii  libcephfs1      0.87-1trusty  amd64  Ceph distributed file system client library
>>> ii  python-ceph     0.87-1trusty  amd64  Python libraries for the Ceph distributed filesystem
>>>
>>>
>>>
>>> Reproducing the error:
>>>
>>> On machine 1:
>>> core@coreos1 /var/lib/deis/store/logs $ > test.log
>>> core@coreos1 /var/lib/deis/store/logs $ echo 1 > test.log
>>> core@coreos1 /var/lib/deis/store/logs $ stat test.log
>>> File: 'test.log'
>>> Size: 2 Blocks: 1 IO Block: 4194304 regular file
>>> Device: 0h/0d Inode: 1099511629882 Links: 1
>>> Access: (0644/-rw-r--r--) Uid: ( 500/ core) Gid: ( 500/ core)
>>> Access: 2015-01-12 20:05:03.000000000 +0000
>>> Modify: 2015-01-12 20:06:09.637234229 +0000
>>> Change: 2015-01-12 20:06:09.637234229 +0000
>>> Birth: -
>>>
>>> On machine 2:
>>> core@coreos2 /var/lib/deis/store/logs $ stat test.log
>>> File: 'test.log'
>>> Size: 2 Blocks: 1 IO Block: 4194304 regular file
>>> Device: 0h/0d Inode: 1099511629882 Links: 1
>>> Access: (0644/-rw-r--r--) Uid: ( 500/ core) Gid: ( 500/ core)
>>> Access: 2015-01-12 20:05:03.000000000 +0000
>>> Modify: 2015-01-12 20:06:09.637234229 +0000
>>> Change: 2015-01-12 20:06:09.000000000 +0000
>>> Birth: -
>>>
>>>
>>> The change time is not updated, so some tail libraries do not show new
>>> content until you force the change time to be updated, for example by
>>> running "touch" on the file.
>>> Some tools freeze and trigger other issues in the system.
>>>
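For anyone who wants to check their own mounts for the same symptom, here is a rough sketch of the comparison being made between the two stat outputs above: ctime sitting exactly on a whole second while mtime carries a sub-second part. The function names are made up for illustration, and this is only a heuristic (a file whose ctime genuinely lands on a whole second would also match), not anything taken from the Ceph code or the patch.

```python
import calendar
import os

NS_PER_SEC = 1_000_000_000


def ctime_looks_truncated(ctime_ns, mtime_ns):
    """True when ctime is on a whole second while mtime has a sub-second part."""
    return ctime_ns % NS_PER_SEC == 0 and mtime_ns % NS_PER_SEC != 0


def check_path(path):
    """Apply the check to a real file via os.stat (hypothetical helper)."""
    st = os.stat(path)
    return ctime_looks_truncated(st.st_ctime_ns, st.st_mtime_ns)


# Timestamps copied from the two stat outputs above (2015-01-12 20:06:09 UTC).
base_ns = calendar.timegm((2015, 1, 12, 20, 6, 9)) * NS_PER_SEC
mtime_ns = base_ns + 637234229
writer_ctime_ns = base_ns + 637234229  # Change: as seen on machine 1
reader_ctime_ns = base_ns              # Change: as seen on machine 2

print(ctime_looks_truncated(writer_ctime_ns, mtime_ns))  # False
print(ctime_looks_truncated(reader_ctime_ns, mtime_ns))  # True
```

Running check_path() against the same file on the writing and the reading client should show whether the two disagree in the way described here.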
>>>
>>> Tests, all on machine #2:
>>>
>>> FAILED -> https://github.com/ActiveState/tail
>>> FAILED -> /usr/bin/tail of a Google Docker image running Debian wheezy
>>> PASSED -> /usr/bin/tail of an Ubuntu 14.04 Docker image
>>> PASSED -> /usr/bin/tail of the coreos release 494.5.0
>>>
>>>
>>> On machine #1 (the same machine that writes the file), all tests pass.
>>>
>>>
>>>
>>> On Mon, Jan 12, 2015 at 5:14 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>>> What versions of all the Ceph pieces are you using? (Kernel
>>>> client/ceph-fuse, MDS, etc)
>>>>
>>>> Can you provide more details on exactly what the program is doing on
>>>> which nodes?
>>>> -Greg
>>>>
>>>> On Fri, Jan 9, 2015 at 5:15 PM, Lorieri <lorieri@xxxxxxxxx> wrote:
>>>>> The first 3 stat commands show the blocks and size changing, but not the times.
>>>>> After a touch the times change and tail works.
>>>>>
>>>>> I saw some CephFS freezes related to this; it recovered after touching the files.
>>>>>
>>>>> coreos2 logs # stat deis-router.log
>>>>> File: 'deis-router.log'
>>>>> Size: 148564 Blocks: 291 IO Block: 4194304 regular file
>>>>> Device: 0h/0d Inode: 1099511628780 Links: 1
>>>>> Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
>>>>> Access: 2015-01-10 01:13:00.100582619 +0000
>>>>> Modify: 2015-01-10 01:13:00.100582619 +0000
>>>>> Change: 2015-01-10 01:13:00.000000000 +0000
>>>>> Birth: -
>>>>> coreos2 logs # stat deis-router.log
>>>>> File: 'deis-router.log'
>>>>> Size: 152633 Blocks: 299 IO Block: 4194304 regular file
>>>>> Device: 0h/0d Inode: 1099511628780 Links: 1
>>>>> Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
>>>>> Access: 2015-01-10 01:13:00.100582619 +0000
>>>>> Modify: 2015-01-10 01:13:00.100582619 +0000
>>>>> Change: 2015-01-10 01:13:00.000000000 +0000
>>>>> Birth: -
>>>>> coreos2 logs # stat deis-router.log
>>>>> File: 'deis-router.log'
>>>>> Size: 155763 Blocks: 305 IO Block: 4194304 regular file
>>>>> Device: 0h/0d Inode: 1099511628780 Links: 1
>>>>> Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
>>>>> Access: 2015-01-10 01:13:00.100582619 +0000
>>>>> Modify: 2015-01-10 01:13:00.100582619 +0000
>>>>> Change: 2015-01-10 01:13:00.000000000 +0000
>>>>> Birth: -
>>>>>
>>>>> coreos2 logs # touch deis-router.log
>>>>>
>>>>> coreos2 logs # stat deis-router.log
>>>>> File: 'deis-router.log'
>>>>> Size: 155763 Blocks: 305 IO Block: 4194304 regular file
>>>>> Device: 0h/0d Inode: 1099511628780 Links: 1
>>>>> Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
>>>>> Access: 2015-01-10 01:13:46.961858103 +0000
>>>>> Modify: 2015-01-10 01:13:46.961858103 +0000
>>>>> Change: 2015-01-10 01:13:46.000000000 +0000
>>>>> Birth: -
>>>>>
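Since the stat output above shows the size growing even while the timestamps stay put, a follower that keys on size alone should still see the new data. Below is a minimal sketch of that idea, not how any of the tail implementations tested in this thread actually work; read_appended is a made-up name, and it assumes the file is only ever appended to or truncated.

```python
import os


def read_appended(path, offset):
    """Return (new_bytes, new_offset) using file size only, ignoring timestamps."""
    size = os.stat(path).st_size
    if size < offset:
        # File shrank, so assume it was truncated and start from the beginning.
        offset = 0
    if size == offset:
        return b"", offset
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(size - offset)
    return data, offset + len(data)
```

A caller would keep the returned offset and call read_appended again on each poll, for example once a second.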
>>>>> On Fri, Jan 9, 2015 at 11:11 PM, Lorieri <lorieri@xxxxxxxxx> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I have a program that tails a file, and this file is created on another machine.
>>>>>>
>>>>>> Some tail programs do not work because the modification time is not
>>>>>> updated on the remote machines.
>>>>>>
>>>>>> I've found this old thread:
>>>>>> http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/11001
>>>>>>
>>>>>> It mentions the problem and suggests an NTP sync.
>>>>>>
>>>>>> I tried to re-sync NTP and restart the Ceph cluster, but the issue persists.
>>>>>>
>>>>>> Do you know if it is possible to avoid this behavior?
>>>>>>
>>>>>> thanks
>>>>>> -lorieri
>>>>> _______________________________________________
>>>>> ceph-users mailing list
>>>>> ceph-users@xxxxxxxxxxxxxx
>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>