On Thu, Nov 23, 2017 at 5:49 PM, Andrey Klimentyev <andrey.klimentyev@xxxxxxxxx> wrote:
> The workload is... really common. It's just a bunch of PHP scripts being
> executed via php-fpm, that sometimes write a couple of files (some
> e-commerce reports). There were concerns with mmap(2) being used, but it's
> not the case, I've checked with strace.
> I am using the kernel client with a relatively fresh kernel - 4.10.0-28.
>
> I think, the simplest thing to do, would be updating to Luminous, to be
> honest. The problem is elusive and a PITA to resolve after it occurs.
> "touch" does not work, I have to change the contents of a file to forcefully
> synchronize it on every cephfs client.
>

Does it use readahead(2), madvise(2) or fadvise(2)? The 4.10 kernel does not
include the following commit:
https://github.com/ceph/ceph-client/commit/2b1ac852eb67a6e95595e576371d23519105559f

> On 20 Nov 2017 at 6:26, "Yan, Zheng" <ukernel@xxxxxxxxx> wrote:
>
>> ceph-fuse or kernel client? which version of ceph-fuse/kernel? This
>> issue can happen on ceph-fuse if fuse_disable_pagecache config is
>> false. Old version kernel has a bug that can cause this issue. the bug
>> is in splice_{read,write} and readahead code.
>>
>> On Sun, Nov 19, 2017 at 5:52 PM, Gregory Farnum <gfarnum@xxxxxxxxxx>
>> wrote:
>> > Hmm, are you mounting the filesystem using ceph-fuse? Can you describe
>> > your workload?
>> > -Greg
>> >
>> > On Fri, Nov 3, 2017 at 6:42 PM Andrey Klimentyev
>> > <andrey.klimentyev@xxxxxxxxx> wrote:
>> >>
>> >> I am absolutely incorrect, my apologies.
>> >>
>> >> caps: [mds] allow rw
>> >> caps: [mon] allow r
>> >> caps: [osd] allow rwx pool=cephfs_metadata, allow rwx pool=cephfs_data
>> >>
>> >> On 3 November 2017 at 10:40, Henrik Korkuc <lists@xxxxxxxxx> wrote:
>> >>>
>> >>> On 17-11-03 09:29, Andrey Klimentyev wrote:
>> >>>
>> >>> Thanks for a swift response.
>> >>>
>> >>> We are using 10.2.10.
>> >>>
>> >>> They all share the same set of permissions (and one key, too). Haven't
>> >>> found anything incriminating in logs, too.
>> >>>
>> >>> caps: [mon] allow r
>> >>> caps: [osd] allow class-read object_prefix rbd_children, allow rwx
>> >>> pool=rbd
>> >>>
>> >>> Are you sure you pasted correct user permissions? It looks like you
>> >>> are using RBD permissions for CephFS and this seems to be the problem.
>> >>>
>> >>> On 3 November 2017 at 00:56, Gregory Farnum <gfarnum@xxxxxxxxxx>
>> >>> wrote:
>> >>>>
>> >>>> On Thu, Nov 2, 2017 at 9:05 AM Andrey Klimentyev
>> >>>> <andrey.klimentyev@xxxxxxxxx> wrote:
>> >>>>>
>> >>>>> Hi,
>> >>>>>
>> >>>>> we've recently hit a problem in a production cluster. The gist of it
>> >>>>> is that sometimes file will be changed on one machine, but only the
>> >>>>> "change time" would propagate to others. The checksum is different.
>> >>>>> Contents, obviously, differ as well. How can I debug this?
>> >>>>>
>> >>>>> In other words, how would I approach such problem with "stuck
>> >>>>> files"? Haven't found anything on Google or troubleshooting docs.
>> >>>>
>> >>>>
>> >>>> What versions are you running?
>> >>>> The only way I can think of this happening is if one of the clients
>> >>>> had permission to access the CephFS namespace on the MDS, but not to
>> >>>> write to the OSDs which store the file data. Have you checked that
>> >>>> the clients all have the same caps? ("ceph auth list" or one of the
>> >>>> related more-specific commands will let you compare.)
>> >>>> -Greg
>> >>>>
>> >>>>> --
>> >>>>> Andrey Klimentyev,
>> >>>>> DevOps engineer @ JSC «Flant»
>> >>>>> http://flant.com/
>> >>>>> _______________________________________________
>> >>>>> ceph-users mailing list
>> >>>>> ceph-users@xxxxxxxxxxxxxx
>> >>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
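
For anyone repeating the caps comparison suggested in this thread, here is a minimal Python sketch that parses pasted "caps: [service] ..." lines and flags the RBD-style mismatch that was spotted above. The helper names (parse_caps, cephfs_cap_problems) are hypothetical, the default pool name cephfs_data is simply taken from the caps quoted in the thread, and the pool check is a rough heuristic that ignores namespace- or tag-based caps. It is not a substitute for reading the actual output of "ceph auth list".

```python
import re

# Matches pasted lines such as "caps: [osd] allow rwx pool=cephfs_data".
CAP_LINE = re.compile(r"^\s*caps:\s*\[(\w+)\]\s*(.+?)\s*$")


def parse_caps(text):
    """Return {service: capability string} from 'caps: [svc] ...' lines."""
    caps = {}
    for line in text.splitlines():
        match = CAP_LINE.match(line)
        if match:
            caps[match.group(1)] = match.group(2)
    return caps


def cephfs_cap_problems(caps, data_pool="cephfs_data"):
    """List reasons these caps look unsuitable for a CephFS client.

    Heuristic only: data_pool is an assumed pool name, and an osd cap
    with no pool= restriction at all is treated as acceptable.
    """
    problems = []
    if "mds" not in caps:
        problems.append("no mds cap")
    if "mon" not in caps:
        problems.append("no mon cap")
    osd = caps.get("osd")
    if osd is None:
        problems.append("no osd cap")
    else:
        pools = set(re.findall(r"pool=([^\s,]+)", osd))
        if pools and data_pool not in pools:
            problems.append("osd cap is restricted to pools other than " + data_pool)
    return problems


# The two cap sets quoted in the thread, with wrapped lines rejoined.
rbd_style = """
caps: [mon] allow r
caps: [osd] allow class-read object_prefix rbd_children, allow rwx pool=rbd
"""

cephfs_style = """
caps: [mds] allow rw
caps: [mon] allow r
caps: [osd] allow rwx pool=cephfs_metadata, allow rwx pool=cephfs_data
"""

print(cephfs_cap_problems(parse_caps(rbd_style)))
print(cephfs_cap_problems(parse_caps(cephfs_style)))
```

Running the same check against the caps of every client key makes it easy to see whether one client differs from the rest.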