Re: cephfs page cache

On Fri, Sep 2, 2016 at 11:35 AM, Sean Redmond <sean.redmond1@xxxxxxxxx> wrote:
> Hi,
>
> That makes sense. I have worked around this by forcing the sync within the
> application running under apache, and it is working very well now without the
> need for the 'sync' mount option.
>
> What is interesting is that the pastebin provided below shows a way to
> replicate this. I was just using wget to download a file to the ceph file
> system instead of using apache to do the upload, just to simplify it, but
> maybe wget is also using memory-mapped IO.

That appears to be the case, yeah:
https://lists.gnu.org/archive/html/bug-wget/2013-09/msg00004.html

I'm starting to feel a little better. Glad you found a workaround. :)
-Greg
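
The thread does not show the code behind the workaround mentioned above, so the following is only a minimal sketch of what "forcing the sync within the application" could look like for a plain buffered write. The function name `write_and_sync` is hypothetical, and the use of `os.fsync` is an assumption about how the sync was forced.

```python
import os


def write_and_sync(path: str, data: bytes) -> None:
    """Write data to path and block until the kernel reports it flushed.

    Hypothetical helper: the thread does not say which call the
    application actually used to force the sync.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # push Python's userspace buffer to the kernel
        os.fsync(f.fileno())  # ask the kernel to write the file's dirty pages
```

A per-file sync like this only delays the one upload request, whereas the 'sync' mount option applies to every write on the mount.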


>
> http://pastebin.com/QK8AemAb
>
> Thanks
>
> On Fri, Sep 2, 2016 at 6:32 PM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>>
>> On Thu, Sep 1, 2016 at 8:02 AM, Sean Redmond <sean.redmond1@xxxxxxxxx>
>> wrote:
>> > Hi,
>> >
>> > It seems to be using the mmap() syscall; from what I read, this indicates
>> > it is using memory-mapped IO.
>> >
>> > Please see a strace here: http://pastebin.com/6wjhSNrP
>>
>> Zheng meant: is Apache using memory-mapped IO? From a quick Google search
>> it does in some configurations, but I'm not sure how common that is.
>>
>> We ask because Ceph does not synchronize mmap IO for you and Apache
>> probably isn't doing it either; that would fit the symptoms you're
>> seeing. Regular buffered IO should not be exhibiting any of these
>> issues, although obviously we can't guarantee there are no bugs.
>> -Greg
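
As an illustration of the point above, a writer that goes through a shared memory mapping has to synchronize the mapping itself. This is a minimal sketch, not code from Apache or wget; the function name `mmap_write_and_sync` is hypothetical. In Python, `mmap.flush()` is the call that issues `msync()` on the mapping.

```python
import mmap
import os


def mmap_write_and_sync(path: str, data: bytes) -> None:
    """Write data through a shared memory mapping, then sync it explicitly."""
    if not data:
        raise ValueError("cannot map an empty file")
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.ftruncate(fd, len(data))  # the file must be sized before mapping
        with mmap.mmap(fd, len(data), access=mmap.ACCESS_WRITE) as mm:
            mm[:] = data             # dirties pages in the mapping only
            mm.flush()               # msync(): write the dirty pages back
        os.fsync(fd)                 # also flush anything left for the file
    finally:
        os.close(fd)
```

Without the `mm.flush()` call, the data reaches the file system whenever the kernel decides to write the dirty pages back, which is the window the symptoms in this thread would fit.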
>>
>> >
>> > Thanks
>> >
>> > On Wed, Aug 31, 2016 at 5:51 PM, Sean Redmond <sean.redmond1@xxxxxxxxx>
>> > wrote:
>> >>
>> >> I am not sure how to tell?
>> >>
>> >> Server1 and Server2 mount the ceph file system using kernel client 4.7.2,
>> >> and I can replicate the problem using '/usr/bin/sum' to read the file or
>> >> an HTTP GET request via a web server (apache).
>> >>
>> >> On Wed, Aug 31, 2016 at 2:38 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>> >>>
>> >>> On Wed, Aug 31, 2016 at 12:49 AM, Sean Redmond
>> >>> <sean.redmond1@xxxxxxxxx>
>> >>> wrote:
>> >>> > Hi,
>> >>> >
>> >>> > I have been able to pick through the process a little further and
>> >>> > replicate it via the command line. The flow seems to look like this:
>> >>> >
>> >>> > 1) The user uploads an image to the web server 'uploader01', and it
>> >>> > gets written to a path such as
>> >>> > '/cephfs/webdata/static/456/JHL/66448H-755h.jpg' on cephfs.
>> >>> >
>> >>> > 2) The MDS makes the metadata for this new file available immediately
>> >>> > to all clients.
>> >>> >
>> >>> > 3) The 'uploader01' server asynchronously commits the file contents to
>> >>> > disk, as sync is not explicitly called during the upload.
>> >>> >
>> >>> > 4) Before step 3 is done, the visitor requests the file via one of the
>> >>> > two web servers, server1 or server2. The MDS provides the metadata, but
>> >>> > the contents of the file are not committed to disk yet, so the read
>> >>> > returns 0's. This is then cached by the file system page cache until it
>> >>> > expires or is flushed manually.
>> >>>
>> >>> do server1 or server2 use memory-mapped IO to read the file?
>> >>>
>> >>> Regards
>> >>> Yan, Zheng
>> >>>
>> >>> >
>> >>> > 5) As step 4 typically only happens on one of the two web servers
>> >>> > before step 3 is complete, we get a mismatch between the file system
>> >>> > page caches of server1 and server2.
>> >>> >
>> >>> > The below demonstrates how to reproduce this issue
>> >>> >
>> >>> > http://pastebin.com/QK8AemAb
>> >>> >
>> >>> > As we can see, the checksum of the file returned by the web server is
>> >>> > 0, as the file contents have not been flushed to disk from server
>> >>> > uploader01.
>> >>> >
>> >>> > If however we call ‘sync’ as shown below the checksum is correct:
>> >>> >
>> >>> > http://pastebin.com/p4CfhEFt
>> >>> >
>> >>> > If we wait for 10 seconds for the kernel to flush the dirty pages, we
>> >>> > can also see that the checksum is valid:
>> >>> >
>> >>> > http://pastebin.com/1w6UZzNQ
>> >>> >
>> >>> > It looks like it may be a race between the time it takes the
>> >>> > uploader01 server to commit the file to the file system and the fast
>> >>> > incoming read request from the visiting user to server1 or server2.
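
To spot the symptom described here from a script rather than by eyeballing `sum` output, one could check whether a file with a non-zero size reads back as nothing but zero bytes. This is a minimal diagnostic sketch; the helper name `looks_zero_filled` and the chunk size are hypothetical choices, not something taken from the thread or from Ceph.

```python
def looks_zero_filled(path: str, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file is non-empty and every byte read is zero."""
    saw_data = False
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break                      # end of file
            saw_data = True
            if chunk.count(b"\x00") != len(chunk):
                return False               # found at least one non-zero byte
    return saw_data                        # empty files are not flagged
```

Running this on server1 and server2 against the same path would show which node holds the stale, zero-filled pages. A file that legitimately contains only zeros is flagged as well, so the result is only a hint.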
>> >>> >
>> >>> > Thanks
>> >>> >
>> >>> >
>> >>> > On Tue, Aug 30, 2016 at 10:21 AM, Sean Redmond
>> >>> > <sean.redmond1@xxxxxxxxx>
>> >>> > wrote:
>> >>> >>
>> >>> >> You are correct, it only seems to impact recently modified files.
>> >>> >>
>> >>> >> On Tue, Aug 30, 2016 at 3:36 AM, Yan, Zheng <ukernel@xxxxxxxxx>
>> >>> >> wrote:
>> >>> >>>
>> >>> >>> On Tue, Aug 30, 2016 at 2:11 AM, Gregory Farnum
>> >>> >>> <gfarnum@xxxxxxxxxx>
>> >>> >>> wrote:
>> >>> >>> > On Mon, Aug 29, 2016 at 7:14 AM, Sean Redmond
>> >>> >>> > <sean.redmond1@xxxxxxxxx>
>> >>> >>> > wrote:
>> >>> >>> >> Hi,
>> >>> >>> >>
>> >>> >>> >> I am running cephfs (10.2.2) with kernel 4.7.0-1. I have noticed
>> >>> >>> >> that static files frequently show up empty when served via a web
>> >>> >>> >> server (apache). I have tracked this down further and can see that
>> >>> >>> >> when running a checksum against the file on the cephfs file system,
>> >>> >>> >> on the node serving the empty HTTP response, the checksum is '00000'.
>> >>> >>> >>
>> >>> >>> >> The below shows the checksum on a defective node.
>> >>> >>> >>
>> >>> >>> >> [root@server2]# ls -al /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>> >>> >>> >> -rw-r--r-- 1 apache apache 53317 Aug 28 23:46 /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>> >>> >>>
>> >>> >>> It seems this file was modified recently. Maybe the web server
>> >>> >>> silently modifies the files. Please check if this issue happens on
>> >>> >>> older files.
>> >>> >>>
>> >>> >>> Regards
>> >>> >>> Yan, Zheng
>> >>> >>>
>> >>> >>> >>
>> >>> >>> >> [root@server2]# sum /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>> >>> >>> >> 00000    53
>> >>> >>> >
>> >>> >>> > So can we presume there are no file contents, and it's just 53
>> >>> >>> > blocks
>> >>> >>> > of zeros?
>> >>> >>> >
>> >>> >>> > This doesn't sound familiar to me; Zheng, do you have any ideas?
>> >>> >>> > Anyway, ceph-fuse shouldn't be susceptible to this bug even with
>> >>> >>> > the
>> >>> >>> > page cache enabled; if you're just serving stuff via the web
>> >>> >>> > it's
>> >>> >>> > probably a better idea anyway (harder to break, easier to
>> >>> >>> > update,
>> >>> >>> > etc).
>> >>> >>> > -Greg
>> >>> >>> >
>> >>> >>> >>
>> >>> >>> >> The below shows the checksum on a working node.
>> >>> >>> >>
>> >>> >>> >> [root@server1]# ls -al /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>> >>> >>> >> -rw-r--r-- 1 apache apache 53317 Aug 28 23:46 /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>> >>> >>> >>
>> >>> >>> >> [root@server1]# sum /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>> >>> >>> >> 03620    53
>> >>> >>> >> [root@server1]#
>> >>> >>> >>
>> >>> >>> >> If I flush the cache as shown below, the checksum returns as expected
>> >>> >>> >> and the web server serves up valid content.
>> >>> >>> >>
>> >>> >>> >> [root@server2]# echo 3 > /proc/sys/vm/drop_caches
>> >>> >>> >> [root@server2]# sum /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>> >>> >>> >> 03620    53
>> >>> >>> >>
>> >>> >>> >> After some time, typically less than 1hr, the issue repeats. It seems
>> >>> >>> >> not to repeat if I take any one of the servers out of the LB and only
>> >>> >>> >> serve requests from one of the servers.
>> >>> >>> >>
>> >>> >>> >> I may try the FUSE client, which has a mount option direct_io that
>> >>> >>> >> looks to disable the page cache.
>> >>> >>> >>
>> >>> >>> >> I have been hunting in the ML and tracker but could not see anything
>> >>> >>> >> really close to this issue. Any input or feedback on similar
>> >>> >>> >> experiences is welcome.
>> >>> >>> >>
>> >>> >>> >> Thanks
>> >>> >>> >>
>> >>> >>> >>
>> >>> >>> >> _______________________________________________
>> >>> >>> >> ceph-users mailing list
>> >>> >>> >> ceph-users@xxxxxxxxxxxxxx
>> >>> >>> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >
>> >>
>> >>
>> >
>
>