You are correct; it only seems to affect recently modified files.
On Tue, Aug 30, 2016 at 3:36 AM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
On Tue, Aug 30, 2016 at 2:11 AM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
> On Mon, Aug 29, 2016 at 7:14 AM, Sean Redmond <sean.redmond1@xxxxxxxxx> wrote:
>> Hi,
>>
>> I am running cephfs (10.2.2) with kernel 4.7.0-1. I have noticed that
>> static files are frequently served empty by a web server (apache). I
>> have tracked this down further: when I run a checksum against the file
>> on the cephfs file system, on the node serving the empty http response,
>> the checksum is '00000'.
>>
>> The below shows the checksum on a defective node.
>>
>> [root@server2]# ls -al /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>> -rw-r--r-- 1 apache apache 53317 Aug 28 23:46
>> /cephfs/webdata/static/456/JHL/66448H-755h.jpg

It seems this file was modified recently. Maybe the web server
silently modifies the files. Please check if this issue happens on
older files.

Regards
Yan, Zheng

>>
>> [root@server2]# sum /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>> 00000 53
>
> So can we presume there are no file contents, and it's just 53 blocks of zeros?
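One way to answer this directly on the affected node is to read the file back and test its bytes, rather than inferring from the checksum alone. The following is a minimal sketch, assuming Python 3 is available; the helper names `bsd_sum` and `is_all_zeros` are hypothetical, and `bsd_sum` is only meant to mirror the default (BSD-style, 1 KiB block) output of `sum` so the two can be compared.

```python
def bsd_sum(path, chunk_size=1 << 20):
    """Return (checksum, blocks) using the BSD algorithm that `sum` uses by default.

    The checksum is a 16-bit value: rotate right by one bit, then add each byte.
    Blocks are counted in units of 1024 bytes, rounded up.
    """
    checksum = 0
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
            for byte in chunk:
                # 16-bit rotate right by one bit
                checksum = (checksum >> 1) + ((checksum & 1) << 15)
                # add the byte and keep the low 16 bits
                checksum = (checksum + byte) & 0xFFFF
    blocks = (total + 1023) // 1024
    return checksum, blocks


def is_all_zeros(path, chunk_size=1 << 20):
    """Return True if every byte read from the file is zero (an empty file also returns True)."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return True
            if chunk.count(0) != len(chunk):
                return False


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python3 zerocheck.py /path/to/file
    for p in sys.argv[1:]:
        print(p, bsd_sum(p), "all zeros" if is_all_zeros(p) else "has non-zero data")
```

A checksum of 0 is what an all-zero file produces, but it does not by itself prove the content is all zeros, which is why the sketch checks the bytes separately. The block count of 53 is consistent with the 53317-byte size shown by ls, since 53317 bytes round up to 53 blocks of 1024 bytes.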
>
> This doesn't sound familiar to me; Zheng, do you have any ideas?
> Anyway, ceph-fuse shouldn't be susceptible to this bug even with the
> page cache enabled; if you're just serving stuff via the web it's
> probably a better idea anyway (harder to break, easier to update,
> etc).
> -Greg
>
>>
>> The below shows the checksum on a working node.
>>
>> [root@server1]# ls -al /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>> -rw-r--r-- 1 apache apache 53317 Aug 28 23:46
>> /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>>
>> [root@server1]# sum /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>> 03620 53
>> [root@server1]#
>>
>> If I flush the cache as shown below, the checksum returns as expected and
>> the web server serves up valid content.
>>
>> [root@server2]# echo 3 > /proc/sys/vm/drop_caches
>> [root@server2]# sum /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>> 03620 53
>>
>> After some time, typically less than 1hr, the issue repeats. It seems not
>> to repeat if I take any one of the servers out of the LB and only serve
>> requests from one of the servers.
>>
>> I may try the FUSE client, as it has a mount option direct_io that
>> looks to disable the page cache.
>>
>> I have been hunting in the ML and tracker but could not see anything really
>> close to this issue. Any input or feedback on similar experiences is
>> welcome.
>>
>> Thanks
>>
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>