On Sat, Apr 28, 2018 at 10:25 AM, Oliver Freyermuth <freyermuth@xxxxxxxxxxxxxxxxxx> wrote:
> On 28.04.2018 at 03:55, Yan, Zheng wrote:
>> On Fri, Apr 27, 2018 at 11:49 PM, Oliver Freyermuth <freyermuth@xxxxxxxxxxxxxxxxxx> wrote:
>>> Dear Yan Zheng,
>>>
>>> On 27.04.2018 at 15:32, Yan, Zheng wrote:
>>>> On Fri, Apr 27, 2018 at 7:10 PM, Oliver Freyermuth <freyermuth@xxxxxxxxxxxxxxxxxx> wrote:
>>>>> Dear Yan Zheng,
>>>>>
>>>>> On 27.04.2018 at 02:58, Yan, Zheng wrote:
>>>>>> On Thu, Apr 26, 2018 at 10:00 PM, Oliver Freyermuth <freyermuth@xxxxxxxxxxxxxxxxxx> wrote:
>>>>>>> Dear Cephalopodians,
>>>>>>>
>>>>>>> Just now that our Ceph cluster is under high I/O load, we get user reports of files not being seen on some clients,
>>>>>>> but somehow showing up after forcing a stat() syscall.
>>>>>>>
>>>>>>> For example, one user had added several files to a directory via an NFS client attached to nfs-ganesha (which uses libcephfs).
>>>>>>> Afterwards, all other nfs-ganesha servers saw them, and so did 44 of our Fuse clients -
>>>>>>> but one single client still saw the old contents of the directory, i.e. the files seemed to be missing(!).
>>>>>>> This happened both when using "ls" on the directory and when trying to access the supposedly non-existent files directly.
>>>>>>>
>>>>>>> I could confirm this observation in a fresh login shell on the machine as well.
>>>>>>>
>>>>>>> Then, on the "broken" client, I entered the directory which seemed to contain only the "old" content, and I created a new file in there.
>>>>>>> This worked fine, and all other clients saw the file immediately.
>>>>>>> Also on the broken client, metadata was now updated and all other files appeared - i.e. everything was "in sync" again.
>>>>>>>
>>>>>>> There's nothing in the ceph logs of our MDS, or in the syslogs of the client machine / MDS.
>>>>>>>
>>>>>>> Another user observed the same, but not explicitly limited to one machine (it seems random).
>>>>>>> He now uses a "stat" on the file he expects to exist (but which is not seen with "ls").
>>>>>>> The stat returns "No such file", but a subsequent "ls" then lists the file, and it can be accessed normally.
>>>>>>>
>>>>>>> This feels like something is messed up concerning the client caps - these are all 12.2.4 Fuse clients.
>>>>>>>
>>>>>>> Any ideas how to find the cause?
>>>>>>> It has only started happening recently, and only under high I/O load with many metadata operations.
>>>>>>
>>>>>> Sounds like a bug in the readdir cache. Could you try the attached patch?
>>>>>
>>>>> Many thanks for the quick response and patch!
>>>>> The problem is trying it out. We only observe this issue on our production cluster, randomly, especially during high load, and only after it has been running for a few days.
>>>>> We don't have a test Ceph cluster available of similar size and with similar load, and I would not like to try out the patch on our production system.
>>>>>
>>>>> Can you extrapolate from the bugfix / patch what the minimal setup needed to reproduce / trigger the issue would be?
>>>>> Then we may look into setting up a minimal test setup to check whether the issue is resolved.
>>>>>
>>>>> All the best and many thanks,
>>>>> Oliver
>>>>
>>>> I think this is the libcephfs version of http://tracker.ceph.com/issues/20467.
>>>> I forgot to write the patch for libcephfs, sorry. To reproduce this, write a program that calls
>>>> getdents(2) in a loop. Add an artificial delay to the loop, so that the program iterates over the
>>>> whole directory in about ten seconds. Run several instances of the program simultaneously on a
>>>> large directory. Also make client_cache_size a little smaller than the size of the directory.
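Below is a minimal sketch of such a reproducer, in case it helps anyone setting up a test. It walks a directory with raw getdents64(2) calls and sleeps between batches so that one full pass takes on the order of ten seconds. Everything specific in it is an assumption for illustration only - the default mount path /mnt/cephfs/testdir, the buffer size and the delay - and would need tuning for a real test.

/* getdents_loop.c - sketch of the reproducer described above:
 * repeatedly iterate a directory with getdents64(2), with an
 * artificial delay between batches. Path and delay are illustrative. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Layout of the records returned by getdents64(2). */
struct linux_dirent64 {
    unsigned long long d_ino;
    long long          d_off;
    unsigned short     d_reclen;
    unsigned char      d_type;
    char               d_name[];
};

int main(int argc, char **argv)
{
    /* Directory to iterate - the default path is only an assumed example. */
    const char *dir = (argc > 1) ? argv[1] : "/mnt/cephfs/testdir";
    char buf[4096];

    for (;;) {                                   /* iterate the directory over and over */
        int fd = open(dir, O_RDONLY | O_DIRECTORY);
        if (fd < 0) {
            perror("open");
            return 1;
        }
        long entries = 0;
        for (;;) {
            long n = syscall(SYS_getdents64, fd, buf, sizeof(buf));
            if (n <= 0)                          /* end of directory, or an error */
                break;
            for (long pos = 0; pos < n; ) {      /* walk the records in this batch */
                struct linux_dirent64 *d = (struct linux_dirent64 *)(buf + pos);
                entries++;
                pos += d->d_reclen;
            }
            /* Artificial delay between getdents64() calls; tune it so that one
             * full pass over the directory takes roughly ten seconds. */
            usleep(100 * 1000);
        }
        printf("pass finished, %ld entries seen\n", entries);
        close(fd);
        sleep(1);
    }
}

Compile with something like gcc -O2 getdents_loop.c -o getdents_loop, start several copies at once against the same large directory on the CephFS mount, and lower client_cache_size (the option Yan names above) in ceph.conf on the test client so that it is a little smaller than the number of entries in that directory.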
>>>
>>> This is strange - in case 1, where our users observed the issue,
>>> the affected directory contained exactly 1 file, which some clients saw and others did not.
>>> In case 2, the affected directory contained only about 5 files.
>>>
>>> Of course, we also have directories with many (thousands of) files in our CephFS, and they may be accessed in parallel.
>>> Also, we run a massive number of parallel programs (about 2000) accessing the FS via about 40 clients.
>>>
>>> 1. Could this still be the same issue?
>>> 2. Many thanks for the repro instructions. It seems, however, this would require quite an amount of time,
>>> since we don't have a separate "test" instance at hand (yet) and are not experts in the field.
>>> We could try, but it won't be fast... And maybe it's nicer to have something like this in the test suite, if possible.
>>>
>>> Potentially, it's even faster to get the fix into the next patch release, if it's clear this cannot have bad side effects.
>>>
>>> Also, should we transfer this information to a ticket?
>>>
>>> Cheers and many thanks,
>>> Oliver
>>
>> I found an issue in the code that handles session stale messages. Steps
>> to reproduce are at http://tracker.ceph.com/issues/23894.
>
> Thanks, yes, this seems a lot more likely to be our issue - especially since those directories with only a few files were indeed not very old,
> so the issue could have been present since directory creation!
> Also, the problem only appeared once there was high load on all clients, i.e. those directories were most likely created when some clients were unresponsive.
> I don't find any "stale client" messages in the logs, but since the clients came back and weren't evicted, that's also not logged, I think.
>
> So many thanks for diagnosing this!

Fix for this bug: https://github.com/ceph/ceph/pull/21712

> All the best,
> Oliver
>
>> Regards
>> Yan, Zheng
>>
>>>> Regards
>>>> Yan, Zheng
>>>>
>>>>>> Regards
>>>>>> Yan, Zheng
>>>>>>
>>>>>>> Cheers,
>>>>>>> Oliver
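As a footnote to the behaviour described at the top of the thread (a stat() on the expected name returning "No such file", after which a subsequent "ls" does show it), here is a small sketch of that kind of check; the mount path and file name in it are purely illustrative assumptions. It stat()s the expected entry first and then re-lists the parent directory with readdir(), which on an affected client is where the entry reportedly showed up again.

/* stat_then_list.c - sketch of the check described earlier in the thread. */
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

int main(void)
{
    /* Both names are illustrative assumptions, not values from the thread. */
    const char *dir  = "/mnt/cephfs/testdir";
    const char *name = "expected_file.dat";
    char path[4096];
    struct stat st;

    /* Step 1: stat() the file we expect to exist. On an affected client this
     * reportedly failed with "No such file or directory". */
    snprintf(path, sizeof(path), "%s/%s", dir, name);
    if (stat(path, &st) != 0)
        printf("stat(%s): %s\n", path, strerror(errno));
    else
        printf("stat(%s): size=%lld\n", path, (long long)st.st_size);

    /* Step 2: list the parent directory again and look for the entry. */
    DIR *dp = opendir(dir);
    if (!dp) {
        perror("opendir");
        return 1;
    }
    struct dirent *de;
    while ((de = readdir(dp)) != NULL) {
        if (strcmp(de->d_name, name) == 0)
            printf("readdir: found %s\n", de->d_name);
    }
    closedir(dp);
    return 0;
}

Running this on a client suspected of holding a stale directory listing, and comparing the output with a healthy client, should make the mismatch easy to see.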