Re: Inconsistent metadata seen by CephFS-fuse clients

Am 28.04.2018 um 03:55 schrieb Yan, Zheng:
> On Fri, Apr 27, 2018 at 11:49 PM, Oliver Freyermuth
> <freyermuth@xxxxxxxxxxxxxxxxxx> wrote:
>> Dear Yan Zheng,
>>
>> Am 27.04.2018 um 15:32 schrieb Yan, Zheng:
>>> On Fri, Apr 27, 2018 at 7:10 PM, Oliver Freyermuth
>>> <freyermuth@xxxxxxxxxxxxxxxxxx> wrote:
>>>> Dear Yan Zheng,
>>>>
>>>> Am 27.04.2018 um 02:58 schrieb Yan, Zheng:
>>>>> On Thu, Apr 26, 2018 at 10:00 PM, Oliver Freyermuth
>>>>> <freyermuth@xxxxxxxxxxxxxxxxxx> wrote:
>>>>>> Dear Cephalopodians,
>>>>>>
>>>>>> just now, while our Ceph cluster is under high I/O load, we are getting user reports of files not being seen on some clients,
>>>>>> but somehow showing up after forcing a stat() syscall.
>>>>>>
>>>>>> For example, one user had added several files to a directory via an NFS client attached to nfs-ganesha (which uses libcephfs),
>>>>>> and afterwards, all other nfs-ganesha servers and 44 of our FUSE clients saw them -
>>>>>> but one single client still saw the old contents of the directory, i.e. the files seemed to be missing(!).
>>>>>> This happened both when running "ls" on the directory and when trying to access the seemingly non-existent files directly.
>>>>>>
>>>>>> I could confirm this observation also in a fresh login shell on the machine.
>>>>>>
>>>>>> Then, on the "broken" client, I entered the directory which seemed to contain only the "old" content and created a new file in there.
>>>>>> This worked fine, and all other clients saw the file immediately.
>>>>>> Also on the broken client, metadata was now updated and all other files appeared - i.e. everything was "in sync" again.
>>>>>>
>>>>>> There's nothing in the ceph-logs of our MDS, or in the syslogs of the client machine / MDS.
>>>>>>
>>>>>>
>>>>>> Another user observed the same, though not limited to one particular machine (it seems random).
>>>>>> He now runs a "stat" on the file he expects to exist (but which is not shown by "ls").
>>>>>> The stat returns "No such file", but a subsequent "ls" then lists the file, and it can be accessed normally.
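
A minimal sketch of that sequence, purely for illustration - the paths below are placeholders, not taken from the report: stat() the file that "ls" does not show, then re-list its directory.

#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

int main(void)
{
    /* Placeholder paths on a ceph-fuse mount; adjust to the affected directory. */
    const char *file = "/cephfs/somedir/expected_file";
    const char *dir  = "/cephfs/somedir";
    struct stat st;

    /* On the affected client this reportedly fails with ENOENT ... */
    if (stat(file, &st) != 0)
        printf("stat(%s): %s\n", file, strerror(errno));
    else
        printf("stat(%s): size=%lld\n", file, (long long)st.st_size);

    /* ... yet a subsequent directory listing then shows the entry. */
    DIR *dp = opendir(dir);
    if (dp == NULL) {
        perror("opendir");
        return 1;
    }
    struct dirent *de;
    while ((de = readdir(dp)) != NULL)
        printf("  %s\n", de->d_name);
    closedir(dp);
    return 0;
}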
>>>>>>
>>>>>> This feels like something is messed up concerning the client caps - these are all 12.2.4 Fuse clients.
>>>>>>
>>>>>> Any ideas how to find the cause?
>>>>>> It has only started happening recently, and only under high I/O load with many metadata operations.
>>>>>>
>>>>>
>>>>> Sounds like a bug in the readdir cache. Could you try the attached patch?
>>>>
>>>> Many thanks for the quick response and patch!
>>>> The problem is trying it out. We only observe this issue on our production cluster, randomly, especially during high load, and only after it has been running for a few days.
>>>> We don't have a test Ceph cluster of similar size and with similar load available, and I would not like to try out the patch on our production system.
>>>>
>>>> Can you extrapolate from the bugfix / patch what's the minimal setup needed to reproduce / trigger the issue?
>>>> Then we may look into setting up a minimal test setup to check whether the issue is resolved.
>>>>
>>>> All the best and many thanks,
>>>>         Oliver
>>>>
>>>
>>> I think this is the libcephfs version of
>>> http://tracker.ceph.com/issues/20467. I forgot to write the patch for
>>> libcephfs, sorry. To reproduce this, write a program that calls
>>> getdents(2) in a loop. Add an artificial delay to the loop so that the
>>> program iterates over the whole directory in about ten seconds. Run
>>> several instances of the program simultaneously on a large directory.
>>> Also make client_cache_size a little smaller than the size of the directory.
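
A rough sketch of such a repro program, under the assumptions described above - the mount path, buffer size, and delay are illustrative, and client_cache_size would be lowered separately in the client's ceph.conf:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Layout of the records returned by getdents64(2). */
struct linux_dirent64 {
    ino64_t        d_ino;
    off64_t        d_off;
    unsigned short d_reclen;
    unsigned char  d_type;
    char           d_name[];
};

int main(int argc, char **argv)
{
    /* Illustrative default; point this at a large directory on the ceph-fuse mount. */
    const char *path = argc > 1 ? argv[1] : "/cephfs/large_dir";
    char buf[4096];

    /* Iterate the directory over and over; run several instances in parallel. */
    for (;;) {
        int fd = open(path, O_RDONLY | O_DIRECTORY);
        if (fd < 0) {
            perror("open");
            return 1;
        }

        long nread;
        while ((nread = syscall(SYS_getdents64, fd, buf, sizeof(buf))) > 0) {
            /* Walk the batch just to touch every entry. */
            for (long off = 0; off < nread; ) {
                struct linux_dirent64 *d = (struct linux_dirent64 *)(buf + off);
                off += d->d_reclen;
            }
            /* Artificial delay so one full pass takes on the order of ten seconds. */
            usleep(100000);
        }
        if (nread < 0) {
            perror("getdents64");
            return 1;
        }
        close(fd);
    }
}

Each instance would be started against the same large directory, with client_cache_size set just below the number of entries in that directory, as suggested above.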
>>
>> This is strange - in case 1 where our users observed the issue,
>> the affected directory contained exactly 1 file, which some clients saw and others did not.
>> In case 2, the affected directory contained about 5 files only.
>>
>> Of course, we also have directories with many (thousands) of files in our CephFS, and they may be accessed in parallel.
>> Also, we run a massive number of parallel programs (about 2000) accessing the FS via about 40 clients.
>>
>> 1. Could this still be the same issue?
>> 2. Many thanks for the repro instructions. It seems, however, this would require quite some time,
>>    since we don't have a separate "test" instance at hand (yet) and are not experts in the field.
>>    We could try, but it won't be fast... And maybe it's nicer to have something like this in the test suite, if possible.
>>
>> Potentially, it's even faster to get the fix into the next patch release, if it's clear it cannot have bad side effects.
>>
>> Also, should we transfer this information to a ticket?
>>
>> Cheers and many thanks,
>>         Oliver
>>
> 
> I found an issue in the code that handles session stale messages. Steps
> to reproduce are at http://tracker.ceph.com/issues/23894.

Thanks, yes, this seems a lot more likely to be our issue - in particular, those directories with only a few files were indeed not very old,
so the issue could indeed have been present since directory creation!
Also, the problem only appeared once there was high load on all clients, i.e. those directories were most likely created while some clients were unresponsive.
I don't find any "stale client" messages in the logs, but since the clients came back and weren't evicted, that's also not logged, I think.

So many thanks for diagnosing this!

All the best,
Oliver

> 
> Regards
> Yan, Zheng
> 
>>>
>>> Regards
>>> Yan, Zheng
>>>
>>>>
>>>>>
>>>>> Regards
>>>>> Yan, Zheng
>>>>>
>>>>>
>>>>>> Cheers,
>>>>>>         Oliver
>>>>>>
>>>>>>
>>>>>
>>>>
>>
>>


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
