Re: cephfs kernel client hangs


 



try 'umount -f'; recent kernels should handle 'umount -f' pretty well
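
For reference, a forced unmount is issued with `umount -f`. Below is a minimal sketch of wrapping it in a small shell helper. The function name `force_umount`, the `DRY_RUN` switch, and the mount point in the usage line are illustrative assumptions, not something from this thread, and the real command needs root.

```shell
# Hypothetical helper: force-unmount a hung CephFS kernel mount.
# Set DRY_RUN=1 to print the command instead of running it.
force_umount() {
    mnt="${1:-}"
    if [ -z "$mnt" ]; then
        echo "usage: force_umount <mountpoint>" >&2
        return 2
    fi
    if [ -n "${DRY_RUN:-}" ]; then
        # Show what would be executed without touching the mount.
        echo "umount -f $mnt"
        return 0
    fi
    # -f forces the unmount even if the remote side is unreachable (needs root).
    umount -f "$mnt"
}
```

For example, `DRY_RUN=1 force_umount /mnt/cephfs` only prints the command; without `DRY_RUN` it runs it.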
On Wed, Aug 8, 2018 at 10:46 PM Zhenshi Zhou <deaderzzs@xxxxxxxxx> wrote:
>
> Hi,
> Is there any other way except rebooting the server when the client hangs?
> If the server is in a production environment, I can't restart it every time.
>
> Webert de Souza Lima <webert.boss@xxxxxxxxx> wrote on Wed, Aug 8, 2018 at 10:33 PM:
>>
>> Hi Zhenshi,
>>
>> if you still have the client mount hanging but no session is connected, you probably have some process waiting on blocked I/O from the cephfs mount.
>> I face that now and then, and the only solution is to reboot the server, as you won't be able to kill a process with pending I/O.
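
Processes blocked on I/O normally show up in uninterruptible sleep (state `D` in `ps`). A minimal sketch for listing them follows; the function names are hypothetical, and a `D` state alone does not prove that the task is waiting on CephFS, so the `wchan` column is included as a hint.

```shell
# Keep only lines whose second field (process state) starts with "D".
# Reads "pid stat wchan comm" lines on stdin.
filter_d_state() {
    awk '$2 ~ /^D/'
}

# List all tasks currently in uninterruptible sleep.
# The trailing "=" on each -o field suppresses the header line.
list_blocked_tasks() {
    ps -e -o pid= -o stat= -o wchan= -o comm= | filter_d_state
}
```

Running `list_blocked_tasks` on the hung client prints one line per blocked task, or nothing if there are none.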
>>
>> Regards,
>>
>> Webert Lima
>> DevOps Engineer at MAV Tecnologia
>> Belo Horizonte - Brasil
>> IRC NICK - WebertRLZ
>>
>>
>> On Wed, Aug 8, 2018 at 11:17 AM Zhenshi Zhou <deaderzzs@xxxxxxxxx> wrote:
>>>
>>> Hi Webert,
>>> That command shows the current sessions, but the server I got the files (osdc, mdsc, monc) from has been disconnected for a long time.
>>> So I cannot get useful information from the command you provided.
>>>
>>> Thanks
>>>
>>> Webert de Souza Lima <webert.boss@xxxxxxxxx> wrote on Wed, Aug 8, 2018 at 10:10 PM:
>>>>
>>>> You can also see open sessions on the MDS server by running `ceph daemon mds.XX session ls`
>>>>
>>>> Regards,
>>>>
>>>> Webert Lima
>>>> DevOps Engineer at MAV Tecnologia
>>>> Belo Horizonte - Brasil
>>>> IRC NICK - WebertRLZ
>>>>
>>>>
>>>> On Wed, Aug 8, 2018 at 5:08 AM Zhenshi Zhou <deaderzzs@xxxxxxxxx> wrote:
>>>>>
>>>>> Hi, I found an old server which has cephfs mounted and still has the debug files.
>>>>> # cat osdc
>>>>> REQUESTS 0 homeless 0
>>>>> LINGER REQUESTS
>>>>> BACKOFFS
>>>>> # cat monc
>>>>> have monmap 2 want 3+
>>>>> have osdmap 3507
>>>>> have fsmap.user 0
>>>>> have mdsmap 55 want 56+
>>>>> fs_cluster_id -1
>>>>> # cat mdsc
>>>>> 194     mds0    getattr  #10000036ae3
>>>>>
>>>>> What does it mean?
>>>>>
>>>>> Zhenshi Zhou <deaderzzs@xxxxxxxxx> wrote on Wed, Aug 8, 2018 at 1:58 PM:
>>>>>>
>>>>>> I restarted the client server, so there are no longer any files in that directory. I will check it the next time the client hangs.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Yan, Zheng <ukernel@xxxxxxxxx> wrote on Wed, Aug 8, 2018 at 11:23 AM:
>>>>>>>
>>>>>>> On Wed, Aug 8, 2018 at 11:02 AM Zhenshi Zhou <deaderzzs@xxxxxxxxx> wrote:
>>>>>>> >
>>>>>>> > Hi,
>>>>>>> > I checked all my ceph servers and none of them has cephfs mounted (maybe I unmounted it after testing). So the cluster did not run into a memory deadlock. Besides, I checked the monitoring system, and memory and CPU usage were at normal levels while the clients hung.
>>>>>>> > Back to my question: something else must be causing the client to hang.
>>>>>>> >
>>>>>>>
>>>>>>> Check whether there are hung requests in /sys/kernel/debug/ceph/xxxx/{osdc,mdsc}.
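
A rough sketch of that check in shell, assuming debugfs is mounted at /sys/kernel/debug and the script runs as root. The function name `summarize_ceph_debugfs` is hypothetical. It globs over every client directory under the base path instead of guessing the directory name, counts the lines in `mdsc` (each line is one pending MDS request), and prints the first line of `osdc`.

```shell
# Summarize pending requests for each kernel client under the debugfs base.
# The base directory can be overridden as the first argument (useful for testing).
summarize_ceph_debugfs() {
    base="${1:-/sys/kernel/debug/ceph}"
    for dir in "$base"/*/; do
        # Skip entries that are not readable client directories.
        [ -r "${dir}mdsc" ] || continue
        # Every non-empty line in mdsc is one in-flight MDS request.
        pending=$(grep -c . "${dir}mdsc" || true)
        # The first line of osdc is the "REQUESTS n homeless n" header.
        osd_head=$(head -n 1 "${dir}osdc" 2>/dev/null)
        echo "$(basename "$dir") mdsc_pending=$pending osdc: $osd_head"
    done
}
```

A non-zero `mdsc_pending` value, or an `osdc` header that stays non-zero across repeated runs, points at requests that are not completing.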
>>>>>>>
>>>>>>> > Zhenshi Zhou <deaderzzs@xxxxxxxxx> wrote on Wed, Aug 8, 2018 at 4:16 AM:
>>>>>>> >>
>>>>>>> >> Hi, I'm not sure whether a cephfs that is only mounted, with no operations performed inside the mounted directory, would be affected by cache flushing. I mounted cephfs on the OSD servers only for testing and then left it there. Anyway, I will unmount it.
>>>>>>> >>
>>>>>>> >> Thanks
>>>>>>> >>
>>>>>>> >> John Spray <jspray@xxxxxxxxxx> wrote on Wed, Aug 8, 2018 at 03:37:
>>>>>>> >>>
>>>>>>> >>> On Tue, Aug 7, 2018 at 5:42 PM Reed Dier <reed.dier@xxxxxxxxxxx> wrote:
>>>>>>> >>> >
>>>>>>> >>> > This is the first I am hearing about this as well.
>>>>>>> >>>
>>>>>>> >>> This is not a Ceph-specific thing -- it can also affect similar
>>>>>>> >>> systems like Lustre.
>>>>>>> >>>
>>>>>>> >>> The classic case is when under some memory pressure, the kernel tries
>>>>>>> >>> to free memory by flushing the client's page cache, but doing the
>>>>>>> >>> flush means allocating more memory on the server, making the memory
>>>>>>> >>> pressure worse, until the whole thing just seizes up.
>>>>>>> >>>
>>>>>>> >>> John
>>>>>>> >>>
>>>>>>> >>> > Granted, I am using ceph-fuse rather than the kernel client at this point, but that isn’t etched in stone.
>>>>>>> >>> >
>>>>>>> >>> > Curious if there is more to share.
>>>>>>> >>> >
>>>>>>> >>> > Reed
>>>>>>> >>> >
>>>>>>> >>> > On Aug 7, 2018, at 9:47 AM, Webert de Souza Lima <webert.boss@xxxxxxxxx> wrote:
>>>>>>> >>> >
>>>>>>> >>> >
>>>>>>> >>> > Yan, Zheng <ukernel@xxxxxxxxx> wrote on Tue, Aug 7, 2018 at 7:51 PM:
>>>>>>> >>> >>
>>>>>>> >>> >> On Tue, Aug 7, 2018 at 7:15 PM Zhenshi Zhou <deaderzzs@xxxxxxxxx> wrote:
>>>>>>> >>> >> this can cause memory deadlock. you should avoid doing this
>>>>>>> >>> >>
>>>>>>> >>> >> > Yan, Zheng <ukernel@xxxxxxxxx> wrote on Tue, Aug 7, 2018 at 19:12:
>>>>>>> >>> >> >>
>>>>>>> >>> >> >> did you mount cephfs on the same machines that run ceph-osd?
>>>>>>> >>> >> >>
>>>>>>> >>> >
>>>>>>> >>> >
>>>>>>> >>> > I didn't know about this. I run this setup in production. :P
>>>>>>> >>> >
>>>>>>> >>> > Regards,
>>>>>>> >>> >
>>>>>>> >>> > Webert Lima
>>>>>>> >>> > DevOps Engineer at MAV Tecnologia
>>>>>>> >>> > Belo Horizonte - Brasil
>>>>>>> >>> > IRC NICK - WebertRLZ
>>>>>>> >>> >
>>>>>>> >>> > _______________________________________________
>>>>>>> >>> > ceph-users mailing list
>>>>>>> >>> > ceph-users@xxxxxxxxxxxxxx
>>>>>>> >>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>>> >>> >
>>>>>>> >>> >