Hi,
Is there any other way except rebooting the server when the client hangs?
If the server is in a production environment, I can't restart it every time.
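The only alternatives I can think of are a forced or lazy unmount, or evicting the client session on the MDS side, roughly as sketched below (the mount point /mnt/cephfs, the MDS name mds.a and the session id are just placeholders), but I'm not sure any of this helps once a process is stuck in uninterruptible IO:

# try a forced, then a lazy, unmount of the hung cephfs mount point
umount -f /mnt/cephfs
umount -l /mnt/cephfs

# list the client sessions on the MDS and evict the stuck one by its id
# (mds.a and the session id are placeholders for the real names)
ceph daemon mds.a session ls
ceph tell mds.a client evict id=<session id>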
Webert de Souza Lima <webert.boss@xxxxxxxxx> wrote on Wed, Aug 8, 2018 at 10:33 PM:
Hi Zhenshi,

if you still have the client mount hanging but no session is connected, you probably have some PID waiting with blocked IO from the cephfs mount.
I face that now and then and the only solution is to reboot the server, as you won't be able to kill a process with pending IO.

Regards,
Webert Lima
DevOps Engineer at MAV Tecnologia
Belo Horizonte - Brasil
IRC NICK - WebertRLZ

On Wed, Aug 8, 2018 at 11:17 AM Zhenshi Zhou <deaderzzs@xxxxxxxxx> wrote:

Hi Webert,
That command shows the current sessions, whereas the server from which I got those files (osdc, mdsc, monc) has been disconnected for a long time.
So I cannot get useful information from the command you provided.
Thanks

Webert de Souza Lima <webert.boss@xxxxxxxxx> wrote on Wed, Aug 8, 2018 at 10:10 PM:

You could also see open sessions at the MDS server by issuing `ceph daemon mds.XX session ls`

Regards,
Webert Lima
DevOps Engineer at MAV Tecnologia
Belo Horizonte - Brasil
IRC NICK - WebertRLZ

On Wed, Aug 8, 2018 at 5:08 AM Zhenshi Zhou <deaderzzs@xxxxxxxxx> wrote:

Hi, I found an old server which mounted cephfs and still has the debug files.

# cat osdc
REQUESTS 0 homeless 0
LINGER REQUESTS
BACKOFFS

# cat monc
have monmap 2 want 3+
have osdmap 3507
have fsmap.user 0
have mdsmap 55 want 56+
fs_cluster_id -1

# cat mdsc
194 mds0 getattr #10000036ae3

What does it mean?

Zhenshi Zhou <deaderzzs@xxxxxxxxx> wrote on Wed, Aug 8, 2018 at 1:58 PM:

I restarted the client server, so there's no file in that directory. I will take care of it if the client hangs next time.
Thanks

Yan, Zheng <ukernel@xxxxxxxxx> wrote on Wed, Aug 8, 2018 at 11:23 AM:

On Wed, Aug 8, 2018 at 11:02 AM Zhenshi Zhou <deaderzzs@xxxxxxxxx> wrote:
>
> Hi,
> I checked all my ceph servers and none of them has cephfs mounted (maybe I unmounted it after testing). As a result, the cluster didn't encounter a memory deadlock. Besides, I checked the monitoring system, and memory and CPU usage were at normal levels while the clients hung.
> Back to my question: there must be something else causing the clients to hang.
>
Check if there are hung requests in /sys/kernel/debug/ceph/xxxx/{osdc,mdsc}.
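For example, something like this on the client (a sketch; it assumes the kernel client and that debugfs is mounted at /sys/kernel/debug, and the directory name under ceph/ will differ per mount):

# print any in-flight OSD and MDS requests for every cephfs kernel mount;
# non-empty output here points at the operations that are stuck
for d in /sys/kernel/debug/ceph/*/; do
    echo "== $d"
    cat "${d}osdc" "${d}mdsc" 2>/dev/null
done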
> Zhenshi Zhou <deaderzzs@xxxxxxxxx> wrote on Wed, Aug 8, 2018 at 4:16 AM:
>>
>> Hi, I'm not sure whether a server that just mounts cephfs, without doing any operations within the mounted directory, would be affected by the cache flushing. I mounted cephfs on the osd servers only for testing and then left it there. Anyway, I will umount it.
>>
>> Thanks
>>
>> John Spray <jspray@xxxxxxxxxx> wrote on Wed, Aug 8, 2018 at 03:37:
>>>
>>> On Tue, Aug 7, 2018 at 5:42 PM Reed Dier <reed.dier@xxxxxxxxxxx> wrote:
>>> >
>>> > This is the first I am hearing about this as well.
>>>
>>> This is not a Ceph-specific thing -- it can also affect similar
>>> systems like Lustre.
>>>
>>> The classic case is when under some memory pressure, the kernel tries
>>> to free memory by flushing the client's page cache, but doing the
>>> flush means allocating more memory on the server, making the memory
>>> pressure worse, until the whole thing just seizes up.
>>>
>>> John
>>>
>>> > Granted, I am using ceph-fuse rather than the kernel client at this point, but that isn’t etched in stone.
>>> >
>>> > Curious if there is more to share.
>>> >
>>> > Reed
>>> >
>>> > On Aug 7, 2018, at 9:47 AM, Webert de Souza Lima <webert.boss@xxxxxxxxx> wrote:
>>> >
>>> >
>>> > Yan, Zheng <ukernel@xxxxxxxxx> wrote on Tue, Aug 7, 2018 at 7:51 PM:
>>> >>
>>> >> On Tue, Aug 7, 2018 at 7:15 PM Zhenshi Zhou <deaderzzs@xxxxxxxxx> wrote:
>>> >> This can cause a memory deadlock. You should avoid doing this.
>>> >>
>>> >> > Yan, Zheng <ukernel@xxxxxxxxx> wrote on Tue, Aug 7, 2018 at 19:12:
>>> >> >>
>>> >> >> Did you mount cephfs on the same machines that run ceph-osd?
>>> >> >>
>>> >
>>> >
>>> > I didn't know about this. I run this setup in production. :P
>>> >
>>> > Regards,
>>> >
>>> > Webert Lima
>>> > DevOps Engineer at MAV Tecnologia
>>> > Belo Horizonte - Brasil
>>> > IRC NICK - WebertRLZ
>>> >
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com