Try 'umount -f'; a recent kernel should handle 'umount -f' pretty well.
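
A minimal sketch of that approach, assuming the hung cephfs mount is at /mnt/cephfs (a hypothetical path; substitute your own mount point):

    # Force the unmount; on recent kernels this aborts the client's
    # outstanding MDS/OSD requests instead of blocking on them.
    umount -f /mnt/cephfs

    # If processes still hold files open, a lazy unmount detaches the
    # mount point immediately and cleans up once the last reference is gone.
    umount -l /mnt/cephfs
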
On Wed, Aug 8, 2018 at 10:46 PM Zhenshi Zhou <deaderzzs@xxxxxxxxx> wrote:
>
> Hi,
> Is there any other way except rebooting the server when the client hangs?
> If the server is in a production environment, I can't restart it every time.
>
> On Wed, Aug 8, 2018 at 10:33 PM Webert de Souza Lima <webert.boss@xxxxxxxxx> wrote:
>>
>> Hi Zhenshi,
>>
>> If you still have the client mount hanging but no session is connected, you probably have some PID waiting on blocked IO from the cephfs mount.
>> I face that now and then, and the only solution is to reboot the server, as you won't be able to kill a process with pending IO.
>>
>> Regards,
>>
>> Webert Lima
>> DevOps Engineer at MAV Tecnologia
>> Belo Horizonte - Brasil
>> IRC NICK - WebertRLZ
>>
>> On Wed, Aug 8, 2018 at 11:17 AM Zhenshi Zhou <deaderzzs@xxxxxxxxx> wrote:
>>>
>>> Hi Webert,
>>> That command shows the current sessions, but the server from which I got those files (osdc, mdsc, monc) has been disconnected for a long time,
>>> so I cannot get useful information from the command you provided.
>>>
>>> Thanks
>>>
>>> On Wed, Aug 8, 2018 at 10:10 PM Webert de Souza Lima <webert.boss@xxxxxxxxx> wrote:
>>>>
>>>> You could also see open sessions at the MDS server by issuing `ceph daemon mds.XX session ls`
>>>>
>>>> Regards,
>>>>
>>>> Webert Lima
>>>> DevOps Engineer at MAV Tecnologia
>>>> Belo Horizonte - Brasil
>>>> IRC NICK - WebertRLZ
>>>>
>>>> On Wed, Aug 8, 2018 at 5:08 AM Zhenshi Zhou <deaderzzs@xxxxxxxxx> wrote:
>>>>>
>>>>> Hi, I found an old server which has cephfs mounted and still has the debug files.
>>>>> # cat osdc
>>>>> REQUESTS 0 homeless 0
>>>>> LINGER REQUESTS
>>>>> BACKOFFS
>>>>> # cat monc
>>>>> have monmap 2 want 3+
>>>>> have osdmap 3507
>>>>> have fsmap.user 0
>>>>> have mdsmap 55 want 56+
>>>>> fs_cluster_id -1
>>>>> # cat mdsc
>>>>> 194 mds0 getattr #10000036ae3
>>>>>
>>>>> What does this mean?
>>>>>
>>>>> On Wed, Aug 8, 2018 at 1:58 PM Zhenshi Zhou <deaderzzs@xxxxxxxxx> wrote:
>>>>>>
>>>>>> I restarted the client server, so there are no files in that directory now. I will take care of it if the client hangs next time.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> On Wed, Aug 8, 2018 at 11:23 AM Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>>>>>>>
>>>>>>> On Wed, Aug 8, 2018 at 11:02 AM Zhenshi Zhou <deaderzzs@xxxxxxxxx> wrote:
>>>>>>> >
>>>>>>> > Hi,
>>>>>>> > I checked all my ceph servers and cephfs is not mounted on any of them (maybe I unmounted it after testing), so the cluster shouldn't have run into the memory deadlock.
>>>>>>> > Besides, I checked the monitoring system, and memory and CPU usage were at normal levels while the clients hung.
>>>>>>> > Back to my question: there must be something else causing the client to hang.
>>>>>>> >
>>>>>>>
>>>>>>> Check if there are hung requests in /sys/kernel/debug/ceph/xxxx/{osdc,mdsc}.
>>>>>>>
>>>>>>> > On Wed, Aug 8, 2018 at 4:16 AM Zhenshi Zhou <deaderzzs@xxxxxxxxx> wrote:
>>>>>>> >>
>>>>>>> >> Hi, I'm not sure whether a cephfs mount that is never used, with no operations performed inside the mounted directory, would still be affected by cache flushing.
>>>>>>> >> I mounted cephfs on the OSD servers only for testing and then left it there. Anyway, I will unmount it.
>>>>>>> >>
>>>>>>> >> Thanks
>>>>>>> >>
>>>>>>> >> On Wed, Aug 8, 2018 at 03:37 John Spray <jspray@xxxxxxxxxx> wrote:
>>>>>>> >>>
>>>>>>> >>> On Tue, Aug 7, 2018 at 5:42 PM Reed Dier <reed.dier@xxxxxxxxxxx> wrote:
>>>>>>> >>> >
>>>>>>> >>> > This is the first I am hearing about this as well.
>>>>>>> >>>
>>>>>>> >>> This is not a Ceph-specific thing -- it can also affect similar systems like Lustre.
>>>>>>> >>>
>>>>>>> >>> The classic case is when, under some memory pressure, the kernel tries to free memory by flushing the client's page cache,
>>>>>>> >>> but doing the flush means allocating more memory on the server, making the memory pressure worse, until the whole thing just seizes up.
>>>>>>> >>>
>>>>>>> >>> John
>>>>>>> >>>
>>>>>>> >>> > Granted, I am using ceph-fuse rather than the kernel client at this point, but that isn’t etched in stone.
>>>>>>> >>> >
>>>>>>> >>> > Curious if there is more to share.
>>>>>>> >>> >
>>>>>>> >>> > Reed
>>>>>>> >>> >
>>>>>>> >>> > On Aug 7, 2018, at 9:47 AM, Webert de Souza Lima <webert.boss@xxxxxxxxx> wrote:
>>>>>>> >>> >
>>>>>>> >>> > On Tue, Aug 7, 2018 at 7:51 PM Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>>>>>>> >>> >>
>>>>>>> >>> >> On Tue, Aug 7, 2018 at 7:15 PM Zhenshi Zhou <deaderzzs@xxxxxxxxx> wrote:
>>>>>>> >>> >> This can cause a memory deadlock. You should avoid doing this.
>>>>>>> >>> >>
>>>>>>> >>> >> > On Tue, Aug 7, 2018 at 19:12 Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>>>>>>> >>> >> >>
>>>>>>> >>> >> >> Did you mount cephfs on the same machines that run ceph-osd?
>>>>>>> >>> >> >>
>>>>>>> >>> >
>>>>>>> >>> > I didn't know about this. I run this setup in production. :P
>>>>>>> >>> >
>>>>>>> >>> > Regards,
>>>>>>> >>> >
>>>>>>> >>> > Webert Lima
>>>>>>> >>> > DevOps Engineer at MAV Tecnologia
>>>>>>> >>> > Belo Horizonte - Brasil
>>>>>>> >>> > IRC NICK - WebertRLZ
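
For reference, a minimal sketch of the debugfs checks discussed up-thread, assuming a kernel-client mount and a kernel with debugfs available (the xxxx path component is the per-client directory under /sys/kernel/debug/ceph/, and mds.XX stands for the MDS daemon name, as in the messages above):

    # One directory per mounted kernel cephfs client instance
    ls /sys/kernel/debug/ceph/

    # In-flight OSD and MDS requests; entries that stay listed here for a
    # long time are the hung ones.
    cat /sys/kernel/debug/ceph/*/osdc
    cat /sys/kernel/debug/ceph/*/mdsc

    # On the MDS host: list the client sessions the MDS currently knows about
    ceph daemon mds.XX session ls

In the mdsc output quoted up-thread, a line like "194 mds0 getattr #10000036ae3" is a pending getattr request (tid 194) sent to mds0 for inode 0x10000036ae3 that has not yet completed.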