Are there any others I need to grab, so I can do it all at once? I do
not like having to restart this one so often.

> > Yes, sort of. I have had an inconsistent pg for a while, but it is
> > on a different pool. But I take it this is related to a networking
> > issue I currently have with rsync and a broken pipe.
> >
> > Where exactly does it go wrong? The cephfs kernel client sends a
> > request to the osd, but the osd never replies?
>
> Yes. Please check whether there are hung requests in
> /sys/kernel/debug/ceph/xxx/osdc
>
> > > > I got one again:
> > > >
> > > > [<ffffffff81183503>] wait_on_page_bit_killable+0x83/0xa0
> > > > [<ffffffff811835d2>] __lock_page_or_retry+0xb2/0xc0
> > > > [<ffffffff81183997>] filemap_fault+0x3b7/0x410
> > > > [<ffffffffa055ce9c>] ceph_filemap_fault+0x13c/0x310 [ceph]
> > > > [<ffffffff811ac84c>] __do_fault+0x4c/0xc0
> > > > [<ffffffff811acce3>] do_read_fault.isra.42+0x43/0x130
> > > > [<ffffffff811b1471>] handle_mm_fault+0x6b1/0x1040
> > > > [<ffffffff81692c04>] __do_page_fault+0x154/0x450
> > > > [<ffffffff81692f35>] do_page_fault+0x35/0x90
> > > > [<ffffffff8168f148>] page_fault+0x28/0x30
> > > > [<ffffffffffffffff>] 0xffffffffffffffff
> > >
> > > This is likely caused by a hung osd request. How was your cluster
> > > health?
> > >
> > > > > Check /proc/<stuck process>/stack to find where it is stuck.
> > > > >
> > > > > > I have a process stuck in D+ writing to a cephfs kernel
> > > > > > mount. Can anything be done about this (without rebooting)?
> > > > > >
> > > > > > CentOS Linux release 7.5.1804 (Core)
> > > > > > Linux 3.10.0-514.21.2.el7.x86_64

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
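On the "grab them all at once" question at the top of the thread, a
minimal sketch of collecting every client debug file in one pass could
look like the script below. It assumes debugfs is mounted at
/sys/kernel/debug, that it runs as root, and that the per-client
directory contains the usual osdc, mdsc, monc, and caps entries (entry
names and availability can vary by kernel version); it is an
illustration, not a supported tool.

#!/usr/bin/env python3
# Sketch: dump every CephFS kernel-client debug file under debugfs in
# one pass. Assumes debugfs is mounted at /sys/kernel/debug and that
# this runs as root.

import glob
import os

DEBUG_ROOT = "/sys/kernel/debug/ceph"
# osdc/mdsc/monc list in-flight osd/mds/mon requests; caps shows cap
# state. Assumed to be present; older kernels may lack some of them.
ENTRIES = ["osdc", "mdsc", "monc", "caps"]

def main():
    for client_dir in sorted(glob.glob(os.path.join(DEBUG_ROOT, "*"))):
        for name in ENTRIES:
            path = os.path.join(client_dir, name)
            if not os.path.isfile(path):
                continue  # entry not exposed by this kernel
            print("==> %s <==" % path)
            with open(path) as fh:
                data = fh.read()
            # A non-empty osdc is the interesting case in this thread:
            # it lists requests still waiting on an OSD reply.
            print(data if data.strip() else "(empty)")

if __name__ == "__main__":
    main()

Running it while a process is stuck in D state captures all of the
debug state in one go; any lines left sitting in osdc point at the OSD
that is not replying.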