Re: Faulting MDS clients, HEALTH_OK

I’ll try to capture the output the next time this issue arises, but in general nothing looks wrong on the server side, at least from the summary. No OSDs are down, ‘ceph health detail’ reports HEALTH_OK, and the MDS is in the up:active state.

-Chris

On 9/21/16, 10:46 AM, "Gregory Farnum" <gfarnum@xxxxxxxxxx> wrote:

    On Wed, Sep 21, 2016 at 6:30 AM, Heller, Chris <cheller@xxxxxxxxxx> wrote:
    > I’m running a production 0.94.7 Ceph cluster, and have been seeing a
    > periodic issue wherein all my MDS clients become stuck. The
    > fix so far has been to restart the active MDS (sometimes I need to restart
    > the subsequent active MDS as well).
    >
    >
    >
    > These clients are using the cephfs-hadoop API, so there is no kernel client
    > or FUSE API involved. When clients get stuck, messages like the following
    > are printed to stderr:
    >
    >
    >
    > 2016-09-21 10:31:12.285030 7fea4c7fb700  0 -- 192.168.1.241:0/1606648601 >>
    > 192.168.1.195:6801/1674 pipe(0x7feaa0a1e0f0 sd=206 :0 s=1 pgs=0 cs=0 l=0
    > c=0x7feaa0a0c500).fault
    >
    >
    >
    > I’m at somewhat of a loss on where to begin debugging this issue, and wanted
    > to ping the list for ideas.
    
    What's the full output of "ceph -s" when this happens? Have you looked
    at the MDS' admin socket's ops-in-flight, and that of the clients?
    
    http://docs.ceph.com/docs/master/cephfs/troubleshooting/ may help some as well.
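
For the next stall, the checks above could be gathered in one pass. The following is only a rough sketch, not an official tool: the helper names, the MDS name, and the client admin socket path are all hypothetical, and it assumes the "ceph" CLI is on the PATH, that the MDS command is run on the host of the active MDS, and that the libcephfs client has an admin socket configured at all.

```python
import subprocess


def build_commands(mds_name, client_asok=None):
    """Return argv lists for the cluster, MDS, and (optionally) client checks.

    mds_name and client_asok are placeholders supplied by the caller.
    """
    cmds = [
        ["ceph", "-s"],
        ["ceph", "health", "detail"],
        # Must run on the host where this MDS daemon lives.
        ["ceph", "daemon", "mds." + mds_name, "dump_ops_in_flight"],
    ]
    if client_asok:
        # Only meaningful if the client exposes an admin socket.
        cmds.append(["ceph", "--admin-daemon", client_asok, "mds_requests"])
        cmds.append(["ceph", "--admin-daemon", client_asok, "objecter_requests"])
    return cmds


def collect(mds_name, client_asok=None, run=subprocess.run):
    """Run each command and map its command line to stdout (or the error)."""
    results = {}
    for argv in build_commands(mds_name, client_asok):
        key = " ".join(argv)
        try:
            proc = run(argv, capture_output=True, text=True, timeout=30)
            results[key] = proc.stdout if proc.returncode == 0 else proc.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            # Keep going so one hung or missing command does not lose the rest.
            results[key] = "failed: %s" % exc
    return results
```

Calling collect("a", "/var/run/ceph/client.asok") while the clients are stuck (both arguments are example values) would give one dictionary of outputs to attach to a follow-up message.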
    
    >
    >
    >
    > I managed to dump the mds cache during one of the stalled moments, which
    > hopefully is a useful starting point:
    >
    >
    >
    > e51bed37327a676e9974d740a13e173f11d1a11fdba5fbcf963b62023b06d7e8
    > mdscachedump.txt.gz (https://filetea.me/t1sz3XPHxEVThOk8tvVTK5Bsg)
    >
    >
    >
    >
    >
    > -Chris
    >
    >
    >
    >
    > _______________________________________________
    > ceph-users mailing list
    > ceph-users@xxxxxxxxxxxxxx
    > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
    >
    




