Re: mds "laggy"

Varun,

What version of Ceph are you running? Can you confirm whether the MDS daemon (ceph-mds) is still running or has crashed when the MDS becomes laggy/unresponsive? If it has crashed, check the MDS log for a crash report. There were a couple of Hadoop workloads that caused the MDS to misbehave for us as well.
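
A minimal sketch of those checks, assuming a single MDS named mds.a (as in the conf below) and the default log location; adjust names and paths to your install:

    ceph --version                              # Ceph release on this node
    ceph mds stat                               # MDS map state (up:active vs. laggy)
    pgrep -l ceph-mds                           # is the ceph-mds process still alive?
    tail -n 200 /var/log/ceph/ceph-mds.a.log    # look for a crash backtrace near the end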

-Noah

On Apr 24, 2013, at 12:56 AM, Varun Chandramouli <varun.c37@xxxxxxxxx> wrote:

> Hi All,
> 
> I am running the MapReduce wordcount code (on a Ceph cluster consisting of 2 VMs) on a data set of 5000-odd files (approx. 10 GB in total). Periodically, ceph health reports that the MDS is laggy/unresponsive, and I get messages like the following:
> 
> 13/04/24 10:41:00 INFO mapred.JobClient:  map 11% reduce 3%
> 13/04/24 10:42:36 INFO mapred.JobClient:  map 12% reduce 3%
> 13/04/24 10:42:45 INFO mapred.JobClient:  map 12% reduce 4%
> 13/04/24 10:44:08 INFO mapred.JobClient:  map 13% reduce 4%
> 13/04/24 10:45:29 INFO mapred.JobClient:  map 14% reduce 4%
> 13/04/24 11:06:31 INFO mapred.JobClient: Task Id : attempt_201304241023_0001_m_000706_0, Status : FAILED
> Task attempt_201304241023_0001_m_000706_0 failed to report status for 600 seconds. Killing!
> Task attempt_201304241023_0001_m_000706_0 failed to report status for 600 seconds. Killing!
> 
> I then have to restart the MDS manually, after which the job continues. Can someone please tell me the reason for this, and how to solve it? My ceph.conf is pasted below:
> 
> [global]
>         auth client required = none
>         auth cluster required = none
>         auth service required = none
> 
> [osd]
>         osd journal data = 1000
>         filestore xattr use omap = true
> #       osd data = /var/lib/ceph/osd/ceph-$id
> 
> [mon.a]
>         host = varunc4-virtual-machine
>         mon addr = 10.72.148.209:6789
> #       mon data = /var/lib/ceph/mon/ceph-a
> 
> [mds.a]
>         host = varunc4-virtual-machine
> #       mds data = /var/lib/ceph/mds/ceph-a
> 
> [osd.0]
>         host = varunc4-virtual-machine
> 
> [osd.1]
>         host = varunc5-virtual-machine
> 
> Regards
> Varun 
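
A hedged sketch of the manual recovery described above (restarting the MDS and rechecking cluster health), assuming the sysvinit-style ceph service script shipped with releases of that era and the mds.a from the conf above:

    sudo service ceph restart mds.a    # or: sudo /etc/init.d/ceph restart mds.a
    ceph mds stat                      # confirm the MDS comes back up:active
    ceph health                        # should return HEALTH_OK once it rejoins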

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



