Which looks to be in a tight loop in MemoryModel::_sample:

(gdb) bt
#0  0x00007f0270d84d2d in read () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f027046dd88 in std::__basic_file<char>::xsgetn(char*, long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x00007f027046f4c5 in std::basic_filebuf<char, std::char_traits<char> >::underflow() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007f0270467ceb in std::basic_istream<char, std::char_traits<char> >& std::getline<char, std::char_traits<char>, std::allocator<char> >(std::basic_istream<char, std::char_traits<char> >&, std::basic_string<char, std::char_traits<char>, std::allocator<char> >&, char) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x000000000072bdd4 in MemoryModel::_sample(MemoryModel::snap*) ()
#5  0x00000000005658db in MDCache::check_memory_usage() ()
#6  0x00000000004ba929 in MDS::tick() ()
#7  0x0000000000794c65 in SafeTimer::timer_thread() ()
#8  0x00000000007958ad in SafeTimerThread::entry() ()
#9  0x00007f0270d7de9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0

On Mar 6, 2013, at 6:18 PM, Noah Watkins <jayhawk@xxxxxxxxxxx> wrote:

>
> On Mar 6, 2013, at 5:57 PM, Noah Watkins <jayhawk@xxxxxxxxxxx> wrote:
>
>> The MDS process in my cluster is running at 100% CPU. In fact I thought the cluster came down, but rather an ls was taking a minute. There aren't any clients active. I've left the process running in case there is any probing you'd like to do on it:
>>
>> virt res cpu
>> 4629m 88m 5260 S 92 1.1 113:32.79 ceph-mds
>>
>> Thanks,
>> Noah
>>
>
>
> This is a ceph-mds child thread under strace. The only thread
> that appears to be doing anything.
>
> root@issdm-44:/home/hadoop/hadoop-common# strace -p 3372
> Process 3372 attached - interrupt to quit
> read(1649, "7f0203235000-7f0203236000 ---p 0"..., 8191) = 4050
> read(1649, "7f0205053000-7f0205054000 ---p 0"..., 8191) = 4050
> read(1649, "7f0206e71000-7f0206e72000 ---p 0"..., 8191) = 4050
> read(1649, "7f0214144000-7f0214244000 rw-p 0"..., 8191) = 4020
> read(1649, "7f0215f62000-7f0216062000 rw-p 0"..., 8191) = 4020
> read(1649, "7f0217d80000-7f0217e80000 rw-p 0"..., 8191) = 4020
> read(1649, "7f0219b9e000-7f0219c9e000 rw-p 0"..., 8191) = 4020
> ...
>
> That file looks to be:
>
> ceph-mds 3337 root 1649r REG 0,3 0 266903 /proc/3337/maps
>
> (3337 is the parent process).
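
One way to check whether the size of /proc/3337/maps accounts for the time spent in _sample would be to count the mappings it lists. What follows is only a rough diagnostic sketch in Python, not anything taken from Ceph: the function names summarize_maps and summarize_pid are made up for illustration, and it assumes the usual Linux maps layout in which each line begins with "start-end perms" and the addresses are hexadecimal.

```python
from collections import Counter


def summarize_maps(lines):
    """Tally mappings and total bytes per permission string.

    `lines` is any iterable of text lines in /proc/<pid>/maps format
    (assumed layout: "start-end perms offset dev inode [path]").
    Returns (count, size): two Counters keyed by the perms field.
    """
    count = Counter()
    size = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or malformed line
        start, _, end = fields[0].partition("-")
        try:
            nbytes = int(end, 16) - int(start, 16)
        except ValueError:
            continue  # first field was not a hex address range
        perms = fields[1]
        count[perms] += 1
        size[perms] += nbytes
    return count, size


def summarize_pid(pid="self"):
    """Hypothetical helper: summarize /proc/<pid>/maps on a Linux host."""
    with open(f"/proc/{pid}/maps") as f:
        return summarize_maps(f)


def format_summary(count, size):
    """Render one line per permission string, most frequent first."""
    return [
        f"{perms}  {n:8d} mappings  {size[perms] / 2**20:10.1f} MiB"
        for perms, n in count.most_common()
    ]
```

Run as root on the MDS host, print("\n".join(format_summary(*summarize_pid(3337)))) would show how many mappings the file holds. A very large count would suggest the file is long enough that reading it line by line on every tick is expensive, but that is an inference to verify, not something this trace establishes.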