"Too many open files not handled on operation 24" This is the reason. You need to increase the fd size limit. On Thu, May 15, 2014 at 6:06 PM, Cao, Buddy <buddy.cao at intel.com> wrote: > Hi, > > > > One of the osd in my cluster downs w no reason, I saw the error message in > the log below, I restarted osd, but after several hours, the problem come > back again. Could you help? > > > > ?Too many open files not handled on operation 24 (541468.0.1, or op 1, > counting from 0) > > -96> 2014-05-14 22:12:24.281185 7f617b33e700 5 -- op tracker -- , seq: > 788808, time: 2014-05-14 22:12:24.281164, event: reached_pg, request: > osd_op(client.21276.0:3884815 rb.0.31c7.238e1f 29.000000003c15 [write > 2273280~65536] 4.110fcf4 e12271) v4 > > -95> 2014-05-14 22:12:24.281192 7f618556d700 0 > filestore(/var/lib/ceph/osd/ceph-3) unexpected error code > > -94> 2014-05-14 22:12:24.281197 7f6181b4b700 5 -- op tracker -- , seq: > 788843, time: 2014-05-14 22:12:24.281011, event: header_read, request: > > osd_op(client.21276.0:3884929 rb.0.31c7.238e1 f29.000000005614 [write > 3137536~65536] 4.63e147e e12271) v4 > >> 2014-05-14 22:12:24.289987 7f6185d6e700 -1 os/FileStore.cc: In function >> 'unsigned int FileStore::_do_transaction(ObjectStore::Transaction&, >> uint64_t, int, ThreadPool::TPHandle*)' thre ad 7f6185d6e700 time 2014-05-14 >> 22:12:24.282488 > > os/FileStore.cc: 2448: FAILED assert(0 == "unexpected error") > > ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60) > > 1: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, > int, ThreadPool::TPHandle*)+0x11c3) [0x723a43] > > 2: (FileStore::_do_transactions(std::list<ObjectStore::Transaction*, > std::allocator<ObjectStore::Transaction*> >&, unsigned long, > ThreadPool::TPHandle*)+0x74) [0x72a4d4] > > 3: (FileStore::_do_op(FileStore::OpSequencer*, ThreadPool::TPHandle&)+0x29a) > [0x72a78a] > > 4: (ThreadPool::worker(ThreadPool::WorkThread*)+0x551) [0x988f21] > > 5: (ThreadPool::WorkThread::entry()+0x10) 
[0x98bf50] > > 6: /lib64/libpthread.so.0() [0x3a7ce079d1] > > 7: (clone()+0x6d) [0x3a7cae8b6d] > > NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to > interpret this.???.. > > > > > > #iostat > > avg-cpu: %user %nice %system %iowait %steal %idle > > 0.44 0.00 0.14 0.41 0.00 99.01 > > Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn > > sdb 1.23 0.10 35.72 12738 4762008 > > sdc 5.25 214.25 1288.81 28564314 171824232 > > sdd 4.16 139.98 1021.69 18662490 136211888 > > sde 4.61 207.50 1039.20 27663258 138545960 > > sdf 7.94 203.24 2530.63 27095930 337383704 > > sdg 4.77 0.57 1459.29 75330 194553064 > > sdh 4.38 0.37 1287.42 48954 171638304 > > sdi 85.80 132.13 8157.53 17616004 1087562272 > > sdj 8.77 10.99 1701.90 1465844 226897024 > > sda 4.55 0.60 1331.50 80010 177516216 > > > > > > osd log attached. > > > > Wei Cao (Buddy) > > > > > _______________________________________________ > ceph-users mailing list > ceph-users at lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > -- Best Regards, Wheat
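
To see how close a process is to its fd limit before raising it, a small check like the following may help. This is only a sketch, not something from the thread: the helper names (fd_limits, open_fd_count, raise_soft_limit) are made up for illustration, it assumes Linux with /proc mounted, and it uses Python's standard resource module. Note that raise_soft_limit only affects the process that calls it; for a running OSD the limit has to be raised in whatever starts the daemon (for example via ulimit -n or, assuming your Ceph version supports it, the "max open files" option in ceph.conf).

```python
import os
import resource


def fd_limits():
    """Return the (soft, hard) RLIMIT_NOFILE limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def open_fd_count(pid="self"):
    """Count open descriptors by listing /proc/<pid>/fd (Linux only).

    Reading another user's process usually requires root.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


def raise_soft_limit(target):
    """Raise this process's soft fd limit toward `target`.

    The soft limit is never lowered and never set above the hard limit.
    Returns the soft limit that is in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited, nothing to do
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if target <= soft:
        return soft  # requested value would not raise the limit
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target


if __name__ == "__main__":
    soft, hard = fd_limits()
    print("soft limit:", soft, "hard limit:", hard)
    print("open fds in this process:", open_fd_count())
```

To inspect an OSD instead of the current process, pass its pid, e.g. open_fd_count(1234), where 1234 is a placeholder for the real ceph-osd pid. If the count sits near the "Max open files" value shown in /proc/<pid>/limits, the limit is the likely culprit.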