osd down/autoout problem

Hi,

One of the OSDs in my cluster goes down for no apparent reason. I found the error message below in its log. I restarted the OSD, but after several hours the problem came back. Could you help?

"Too many open files not handled on operation 24 (541468.0.1, or op 1, counting from 0)
   -96> 2014-05-14 22:12:24.281185 7f617b33e700  5 -- op tracker -- , seq: 788808, time: 2014-05-14 22:12:24.281164, event: reached_pg, request:  osd_op(client.21276.0:3884815 rb.0.31c7.238e1f29.000000003c15 [write 2273280~65536] 4.110fcf4 e12271) v4
   -95> 2014-05-14 22:12:24.281192 7f618556d700  0 filestore(/var/lib/ceph/osd/ceph-3) unexpected error code
   -94> 2014-05-14 22:12:24.281197 7f6181b4b700  5 -- op tracker -- , seq: 788843, time: 2014-05-14 22:12:24.281011, event: header_read, request:  osd_op(client.21276.0:3884929 rb.0.31c7.238e1f29.000000005614 [write 3137536~65536] 4.63e147e e12271) v4
> 2014-05-14 22:12:24.289987 7f6185d6e700 -1 os/FileStore.cc: In function 'unsigned int FileStore::_do_transaction(ObjectStore::Transaction&, uint64_t, int, ThreadPool::TPHandle*)' thread 7f6185d6e700 time 2014-05-14 22:12:24.282488
os/FileStore.cc: 2448: FAILED assert(0 == "unexpected error")
 ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60)
1: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0x11c3) [0x723a43]
2: (FileStore::_do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long, ThreadPool::TPHandle*)+0x74) [0x72a4d4]
3: (FileStore::_do_op(FileStore::OpSequencer*, ThreadPool::TPHandle&)+0x29a) [0x72a78a]
4: (ThreadPool::worker(ThreadPool::WorkThread*)+0x551) [0x988f21]
5: (ThreadPool::WorkThread::entry()+0x10) [0x98bf50]
6: /lib64/libpthread.so.0() [0x3a7ce079d1]
7: (clone()+0x6d) [0x3a7cae8b6d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this."........
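
Since the log reports "Too many open files" (the EMFILE condition on Linux), one thing that could be checked is how many file descriptors the OSD process holds compared with its per-process limit. Below is a minimal sketch of such a check, assuming a Linux host with /proc mounted and that the PID of the ceph-osd process is already known. The function names are illustrative only and are not part of Ceph.

```python
import os


def parse_soft_nofile(limits_text):
    """Extract the soft 'Max open files' limit from /proc/<pid>/limits text.

    Returns an int, or None if the limit is 'unlimited' or the row is absent.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Fields: 'Max', 'open', 'files', <soft>, <hard>, <units>
            soft = line.split()[3]
            return None if soft == "unlimited" else int(soft)
    return None


def fd_usage(pid):
    """Return (open_fd_count, soft_limit) for the given PID (Linux only)."""
    # Each entry in /proc/<pid>/fd is one open file descriptor.
    open_fds = len(os.listdir("/proc/%d/fd" % pid))
    with open("/proc/%d/limits" % pid) as f:
        soft_limit = parse_soft_nofile(f.read())
    return open_fds, soft_limit


if __name__ == "__main__":
    # Demonstration on the current process; substitute the OSD's PID.
    used, limit = fd_usage(os.getpid())
    print("open fds: %d, soft limit: %s" % (used, limit))
```

Reading another user's process this way typically needs root. If the count sits close to the soft limit shortly before the OSD asserts, that would be consistent with the error in the log, though it would not by itself explain why so many files are open.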


#iostat
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.44    0.00    0.14    0.41    0.00   99.01
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sdb               1.23         0.10        35.72      12738    4762008
sdc               5.25       214.25      1288.81   28564314  171824232
sdd               4.16       139.98      1021.69   18662490  136211888
sde               4.61       207.50      1039.20   27663258  138545960
sdf               7.94       203.24      2530.63   27095930  337383704
sdg               4.77         0.57      1459.29      75330  194553064
sdh               4.38         0.37      1287.42      48954  171638304
sdi              85.80       132.13      8157.53   17616004 1087562272
sdj               8.77        10.99      1701.90    1465844  226897024
sda               4.55         0.60      1331.50      80010  177516216


osd log attached.

Wei Cao (Buddy)

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ceph-osd.21_short.log
Type: application/octet-stream
Size: 3928021 bytes
Desc: ceph-osd.21_short.log
URL: <http://lists.ceph.com/pipermail/ceph-users-ceph.com/attachments/20140515/6e417eb6/attachment-0001.obj>

