Sage, does Firefly require manually setting "ulimit -n" when adding a new storage node with 16 OSDs (500G disks)?

Wei Cao (Buddy)

-----Original Message-----
From: Sage Weil [mailto:sage@xxxxxxxxxxx]
Sent: Thursday, May 15, 2014 10:49 PM
To: Cao, Buddy
Cc: ceph-users at ceph.com
Subject: Re: osd down/autoout problem

On Thu, 15 May 2014, Cao, Buddy wrote:
> Too many open files not handled on operation 24 (541468.0.1, or op 1,
> counting from 0)

You need to increase the 'ulimit -n' max open files limit. You can do this in ceph.conf with 'max open files' if it's sysvinit, or manually in /etc/init/ceph-osd.conf if it's upstart.

sage

>
>    -96> 2014-05-14 22:12:24.281185 7f617b33e700  5 -- op tracker -- , seq: 788808, time: 2014-05-14 22:12:24.281164, event: reached_pg, request: osd_op(client.21276.0:3884815 rb.0.31c7.238e1f29.000000003c15 [write 2273280~65536] 4.110fcf4 e12271) v4
>    -95> 2014-05-14 22:12:24.281192 7f618556d700  0 filestore(/var/lib/ceph/osd/ceph-3) unexpected error code
>    -94> 2014-05-14 22:12:24.281197 7f6181b4b700  5 -- op tracker -- , seq: 788843, time: 2014-05-14 22:12:24.281011, event: header_read, request: osd_op(client.21276.0:3884929 rb.0.31c7.238e1f29.000000005614 [write 3137536~65536] 4.63e147e e12271) v4
>
> 2014-05-14 22:12:24.289987 7f6185d6e700 -1 os/FileStore.cc: In function 'unsigned int FileStore::_do_transaction(ObjectStore::Transaction&, uint64_t, int, ThreadPool::TPHandle*)' thread 7f6185d6e700 time 2014-05-14 22:12:24.282488
> os/FileStore.cc: 2448: FAILED assert(0 == "unexpected error")
>
>  ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60)
>  1: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0x11c3) [0x723a43]
>  2: (FileStore::_do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long, ThreadPool::TPHandle*)+0x74) [0x72a4d4]
>  3: (FileStore::_do_op(FileStore::OpSequencer*, ThreadPool::TPHandle&)+0x29a) [0x72a78a]
>  4: (ThreadPool::worker(ThreadPool::WorkThread*)+0x551) [0x988f21]
>  5: (ThreadPool::WorkThread::entry()+0x10) [0x98bf50]
>  6: /lib64/libpthread.so.0() [0x3a7ce079d1]
>  7: (clone()+0x6d) [0x3a7cae8b6d]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>  ...
>
> #iostat
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0.44    0.00    0.14    0.41    0.00   99.01
>
> Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
> sdb               1.23         0.10        35.72      12738    4762008
> sdc               5.25       214.25      1288.81   28564314  171824232
> sdd               4.16       139.98      1021.69   18662490  136211888
> sde               4.61       207.50      1039.20   27663258  138545960
> sdf               7.94       203.24      2530.63   27095930  337383704
> sdg               4.77         0.57      1459.29      75330  194553064
> sdh               4.38         0.37      1287.42      48954  171638304
> sdi              85.80       132.13      8157.53   17616004 1087562272
> sdj               8.77        10.99      1701.90    1465844  226897024
> sda               4.55         0.60      1331.50      80010  177516216
>
> Wei Cao (Buddy)
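
The advice in this thread is to raise the 'ulimit -n' max open files limit for the OSD daemons. One way to check what limit a running process actually has is to read the "Max open files" row of /proc/<pid>/limits. The following is a minimal Python sketch, assuming a Linux host; the helper names and the sample text are illustrative and not part of Ceph, and finding the PID of a given ceph-osd process is left to the reader.

```python
import resource


def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits.

    A value of None stands for 'unlimited'.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Row layout: Max open files <soft> <hard> files
            fields = line.split()
            soft, hard = fields[3], fields[4]
            return tuple(None if v == "unlimited" else int(v)
                         for v in (soft, hard))
    raise ValueError("no 'Max open files' row found")


def limits_for_pid(pid):
    """Read the open-file limits of a running process (Linux only)."""
    with open("/proc/%d/limits" % pid) as f:
        return parse_max_open_files(f.read())


def limits_for_this_process():
    """Soft and hard RLIMIT_NOFILE of the current Python process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


# Illustrative sample in the format of /proc/<pid>/limits (values are made up).
SAMPLE = (
    "Limit                     Soft Limit           Hard Limit           Units\n"
    "Max open files            1024                 4096                 files\n"
)

if __name__ == "__main__":
    print(parse_max_open_files(SAMPLE))
    print(limits_for_this_process())
```

Calling limits_for_pid() with the PID of an OSD after restarting it shows whether the new limit was picked up by that daemon, rather than only by the shell that changed it.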