osd down/autoout problem




Sage, does Firefly require setting "ulimit -n" manually when adding a new storage node with 16 OSDs (500G disks)?


Wei Cao (Buddy)

-----Original Message-----
From: Sage Weil [mailto:sage@xxxxxxxxxxx] 
Sent: Thursday, May 15, 2014 10:49 PM
To: Cao, Buddy
Cc: ceph-users at ceph.com
Subject: Re: osd down/autoout problem

On Thu, 15 May 2014, Cao, Buddy wrote:
> Too many open files not handled on operation 24 (541468.0.1, or op 1,
> counting from 0)

You need to increase the 'ulimit -n' max open files limit.  You can do this in ceph.conf with 'max open files' if it's sysvinit, or manually in /etc/init/ceph-osd.conf if it's upstart.

sage
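
To see how close a running OSD is to its limit, something like the following could be used. This is only a sketch, assuming Linux with /proc mounted and pgrep installed; the helper names nofile_limit and open_fds are made up for illustration and are not part of Ceph. For the two options mentioned above, the ceph.conf setting takes the form 'max open files = <N>' (for example under [global]), and upstart job files accept a 'limit nofile <soft> <hard>' stanza; a suitable value is not given in this thread and depends on the cluster.

```shell
#!/bin/sh
# Print the soft "Max open files" limit of a process, read from /proc/<pid>/limits.
nofile_limit() {
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}

# Count the file descriptors the process currently holds open.
open_fds() {
    ls "/proc/$1/fd" | wc -l
}

# Report usage against the limit for each running ceph-osd daemon.
# Prints nothing if no ceph-osd process is found.
for pid in $(pgrep -x ceph-osd); do
    echo "ceph-osd pid $pid: $(open_fds "$pid") open of $(nofile_limit "$pid") allowed"
done
```

Run it as root, since /proc/<pid>/fd of another user's process is not readable otherwise. An OSD whose open count sits near the allowed value is a likely candidate for the "Too many open files" failure quoted below.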


> 
>    -96> 2014-05-14 22:12:24.281185 7f617b33e700  5 -- op tracker -- , seq:
> 788808, time: 2014-05-14 22:12:24.281164, event: reached_pg, request:
> osd_op(client.21276.0:3884815 rb.0.31c7.238e1f29.000000003c15 [write
> 2273280~65536] 4.110fcf4 e12271) v4
>
>    -95> 2014-05-14 22:12:24.281192 7f618556d700  0
> filestore(/var/lib/ceph/osd/ceph-3) unexpected error code
>
>    -94> 2014-05-14 22:12:24.281197 7f6181b4b700  5 -- op tracker -- , seq:
> 788843, time: 2014-05-14 22:12:24.281011, event: header_read, request:
> osd_op(client.21276.0:3884929 rb.0.31c7.238e1f29.000000005614 [write
> 3137536~65536] 4.63e147e e12271) v4
> 
> 2014-05-14 22:12:24.289987 7f6185d6e700 -1 os/FileStore.cc: In function
> 'unsigned int FileStore::_do_transaction(ObjectStore::Transaction&,
> uint64_t, int, ThreadPool::TPHandle*)' thread 7f6185d6e700 time
> 2014-05-14 22:12:24.282488
> 
> os/FileStore.cc: 2448: FAILED assert(0 == "unexpected error")
> 
> ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60)
> 
> 1: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned 
> long, int, ThreadPool::TPHandle*)+0x11c3) [0x723a43]
> 
> 2: (FileStore::_do_transactions(std::list<ObjectStore::Transaction*,
> std::allocator<ObjectStore::Transaction*> >&, unsigned long,
> ThreadPool::TPHandle*)+0x74) [0x72a4d4]
> 
> 3: (FileStore::_do_op(FileStore::OpSequencer*, 
> ThreadPool::TPHandle&)+0x29a) [0x72a78a]
> 
> 4: (ThreadPool::worker(ThreadPool::WorkThread*)+0x551) [0x988f21]
> 
> 5: (ThreadPool::WorkThread::entry()+0x10) [0x98bf50]
> 
> 6: /lib64/libpthread.so.0() [0x3a7ce079d1]
> 
> 7: (clone()+0x6d) [0x3a7cae8b6d]
> 
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> needed to interpret this. ...
>
> 
> # iostat
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0.44    0.00    0.14    0.41    0.00   99.01
>
> Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
> sdb               1.23         0.10        35.72      12738    4762008
> sdc               5.25       214.25      1288.81   28564314  171824232
> sdd               4.16       139.98      1021.69   18662490  136211888
> sde               4.61       207.50      1039.20   27663258  138545960
> sdf               7.94       203.24      2530.63   27095930  337383704
> sdg               4.77         0.57      1459.29      75330  194553064
> sdh               4.38         0.37      1287.42      48954  171638304
> sdi              85.80       132.13      8157.53   17616004 1087562272
> sdj               8.77        10.99      1701.90    1465844  226897024
> sda               4.55         0.60      1331.50      80010  177516216
> 
> Wei Cao (Buddy)
> 

