Thank you for your reply.

Our cluster has been running in production for two years without problems, so we have not upgraded.

I checked memory on the host, and very little free memory is left. Could the thread creation failure be related to this? In addition to the KVM virtual machines, there are 22 OSDs on the host.

free -m
              total        used        free      shared  buff/cache   available
Mem:         515420      178212        4323         729      332884      335360
Swap:          8191        8145          46

> sysctl:
> kernel.pid_max=4194303
> kernel.threads-max=2097152
> vm.max_map_count=524288
>
> But really, why are you still running Hammer? Later releases handle a large number of OSDs *much* better.
>
> > On Jun 1, 2020, at 7:08 PM, 展荣臻(信泰) <zhanrzh_xt@xxxxxxxxxxxxxx> wrote:
> >
> > Hi all,
> > We have a Hammer Ceph cluster with 3 monitors and 324 OSDs. The OSD daemons and KVM are colocated on the same nodes.
> > The cluster has been running for 2 years. Recently we added ~700 OSDs to the cluster, using this process:
> > 1. ceph osd create
> > 2. mkdir -p /var/lib/ceph/osd/ceph-$osd
> > 3. mkfs.xfs -f /dev/$disk
> > 4. mount -o inode64,noatime /dev/$disk /var/lib/ceph/osd/ceph-$osd
> > 5. ceph-osd -i 0 --mkfs --mkkey
> > 6. ceph auth add osd.$osd osd 'allow *' mon 'allow profile osd' -i /var/lib/ceph/osd/ceph-$osd/keyring
> > 7. ceph osd crush create-or-move $osd host=kvm101 root=default
> > Maybe we did that too frequently. After adding 122 OSDs, osd.1 through osd.8 failed:
> >
> > 2020-05-14 16:48:29.881021 7f6727fb9700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f6727fb9700 time 2020-05-14 16:48:29.870051
> > common/Thread.cc: 129: FAILED assert(ret == 0)
> >
> > ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
> > 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0xbc8b55]
> > 2: (Thread::create(unsigned long)+0x8a) [0xbac50a]
> > 3: (Pipe::accept()+0x37fb) [0xca6c3b]
> > 4: (Pipe::reader()+0x1a0f) [0xcaa75f]
> > 5: (Pipe::Reader::entry()+0xd) [0xcb351d]
> > 6: (()+0x7dc5) [0x7f67a45ebdc5]
> > 7: (clone()+0x6d) [0x7f67a30cc1cd]
> > NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
> >
> > ulimit -u
> > 2061600
> > open files 32768
> >
> > Does anyone know what's going on? Why did thread creation fail?
> >
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
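
One way to narrow down the question above is to compare the host's current thread count with the kernel limits quoted in this thread, and to put the swap figures from the `free -m` output into a percentage. The following is only a minimal sketch, assuming a Linux host with `/proc` mounted; the helper names (`read_int`, `count_threads`, `usage_ratio`, `swap_used_percent`) are hypothetical and not part of Ceph. It does not diagnose the assert by itself; it only shows how close the host is to `kernel.pid_max` and `kernel.threads-max`.

```python
import os

# Kernel limits mentioned in the thread, as exposed under /proc/sys.
LIMITS = {
    "kernel.pid_max": "/proc/sys/kernel/pid_max",
    "kernel.threads-max": "/proc/sys/kernel/threads-max",
}


def read_int(path):
    """Return the first integer stored in a file, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def count_threads(proc_root="/proc"):
    """Count threads on the host by summing the entries of /proc/<pid>/task."""
    try:
        names = os.listdir(proc_root)
    except OSError:
        return 0  # no /proc available (for example, not a Linux host)
    total = 0
    for name in names:
        if not name.isdigit():
            continue  # skip non-process entries such as /proc/sys
        try:
            total += len(os.listdir(os.path.join(proc_root, name, "task")))
        except OSError:
            continue  # the process exited while we were scanning
    return total


def usage_ratio(used, limit):
    """Return used/limit as a fraction, or None when the limit is unknown or zero."""
    if not limit:
        return None
    return used / limit


def swap_used_percent(total_mb, used_mb):
    """Return the percentage of swap in use, given totals in MB."""
    return 100.0 * used_mb / total_mb


if __name__ == "__main__":
    threads = count_threads()
    print(f"threads on host: {threads}")
    for label, path in LIMITS.items():
        limit = read_int(path)
        ratio = usage_ratio(threads, limit)
        if ratio is None:
            print(f"{label}: unavailable")
        else:
            print(f"{label}: {limit} ({ratio:.2%} used)")
    # Swap figures copied from the free -m output in this message.
    print(f"swap used: {swap_used_percent(8191, 8145):.1f}%")
```

Running the script on the affected host while OSDs are being added would show whether the thread count approaches either limit. If it stays far below both, the limits quoted above are less likely to be the cause, and memory pressure or a per-process limit would be the next thing to check.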