Hi all,

I'm having trouble adding OSDs to a storage node; I've got about 28 OSDs running, but adding more fails.

Typical log excerpt:

2015-09-16 13:55:58.083797 7f3e7b821800  1 journal _open /var/lib/ceph/osd/ceph-28/journal fd 20: 21474836480 bytes, block size 4096 bytes, directio = 1, aio = 1
2015-09-16 13:55:58.090709 7f3e7b821800 -1 journal FileJournal::_open: unable to setup io_context (61) No data available
2015-09-16 13:55:58.090825 7f3e74a96700 -1 journal io_submit to 0~4096 got (22) Invalid argument
2015-09-16 13:55:58.091061 7f3e7b821800  1 journal close /var/lib/ceph/osd/ceph-28/journal
2015-09-16 13:55:58.091993 7f3e74a96700 -1 os/FileJournal.cc: In function 'int FileJournal::write_aio_bl(off64_t&, ceph::bufferlist&, uint64_t)' thread 7f3e74a96700 time 2015-09-16 13:55:58.090842
os/FileJournal.cc: 1337: FAILED assert(0 == "io_submit got unexpected error")

More complete: http://pastebin.ubuntu.com/12427041/

If, however, I stop one of the running OSDs, the OSD that failed then starts fine. I'm guessing I'm running out of resources somewhere, but where?

Some possibly relevant sysctl values:

vm.max_map_count = 524288
kernel.pid_max = 2097152
kernel.threads-max = 2097152
fs.aio-max-nr = 65536
fs.aio-nr = 129024
fs.dentry-state = 75710 49996 45 0 0 0
fs.file-max = 26244198
fs.file-nr = 13504 0 26244198
fs.inode-nr = 60706 202
fs.nr_open = 1048576

I've also set max open files = 1048576 in ceph.conf.

The OSDs are set up with dedicated journal disks; 3 OSDs share one journal device.

Any advice on what I'm missing, or where I should dig deeper?

Thanks,
Peter
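
For anyone wanting to repeat the comparison of the AIO counters listed above, here is a minimal sketch. It assumes a Linux host where the values are readable under /proc/sys; the helper names read_sysctl_int and aio_headroom are made up for illustration. The kernel documents that io_setup fails once fs.aio-nr reaches fs.aio-max-nr, but whether that is what is happening on this node is an assumption, not something the log confirms.

```python
from pathlib import Path


def read_sysctl_int(name, root="/proc/sys"):
    """Read the first integer field of a sysctl such as 'fs.aio-nr' from procfs.

    'root' is a parameter only so the function can be pointed at a test directory.
    """
    path = Path(root) / name.replace(".", "/")
    return int(path.read_text().split()[0])


def aio_headroom(aio_nr, aio_max_nr):
    """Return the remaining AIO event slots; zero or negative means the limit is reached."""
    return aio_max_nr - aio_nr


# Example with the values quoted in the post (not read from a live system):
# aio_headroom(129024, 65536) is negative, i.e. fs.aio-nr is already above fs.aio-max-nr.
```

On a live node one would call aio_headroom(read_sysctl_int("fs.aio-nr"), read_sysctl_int("fs.aio-max-nr")) before and after starting an OSD to see how much each one consumes.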