-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 18.09.15 14:47, Shinobu Kinjo wrote:
> I do not think that it's best practice to increase that number
> at the moment. It's kind of lack of consideration.
>
> We might need to do that as a result.
>
> But what we should do, first, is to check current actual number
> of aio using:
>
> watch -dc cat /proc/sys/fs/aio-nr

I did, it got up to about 138240

> then increase, if it's necessary.
>
> Anyway you have to be more careful otherwise there might be
> back-and-force meaningless configuration change -;

I'm sorry, I don't quite understand what you mean. Could you
elaborate? Are there specific risks associated with a high setting
of fs.aio-max-nr?

FWIW, I've done some load testing (using rados bench and rados
load-gen) -- anything I should watch out for in your opinion?

Thanks,
peter.

> Shinobu
>
> ----- Original Message -----
> From: "Peter Sabaini" <peter@xxxxxxxxxx>
> To: ceph-users@xxxxxxxxxxxxxx
> Sent: Thursday, September 17, 2015 11:51:11 PM
> Subject: Re: ceph osd won't boot, resource shortage?
>
> On 16.09.15 16:41, Peter Sabaini wrote:
>> Hi all,
>
>> I'm having trouble adding OSDs to a storage node; I've got
>> about 28 OSDs running, but adding more fails.
>
> So, it seems the requisite knob was sysctl fs.aio-max-nr
> By default, this was set to 64K here. I set it:
>
> # echo 2097152 > /proc/sys/fs/aio-max-nr
>
> This let me add my remaining OSDs.
>
>> Typical log excerpt:
>
>> 2015-09-16 13:55:58.083797 7f3e7b821800 1 journal _open
>> /var/lib/ceph/osd/ceph-28/journal fd 20: 21474836480 bytes,
>> block size 4096 bytes, directio = 1, aio = 1
>> 2015-09-16 13:55:58.090709 7f3e7b821800 -1 journal FileJournal::_open:
>> unable to setup io_context (61) No data available
>> 2015-09-16 13:55:58.090825 7f3e74a96700 -1 journal io_submit to 0~4096
>> got (22) Invalid argument
>> 2015-09-16 13:55:58.091061 7f3e7b821800 1 journal close
>> /var/lib/ceph/osd/ceph-28/journal
>> 2015-09-16 13:55:58.091993 7f3e74a96700 -1 os/FileJournal.cc: In function
>> 'int FileJournal::write_aio_bl(off64_t&, ceph::bufferlist&,
>> uint64_t)' thread 7f3e74a96700 time 2015-09-16 13:55:58.090842
>> os/FileJournal.cc: 1337: FAILED assert(0 == "io_submit got unexpected error")
>
>> More complete: http://pastebin.ubuntu.com/12427041/
>
>> If, however, I stop one of the running OSDs, starting the
>> original OSD works fine. I'm guessing I'm running out of
>> resources somewhere, but where?
>
>> Some poss. relevant sysctl values:
>
>> vm.max_map_count=524288
>> kernel.pid_max=2097152
>> kernel.threads-max=2097152
>> fs.aio-max-nr = 65536
>> fs.aio-nr = 129024
>> fs.dentry-state = 75710 49996 45 0 0 0
>> fs.file-max = 26244198
>> fs.file-nr = 13504 0 26244198
>> fs.inode-nr = 60706 202
>> fs.nr_open = 1048576
>
>> I've also set max open files = 1048576 in ceph.conf
>
>> The OSDs are setup with dedicated journal disks - 3 OSDs
>> share one journal device.
>
>> Any advice on what I'm missing, or where I should dig
>> deeper?
>
>> Thanks, peter.
>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

-----BEGIN PGP SIGNATURE-----

iQIcBAEBAgAGBQJV/A/ZAAoJEDg5mUAO12PZ3tMP/06JdIoNf3DM00UPMHCZdZUm
Uz5ZhQV7/Cc9ZurLkD1VSC/OAtTfIR99MJeoozczN6KKL6euGafUk1oJRuGlMst/
1LDu28EbWmBn29k4szyLnqZZcj49JZFBDQ3zHEAAvPmmglQOeENooWoMbjjGb/+p
wX6ANBOBkaVYbwmG8pRndab0DYdV/GBsTDDIbHVp4GnOwg/wOQriKIfRhHw1q4l6
KcGeZs84bhzfiqRQHHJXDieHAsUpKKUbLH0ofLxzCYOjrmpUgrHoVPV2YlNV0BYU
WS2dJaOs0EwVK4iTdnb3B8VH11QsdKk0zCpC40+jaxU7Zn7THoMIURmDCIaI8OGB
B1I4/Ima1Z6CMmPqDQIvebtnhdizgCpq11z6LRAb50TnNPnMuzIccyl5z013Sk8J
JGG5/0sMDjE+apKx/bZdC+Q0TyJ8I49zcizo5qfHhvAqW51McTXEVspJy9ZlQvwK
2Q9bVZsdHBHbM6B45iILOel/K/ids6PzypzKMrwRDmsLI4NfB/fAvWcaWXW7GeQ0
fVbjEv9m12gWhJugJt5ue5JcRcnP8gdg2oG2kzAggGvqkaYrns2VwUXCux+wzkjw
V418bjOWs78eHofmhhteitIItYDROYj9HSioDoaE15cqjOujn6N46PRRToY2eaGP
s2LCkcql3hrWMBKp2h2D
=GijP
-----END PGP SIGNATURE-----
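
The check discussed in this thread (read fs.aio-nr, compare it with fs.aio-max-nr, and only then decide whether to raise the limit) could be sketched as a small shell function. This is only an illustration under stated assumptions: the name aio_headroom and its two optional path arguments are hypothetical and are not part of Ceph or of any standard tool. The only real interfaces it relies on are the two /proc files already mentioned above.

```shell
# aio_headroom [NR_FILE] [MAX_FILE]
# Hypothetical helper: prints the current AIO usage, the limit, and the
# integer percentage of the limit in use. The file arguments default to
# the /proc entries named in the thread; they are parameters only so the
# function can be tried against sample files.
aio_headroom() {
    nr_file=${1:-/proc/sys/fs/aio-nr}
    max_file=${2:-/proc/sys/fs/aio-max-nr}

    nr=$(cat "$nr_file") || return 1
    max=$(cat "$max_file") || return 1

    # Avoid a division by zero if the limit reads as 0.
    if [ "$max" -le 0 ]; then
        echo "aio-max-nr is $max, cannot compute usage" >&2
        return 1
    fi

    # Shell arithmetic is integer-only, so the percentage is truncated.
    pct=$(( nr * 100 / max ))
    echo "aio-nr=$nr aio-max-nr=$max used=${pct}%"
}
```

Calling aio_headroom with no arguments reads the live values. Note that a value written with echo into /proc/sys, as in the quoted message, lasts only until the next reboot; an fs.aio-max-nr entry in /etc/sysctl.conf is the usual way to make it persistent.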