Re: ceph osd won't boot, resource shortage?

Sorry about that. That's my fault.

 Disclaimer:
  This is what I usually do for deeper investigation.
  It is not a common, general-purpose solution, and other
  experts may take a different approach.

Here is what you can do to see exactly what is going on at the I/O layer:

   1.Install fio.
   2.Change the following parameter:

     /proc/sys/fs/aio-max-nr

     For this example, I used a very low number, 5:

     sudo sysctl -w fs.aio-max-nr=5

   3.Monitor:

     /proc/sys/fs/aio-nr

      while :
      do
          date
          cat /proc/sys/fs/aio-nr
          sleep 1
      done

   4.Run fio with:

     ioengine=libaio
     direct=1
     # man fio(1):
     # Number of clones (processes/threads
     # performing the same workload) of this
     # job.
     numjobs=100
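The parameters above can be combined into a fio job file. A minimal sketch; the job name, I/O pattern, block size, file size, iodepth, and target directory below are my own illustrative choices, not from the original post:

```ini
; aio-test.fio -- illustrative job file
[global]
ioengine=libaio   ; Linux native AIO: each job calls io_setup()
direct=1          ; O_DIRECT, needed for kernel AIO to be truly async
rw=randwrite
bs=4k
size=64m
directory=/tmp

[aio-test]
numjobs=100       ; 100 clones, each reserving its own AIO context
iodepth=16
```

Run it with `fio aio-test.fio`. Note that each libaio job reserves roughly iodepth event slots via io_setup(), so 100 jobs blow through an aio-max-nr of 5 immediately.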

After the fio job finishes under that condition, you should see the
following:

    // Monitor
    Sat Sep 19 08:14:10 JST 2015
    5
    Sat Sep 19 08:14:10 JST 2015
    5

    Meaning that your job reached the maximum number of AIO events
    allowed by the kernel.

    // Fio

    fio: pid=29607, err=11/file:engines/libaio.c:273, func=io_queue_init, \
         error=Resource temporarily unavailable
    fio: check /proc/sys/fs/aio-max-nr

    Meaning that, since aio-max-nr was set to 5, fio's io_setup()
    calls (one per libaio job) could not reserve their AIO event
    slots and failed with EAGAIN (err=11): you ran out of that
    resource.
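To see the accounting directly, you can reserve an AIO context yourself and watch /proc/sys/fs/aio-nr go up. A sketch using raw syscalls via ctypes; the syscall numbers are for x86_64 Linux only, and 128 events is an arbitrary illustrative count:

```python
import ctypes

# Reserve an AIO context directly, to watch fs.aio-nr increase.
# Syscall numbers below are for x86_64 Linux (a sketch, not portable code).
SYS_IO_SETUP = 206
SYS_IO_DESTROY = 207

libc = ctypes.CDLL(None, use_errno=True)

def read_aio_nr():
    with open("/proc/sys/fs/aio-nr") as f:
        return int(f.read())

before = read_aio_nr()
ctx = ctypes.c_ulong(0)  # aio_context_t
ret = libc.syscall(SYS_IO_SETUP, 128, ctypes.byref(ctx))
if ret != 0:
    # EAGAIN (11) here means the same thing as fio's error above:
    # the reservation would push aio-nr past fs.aio-max-nr.
    raise OSError(ctypes.get_errno(), "io_setup failed")
after = read_aio_nr()
print("aio-nr before:", before, "after:", after)
libc.syscall(SYS_IO_DESTROY, ctx)  # release the reservation
```

Each Ceph FileStore journal makes a similar io_setup() call at startup, which is consistent with the "unable to setup io_context" failure in the OSD log below: once the running OSDs had consumed the reservations allowed by aio-max-nr, new journals could not get theirs.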

If you have any questions or concerns, just let us know.

Shinobu

----- Original Message -----
From: "Peter Sabaini" <peter@xxxxxxxxxx>
To: "Shinobu Kinjo" <skinjo@xxxxxxxxxx>
Cc: ceph-users@xxxxxxxxxxxxxx
Sent: Friday, September 18, 2015 10:21:29 PM
Subject: Re:  ceph osd won't boot, resource shortage?


On 18.09.15 14:47, Shinobu Kinjo wrote:
> I do not think it is best practice to increase that number right
> away; doing so without investigation would be careless.
> 
> We might need to do that as a result.
> 
> But what we should do, first, is to check current actual number
> of aio using:
> 
> watch -dc cat /proc/sys/fs/aio-nr

I did, it got up to about 138240

> then increase, if it's necessary.
> 
> Anyway you have to be more careful, otherwise there might be
> back-and-forth, meaningless configuration changes -;

I'm sorry, I don't quite understand what you mean. Could you
elaborate? Are there specific risks associated with a high setting
of fs.aio-max-nr?

FWIW, I've done some load testing (using rados bench and rados
load-gen) -- anything I should watch out for in your opinion?


Thanks,
peter.


> Shinobu
> 
> ----- Original Message -----
> From: "Peter Sabaini" <peter@xxxxxxxxxx>
> To: ceph-users@xxxxxxxxxxxxxx
> Sent: Thursday, September 17, 2015 11:51:11 PM
> Subject: Re:  ceph osd won't boot, resource shortage?
> 
> On 16.09.15 16:41, Peter Sabaini wrote:
>> Hi all,
> 
>> I'm having trouble adding OSDs to a storage node; I've got 
>> about 28 OSDs running, but adding more fails.
> 
> So, it seems the requisite knob was sysctl fs.aio-max-nr By
> default, this was set to 64K here. I set it:
> 
> # echo 2097152 > /proc/sys/fs/aio-max-nr
> 
> This let me add my remaining OSDs.
> 
> 
> 
>> Typical log excerpt:
> 
>> 2015-09-16 13:55:58.083797 7f3e7b821800  1 journal _open /var/lib/ceph/osd/ceph-28/journal fd 20: 21474836480 bytes, block size 4096 bytes, directio = 1, aio = 1
>> 2015-09-16 13:55:58.090709 7f3e7b821800 -1 journal FileJournal::_open: unable to setup io_context (61) No data available
>> 2015-09-16 13:55:58.090825 7f3e74a96700 -1 journal io_submit to 0~4096 got (22) Invalid argument
>> 2015-09-16 13:55:58.091061 7f3e7b821800  1 journal close /var/lib/ceph/osd/ceph-28/journal
>> 2015-09-16 13:55:58.091993 7f3e74a96700 -1 os/FileJournal.cc: In function 'int FileJournal::write_aio_bl(off64_t&, ceph::bufferlist&, uint64_t)' thread 7f3e74a96700 time 2015-09-16 13:55:58.090842
>> os/FileJournal.cc: 1337: FAILED assert(0 == "io_submit got unexpected error")
> 
>> More complete: http://pastebin.ubuntu.com/12427041/
> 
>> If, however, I stop one of the running OSDs, starting the 
>> original OSD works fine. I'm guessing I'm running out of 
>> resources somewhere, but where?
> 
>> Some poss. relevant sysctl values:
> 
>> vm.max_map_count = 524288
>> kernel.pid_max = 2097152
>> kernel.threads-max = 2097152
>> fs.aio-max-nr = 65536
>> fs.aio-nr = 129024
>> fs.dentry-state = 75710	49996	45	0	0	0
>> fs.file-max = 26244198
>> fs.file-nr = 13504	0	26244198
>> fs.inode-nr = 60706	202
>> fs.nr_open = 1048576
> 
>> I've also set max open files = 1048576 in ceph.conf
> 
>> The OSDs are setup with dedicated journal disks - 3 OSDs
>> share one journal device.
> 
>> Any advice on what I'm missing, or where I should dig
>> deeper?
> 
>> Thanks, peter.
> 
> 
> 
> 
> 
> 
>> _______________________________________________ ceph-users 
>> mailing list ceph-users@xxxxxxxxxxxxxx 
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> _______________________________________________ ceph-users
> mailing list ceph-users@xxxxxxxxxxxxxx 
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


