"Too long" is 120 seconds.

The DB is on SSD devices, and the devices are fast. The ceph-osd process reads about 800 MB, but I cannot be sure from where.
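One way to narrow that down would be to watch per-device read throughput while the OSD starts; a sketch using iostat from the sysstat package (device names as in the lsblk output quoted below):

# extended per-device stats for all four disks, refreshed every second
iostat -x sda sdb sdc sdd 1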
On 13/06/18 11:36, Gregory Farnum wrote:
How long is "too long"? 800 MB on an SSD should only take a second or three.

I'm not sure if that's a reasonable amount of data; you could try compacting the rocksdb instance, etc. But if reading 800 MB is noticeable, I would start wondering about the quality of your disks as a journal or rocksdb device.
-Greg
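For reference, a sketch of how that compaction can be triggered (assuming the compact admin command is available on your release; the offline variant requires the OSD to be stopped, and the data directory path below is the usual default, which may differ on your setup):

# online, against a running OSD (here osd.0)
ceph tell osd.0 compact

# offline, with the OSD stopped
ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-0 compact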
I migrated my OSDs from filestore to bluestore. Each node now has one SSD with the OS and the block.db partitions, and three HDDs with the bluestore data.
# lsblk
NAME     MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sdd        8:48   0   2.7T  0 disk
|-sdd2     8:50   0   2.7T  0 part
`-sdd1     8:49   0   100M  0 part /var/lib/ceph/osd/ceph-2
sdb        8:16   0   3.7T  0 disk
|-sdb2     8:18   0   3.7T  0 part
`-sdb1     8:17   0   100M  0 part /var/lib/ceph/osd/ceph-0
sdc        8:32   0   3.7T  0 disk
|-sdc2     8:34   0   3.7T  0 part
`-sdc1     8:33   0   100M  0 part /var/lib/ceph/osd/ceph-1
sda        8:0    0 223.6G  0 disk
|-sda4     8:4    0     1G  0 part
|-sda2     8:2    0  37.3G  0 part /
|-sda5     8:5    0     1G  0 part
|-sda3     8:3    0     1G  0 part
`-sda1     8:1    0   953M  0 part /boot/efi
Now the I/O works better, and I have never again seen a slow response warning (from the OSDs, not the MDS).
But when I reboot a Ceph node, the OSDs take too long to come up. With filestore it was almost immediate.
Monitoring /proc/$(pidof ceph-osd)/io I could see that each OSD reads about 800 MB before coming up (my block.db partitions are 1 GB).
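A loop along these lines shows those counters for every ceph-osd on the node (a sketch; /proc/<pid>/io exposes cumulative read_bytes/write_bytes):

# dump the cumulative byte counters for each running ceph-osd
for pid in $(pidof ceph-osd); do
    echo "=== ceph-osd pid $pid ==="
    grep -E 'read_bytes|write_bytes' /proc/$pid/io
done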
Do the OSDs re-process the whole block.db when booting up?
Is there any way to speed up OSD availability after a reboot?
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com