Re: ceph-volume: failed to activate some bluestore osds

On Thu, Jun 7, 2018 at 10:54 AM, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> On Thu, Jun 7, 2018 at 4:41 PM Sage Weil <sweil@xxxxxxxxxx> wrote:
>>
>> On Thu, 7 Jun 2018, Dan van der Ster wrote:
>> > On Thu, Jun 7, 2018 at 4:33 PM Sage Weil <sweil@xxxxxxxxxx> wrote:
>> > >
>> > > On Thu, 7 Jun 2018, Dan van der Ster wrote:
>> > > > Hi all,
>> > > >
>> > > > We have an intermittent issue where bluestore osds sometimes fail to
>> > > > start after a reboot.
>> > > > The osds all fail the same way [see 2], failing to open the superblock.
>> > > > On one particular host, there are 24 osds and 4 SSDs partitioned for
>> > > > the block.db's. The affected non-starting OSDs all have block.db on
>> > > > the same ssd (/dev/sdaa).
>> > > >
>> > > > The osds are all running 12.2.5 on latest centos 7.5 and were created
>> > > > by ceph-volume lvm, e.g. see [1].
>> > > >
>> > > > This seems like a permissions or similar issue related to the
>> > > > ceph-volume tooling.
>> > > > Any clues how to debug this further?
>> > >
>> > > I take it the OSDs start up if you try again?
>> >
>> > Hey.
>> > No, they don't. For example, we do this `ceph-volume lvm activate 48
>> > 99fd8e36-fc4d-4bbc-83d9-f5e611cde4b5` several times and it's the same
>> > mount failure every time.
>>
>> That sounds like a bluefs bug then, not a ceph-volume issue.  Can you
>> try to start the OSD with logging enabled?  (debug bluefs = 20,
>> debug bluestore = 20)
>>
>
> Here: https://pastebin.com/TJXZhfcY
>
> Is it supposed to print something about the block.db at some point????

This has to be some logging mistake, because that should be block.db, never just 'block':

bdev(0x5653ffdadc00 /var/lib/ceph/osd/ceph-48/block) open path /var/lib/ceph/osd/ceph-48/block

That is what you are referring to here, right?

Now, re-reading the thread, you say that it sometimes does boot
normally? ceph-volume tries (in different ways) to ensure that the
devices used are the correct ones. In the case of /dev/sdaa1 it has
persisted the partuuid (3381a121-1c1b-4e45-a986-c1871c363edc), which is
later queried using blkid to find the right device name (/dev/sdaa1 in
your case).
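
For example, something along these lines (roughly what ceph-volume does
with blkid, though maybe not the exact invocation) will show which
device that partuuid currently resolves to:

# blkid -t PARTUUID="3381a121-1c1b-4e45-a986-c1871c363edc" -o device

On your host that should print /dev/sdaa1; if it ever prints nothing or
a different partition, that would explain block.db ending up pointing
at the wrong device.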

Is it possible that you are seeing a case where ceph-volume is *not*
matching this correctly? If osd.48 comes up online, how does
/var/lib/ceph/osd/ceph-48 look? The same?
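
Something like the following, captured on a boot where osd.48 comes up
and again on one where it doesn't, would make that comparison easy
(same paths as in your listing below):

# ls -l /var/lib/ceph/osd/ceph-48/
# ls -l /dev/sdaa1
# blkid /dev/sdaa1

If the block.db symlink target, the partition ownership, or the
partuuid differ between the two runs, that points back at activation;
if they are identical, the superblock failure is probably not a
ceph-volume problem.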


>
> Here's the osd dir:
>
> # ls -l /var/lib/ceph/osd/ceph-48/
> total 24
> lrwxrwxrwx. 1 ceph ceph 93 Jun  7 16:46 block -> /dev/ceph-34f24306-d90c-49ff-bafb-2657a6a18010/osd-block-99fd8e36-fc4d-4bbc-83d9-f5e611cde4b5
> lrwxrwxrwx. 1 root root 10 Jun  7 16:46 block.db -> /dev/sdaa1
> -rw-------. 1 ceph ceph 37 Jun  7 16:46 ceph_fsid
> -rw-------. 1 ceph ceph 37 Jun  7 16:46 fsid
> -rw-------. 1 ceph ceph 56 Jun  7 16:46 keyring
> -rw-------. 1 ceph ceph  6 Jun  7 16:46 ready
> -rw-------. 1 ceph ceph 10 Jun  7 16:46 type
> -rw-------. 1 ceph ceph  3 Jun  7 16:46 whoami
>
> # ls -l /dev/ceph-34f24306-d90c-49ff-bafb-2657a6a18010/osd-block-99fd8e36-fc4d-4bbc-83d9-f5e611cde4b5
> lrwxrwxrwx. 1 root root 7 Jun  7 16:46 /dev/ceph-34f24306-d90c-49ff-bafb-2657a6a18010/osd-block-99fd8e36-fc4d-4bbc-83d9-f5e611cde4b5 -> ../dm-4
>
> # ls -l /dev/dm-4
> brw-rw----. 1 ceph ceph 253, 4 Jun  7 16:46 /dev/dm-4
>
>
>   --- Logical volume ---
>   LV Path                /dev/ceph-34f24306-d90c-49ff-bafb-2657a6a18010/osd-block-99fd8e36-fc4d-4bbc-83d9-f5e611cde4b5
>   LV Name                osd-block-99fd8e36-fc4d-4bbc-83d9-f5e611cde4b5
>   VG Name                ceph-34f24306-d90c-49ff-bafb-2657a6a18010
>   LV UUID                FQkRxS-No7X-ajkP-5L3N-K22a-IXg6-QLceZC
>   LV Write Access        read/write
>   LV Creation host, time p06253939y61826.cern.ch, 2018-03-15 10:57:37 +0100
>   LV Status              available
>   # open                 0
>   LV Size                <5.46 TiB
>   Current LE             1430791
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:4
>
>   --- Physical volume ---
>   PV Name               /dev/sda
>   VG Name               ceph-34f24306-d90c-49ff-bafb-2657a6a18010
>   PV Size               <5.46 TiB / not usable <2.59 MiB
>   Allocatable           yes (but full)
>   PE Size               4.00 MiB
>   Total PE              1430791
>   Free PE               0
>   Allocated PE          1430791
>   PV UUID               WP0Z7C-ejSh-fpSa-a73N-H2Hz-yC78-qBezcI
>
> (sorry for wall o' lvm)
>
> -- dan
>
>> Thanks!
>> sage
>>
>>
>> > -- dan
>> >
>> >
>> > >
>> > > sage
>> > >
>> > >
>> > > >
>> > > > Thanks!
>> > > >
>> > > > Dan
>> > > >
>> > > > [1]
>> > > >
>> > > > ====== osd.48 ======
>> > > >
>> > > >   [block]    /dev/ceph-34f24306-d90c-49ff-bafb-2657a6a18010/osd-block-99fd8e36-fc4d-4bbc-83d9-f5e611cde4b5
>> > > >
>> > > >       type                      block
>> > > >       osd id                    48
>> > > >       cluster fsid              dd535a7e-4647-4bee-853d-f34112615f81
>> > > >       cluster name              ceph
>> > > >       osd fsid                  99fd8e36-fc4d-4bbc-83d9-f5e611cde4b5
>> > > >       db device                 /dev/sdaa1
>> > > >       encrypted                 0
>> > > >       db uuid                   3381a121-1c1b-4e45-a986-c1871c363edc
>> > > >       cephx lockbox secret
>> > > >       block uuid                FQkRxS-No7X-ajkP-5L3N-K22a-IXg6-QLceZC
>> > > >       block device              /dev/ceph-34f24306-d90c-49ff-bafb-2657a6a18010/osd-block-99fd8e36-fc4d-4bbc-83d9-f5e611cde4b5
>> > > >       crush device class        None
>> > > >
>> > > >   [  db]    /dev/sdaa1
>> > > >
>> > > >       PARTUUID                  3381a121-1c1b-4e45-a986-c1871c363edc
>> > > >
>> > > >
>> > > >
>> > > > [2]
>> > > >    -11> 2018-06-07 16:12:16.138407 7fba30fb4d80  1 -- - start start
>> > > >    -10> 2018-06-07 16:12:16.138516 7fba30fb4d80  1 bluestore(/var/lib/ceph/osd/ceph-48) _mount path /var/lib/ceph/osd/ceph-48
>> > > >     -9> 2018-06-07 16:12:16.138801 7fba30fb4d80  1 bdev create path /var/lib/ceph/osd/ceph-48/block type kernel
>> > > >     -8> 2018-06-07 16:12:16.138808 7fba30fb4d80  1 bdev(0x55eb46433a00 /var/lib/ceph/osd/ceph-48/block) open path /var/lib/ceph/osd/ceph-48/block
>> > > >     -7> 2018-06-07 16:12:16.138999 7fba30fb4d80  1 bdev(0x55eb46433a00 /var/lib/ceph/osd/ceph-48/block) open size 6001172414464 (0x57541c00000, 5589 GB) block_size 4096 (4096 B) rotational
>> > > >     -6> 2018-06-07 16:12:16.139188 7fba30fb4d80  1 bluestore(/var/lib/ceph/osd/ceph-48) _set_cache_sizes cache_size 134217728 meta 0.01 kv 0.99 data 0
>> > > >     -5> 2018-06-07 16:12:16.139275 7fba30fb4d80  1 bdev create path /var/lib/ceph/osd/ceph-48/block type kernel
>> > > >     -4> 2018-06-07 16:12:16.139281 7fba30fb4d80  1 bdev(0x55eb46433c00 /var/lib/ceph/osd/ceph-48/block) open path /var/lib/ceph/osd/ceph-48/block
>> > > >     -3> 2018-06-07 16:12:16.139454 7fba30fb4d80  1 bdev(0x55eb46433c00 /var/lib/ceph/osd/ceph-48/block) open size 6001172414464 (0x57541c00000, 5589 GB) block_size 4096 (4096 B) rotational
>> > > >     -2> 2018-06-07 16:12:16.139464 7fba30fb4d80  1 bluefs add_block_device bdev 1 path /var/lib/ceph/osd/ceph-48/block size 5589 GB
>> > > >     -1> 2018-06-07 16:12:16.139510 7fba30fb4d80  1 bluefs mount
>> > > >      0> 2018-06-07 16:12:16.142930 7fba30fb4d80 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.5/rpm/el7/BUILD/ceph-12.2.5/src/os/bluestore/bluefs_types.h: In function 'static void bluefs_fnode_t::_denc_finish(ceph::buffer::ptr::iterator&, __u8*, __u8*, char**, uint32_t*)' thread 7fba30fb4d80 time 2018-06-07 16:12:16.139666
>> > > > /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.5/rpm/el7/BUILD/ceph-12.2.5/src/os/bluestore/bluefs_types.h: 54: FAILED assert(pos <= end)
>> > > >
>> > > >  ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a)
>> > > > luminous (stable)
>> > > >  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>> > > > const*)+0x110) [0x55eb3b597780]
>> > > >  2: (bluefs_super_t::decode(ceph::buffer::list::iterator&)+0x776)
>> > > > [0x55eb3b52db36]
>> > > >  3: (BlueFS::_open_super()+0xfe) [0x55eb3b50cede]
>> > > >  4: (BlueFS::mount()+0xe3) [0x55eb3b5250c3]
>> > > >  5: (BlueStore::_open_db(bool)+0x173d) [0x55eb3b43ebcd]
>> > > >  6: (BlueStore::_mount(bool)+0x40e) [0x55eb3b47025e]
>> > > >  7: (OSD::init()+0x3bd) [0x55eb3b02a1cd]
>> > > >  8: (main()+0x2d07) [0x55eb3af2f977]
>> > > >  9: (__libc_start_main()+0xf5) [0x7fba2d47b445]
>> > > >  10: (()+0x4b7033) [0x55eb3afce033]
>> > > >  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>> > > > needed to interpret this.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


