Re: is it possible to remove the db+wal from an external device (nvme)

No,

that's just a backtrace of the crash - I'd like to see the full OSD log from process startup until the crash instead...

On 10/8/2021 4:02 PM, Szabo, Istvan (Agoda) wrote:

Hi Igor,

Here is a bluestore tool fsck output:

https://justpaste.it/7igrb

Is this what you are looking for?

Istvan Szabo
Senior Infrastructure Engineer
---------------------------------------------------
Agoda Services Co., Ltd.
e: istvan.szabo@xxxxxxxxx
---------------------------------------------------

*From:* Igor Fedotov <ifedotov@xxxxxxx>
*Sent:* Tuesday, October 5, 2021 10:02 PM
*To:* Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>; 胡玮文 <huww98@xxxxxxxxxxx>
*Cc:* ceph-users@xxxxxxx; Eugen Block <eblock@xxxxxx>
*Subject:* Re: Re: is it possible to remove the db+wal from an external device (nvme)


Not sure dmcrypt is a culprit here.

Could you please set debug-bluefs to 20 and collect an OSD startup log?
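
For example, a minimal sketch assuming a non-containerized deployment and osd.48 from the crash report (adjust the id and paths to your setup):

ceph config set osd.48 debug_bluefs 20
systemctl start ceph-osd@48        # let it run until it crashes again
# the startup log then typically ends up in /var/log/ceph/ceph-osd.48.log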

On 10/5/2021 4:43 PM, Szabo, Istvan (Agoda) wrote:

    Hmm, tried another one whose disk hasn't spilled over, still core dumped ☹

    Is there anything special we need to do before we migrate the db next to the block? Our OSDs are using dmcrypt, is that an issue?

    {
        "backtrace": [
            "(()+0x12b20) [0x7f310aa49b20]",
            "(gsignal()+0x10f) [0x7f31096aa37f]",
            "(abort()+0x127) [0x7f3109694db5]",
            "(()+0x9009b) [0x7f310a06209b]",
            "(()+0x9653c) [0x7f310a06853c]",
            "(()+0x95559) [0x7f310a067559]",
            "(__gxx_personality_v0()+0x2a8) [0x7f310a067ed8]",
            "(()+0x10b03) [0x7f3109a48b03]",
            "(_Unwind_RaiseException()+0x2b1) [0x7f3109a49071]",
            "(__cxa_throw()+0x3b) [0x7f310a0687eb]",
            "(()+0x19fa4) [0x7f310b7b6fa4]",
            "(tcmalloc::allocate_full_cpp_throw_oom(unsigned long)+0x146) [0x7f310b7d8c96]",
            "(()+0x10d0f8e) [0x55ffa520df8e]",
            "(rocksdb::Version::~Version()+0x104) [0x55ffa521d174]",
            "(rocksdb::Version::Unref()+0x21) [0x55ffa521d221]",
            "(rocksdb::ColumnFamilyData::~ColumnFamilyData()+0x5a) [0x55ffa52efcca]",
            "(rocksdb::ColumnFamilySet::~ColumnFamilySet()+0x88) [0x55ffa52f0568]",
            "(rocksdb::VersionSet::~VersionSet()+0x5e) [0x55ffa520e01e]",
            "(rocksdb::VersionSet::~VersionSet()+0x11) [0x55ffa520e261]",
            "(rocksdb::DBImpl::CloseHelper()+0x616) [0x55ffa5155ed6]",
            "(rocksdb::DBImpl::~DBImpl()+0x83b) [0x55ffa515c35b]",
            "(rocksdb::DBImplReadOnly::~DBImplReadOnly()+0x11) [0x55ffa51a3bc1]",
            "(rocksdb::DB::OpenForReadOnly(rocksdb::DBOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**, bool)+0x1089) [0x55ffa51a57e9]",
            "(RocksDBStore::do_open(std::ostream&, bool, bool, std::vector<KeyValueDB::ColumnFamily, std::allocator<KeyValueDB::ColumnFamily> > const*)+0x14ca) [0x55ffa51285ca]",
            "(BlueStore::_open_db(bool, bool, bool)+0x1314) [0x55ffa4bc27e4]",
            "(BlueStore::_open_db_and_around(bool)+0x4c) [0x55ffa4bd4c5c]",
            "(BlueStore::_mount(bool, bool)+0x847) [0x55ffa4c2e047]",
            "(OSD::init()+0x380) [0x55ffa4753a70]",
            "(main()+0x47f1) [0x55ffa46a6901]",
            "(__libc_start_main()+0xf3) [0x7f3109696493]",
            "(_start()+0x2e) [0x55ffa46d4e3e]"
        ],
        "ceph_version": "15.2.14",
        "crash_id": "2021-10-05T13:31:28.513463Z_b6818598-4960-4ed6-942a-d4a7ff37a758",
        "entity_name": "osd.48",
        "os_id": "centos",
        "os_name": "CentOS Linux",
        "os_version": "8",
        "os_version_id": "8",
        "process_name": "ceph-osd",
        "stack_sig": "6a43b6c219adac393b239fbea4a53ff87c4185bcd213724f0d721b452b81ddbf",
        "timestamp": "2021-10-05T13:31:28.513463Z",
        "utsname_hostname": "server-2s07",
        "utsname_machine": "x86_64",
        "utsname_release": "4.18.0-305.19.1.el8_4.x86_64",
        "utsname_sysname": "Linux",
        "utsname_version": "#1 SMP Wed Sep 15 15:39:39 UTC 2021"
    }

    Istvan Szabo
    Senior Infrastructure Engineer
    ---------------------------------------------------
    Agoda Services Co., Ltd.
    e: istvan.szabo@xxxxxxxxx
    ---------------------------------------------------

    *From:* 胡玮文 <huww98@xxxxxxxxxxx>
    *Sent:* Monday, October 4, 2021 12:13 AM
    *To:* Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>; Igor Fedotov <ifedotov@xxxxxxx>
    *Cc:* ceph-users@xxxxxxx
    *Subject:* Re: Re: is it possible to remove the db+wal from an external device (nvme)


    The stack trace (tcmalloc::allocate_full_cpp_throw_oom) seems to indicate that you don't have enough memory.
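
    A quick, generic way to check that on the OSD host (a sketch; the last line only helps if the kernel OOM killer was involved):

    free -h                                  # available memory right now
    ceph config get osd osd_memory_target    # per-OSD memory target (defaults to ~4 GiB)
    dmesg -T | grep -i 'out of memory'       # any OOM-killer activity?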

    *From:* Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>
    *Sent:* October 4, 2021 0:46
    *To:* Igor Fedotov <ifedotov@xxxxxxx>
    *Cc:* ceph-users@xxxxxxx
    *Subject:* Re: is it possible to remove the db+wal from an external device (nvme)

    Seems like it cannot start anymore once migrated ☹

    https://justpaste.it/5hkot

    Istvan Szabo
    Senior Infrastructure Engineer
    ---------------------------------------------------
    Agoda Services Co., Ltd.
    e: istvan.szabo@xxxxxxxxx
    ---------------------------------------------------

    From: Igor Fedotov <ifedotov@xxxxxxx>
    Sent: Saturday, October 2, 2021 5:22 AM
    To: Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>
    Cc: ceph-users@xxxxxxx; Eugen Block <eblock@xxxxxx>; Christian Wuerdig <christian.wuerdig@xxxxxxxxx>
    Subject: Re: Re: is it possible to remove the db+wal from an external device (nvme)


    Hi Istvan,

    yeah, migrating both db and wal to the slow device is supported. And the spillover state isn't a showstopper for that.
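
    For reference, a minimal sketch of that migration on one OSD (the id, fsid and block LV below are placeholders - e.g. take them from `ceph-volume lvm list`):

    systemctl stop ceph-osd@<id>
    ceph-volume lvm migrate --osd-id <id> --osd-fsid <osd-fsid> --from db wal --target <vg>/<osd-block-lv>
    systemctl start ceph-osd@<id>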


    On 10/2/2021 1:16 AM, Szabo, Istvan (Agoda) wrote:
    Dear Igor,

    Is the ceph-volume lvm migrate command in Octopus 15.2.14 smart enough to remove the db (wal included) from the nvme even if it has spilled over? I can't compact many disks back to normal to stop the spillover warning.

    I think Christian has the truth of the issue: my nvme with 30k random write IOPS is backing 3x SSDs with 67k random write IOPS each …

    Istvan Szabo
    Senior Infrastructure Engineer
    ---------------------------------------------------
    Agoda Services Co., Ltd.
    e: istvan.szabo@xxxxxxxxx
    ---------------------------------------------------


    On 2021. Oct 1., at 11:47, Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx> wrote:
    3x SSD OSDs per nvme

    Istvan Szabo
    Senior Infrastructure Engineer
    ---------------------------------------------------
    Agoda Services Co., Ltd.
    e: istvan.szabo@xxxxxxxxx
    ---------------------------------------------------

    -----Original Message-----
    From: Igor Fedotov <ifedotov@xxxxxxx>
    Sent: Friday, October 1, 2021 4:35 PM
    To: ceph-users@xxxxxxx
    Subject: Re: is it possible to remove the db+wal from an external device (nvme)


    And how many OSDs per single NVMe do you have?

    On 10/1/2021 9:55 AM, Szabo, Istvan (Agoda) wrote:

    I have my dashboards and I can see that the db nvmes are always running at 100% utilization (you can monitor with iostat -x 1), and it generates iowait all the time, between 1-3.

    I’m using nvme in front of the ssds.

    Istvan Szabo
    Senior Infrastructure Engineer
    ---------------------------------------------------
    Agoda Services Co., Ltd.
    e: istvan.szabo@xxxxxxxxx
    ---------------------------------------------------

    From: Victor Hooi <victorhooi@xxxxxxxxx>
    Sent: Friday, October 1, 2021 5:30 AM
    To: Eugen Block <eblock@xxxxxx>
    Cc: Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>; 胡 玮文 <huww98@xxxxxxxxxxx>; ceph-users <ceph-users@xxxxxxx>
    Subject: Re: Re: is it possible to remove the db+wal from an external device (nvme)

    Hi,

    I'm curious - how did you tell that the separate WAL+DB volume was
    slowing things down? I assume you did some benchmarking - is there
    any chance you'd be willing to share results? (Or anybody else
    that's been in a similar situation).

    What sorts of devices are you using for the WAL+DB, versus the
    data disks?

    We're using NAND SSDs, with Optanes for the WAL+DB, and on some systems I am seeing slower than expected behaviour - need to dive deeper into it.

    In my case, I was running with 4 or 2 OSDs per Optane volume:

    https://www.reddit.com/r/ceph/comments/k2lef1/how_many_waldb_partitions_can_you_run_per_optane/

    but I couldn't seem to get the results I'd expected - so curious
    what people are seeing in the real world - and of course, we might
    need to follow the steps here to remove them as well.

    Thanks,
    Victor

    On Thu, 30 Sept 2021 at 16:10, Eugen Block <eblock@xxxxxx> wrote:
    Yes, I believe for you it should work without containers although I
    haven't tried the migrate command in a non-containerized cluster yet.
    But I believe this is a general issue for containerized clusters with regards to maintenance. I haven't checked yet if there are existing tracker issues for this, but maybe it would be worth creating one?


    Quoting "Szabo, Istvan (Agoda)" <Istvan.Szabo@xxxxxxxxx>:

    Actually I don't have a containerized deployment, mine is a normal one. So the lvm migrate should work.

    Istvan Szabo
    Senior Infrastructure Engineer
    ---------------------------------------------------
    Agoda Services Co., Ltd.
    e: istvan.szabo@xxxxxxxxx
    ---------------------------------------------------

    -----Original Message-----
    From: Eugen Block <eblock@xxxxxx>
    Sent: Wednesday, September 29, 2021 8:49 PM
    To: 胡 玮文 <huww98@xxxxxxxxxxx>
    Cc: Igor Fedotov <ifedotov@xxxxxxx>; Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>; ceph-users@xxxxxxx
    Subject: Re: is it possible to remove the db+wal from an external device (nvme)


    That's what I did and pasted the results in my previous comments.


    Quoting 胡 玮文 <huww98@xxxxxxxxxxx>:

    Yes. And the "cephadm shell" command does not depend on the running daemon; it will start a new container. So I think it is perfectly fine to stop the OSD first, then run the "cephadm shell" command and run ceph-volume in the new shell.

    From: Eugen Block <eblock@xxxxxx>
    Sent: September 29, 2021 21:40
    To: 胡 玮文 <huww98@xxxxxxxxxxx>
    Cc: Igor Fedotov <ifedotov@xxxxxxx>; Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>; ceph-users@xxxxxxx
    Subject: Re: is it possible to remove the db+wal from an external device (nvme)

    The OSD has to be stopped in order to migrate DB/WAL, it can't be
    done live. ceph-volume requires a lock on the device.


    Quoting 胡 玮文 <huww98@xxxxxxxxxxx>:

    I’ve not tried it, but how about:

    cephadm shell -n osd.0

    then run “ceph-volume” commands in the newly opened shell. The
    directory structure seems fine.

    $ sudo cephadm shell -n osd.0
    Inferring fsid e88d509a-f6fc-11ea-b25d-a0423f3ac864
    Inferring config /var/lib/ceph/e88d509a-f6fc-11ea-b25d-a0423f3ac864/osd.0/config
    Using recent ceph image cr.example.com/infra/ceph@sha256:8a0f6f285edcd6488e2c91d3f9fa43534d37d7a9b37db1e0ff6691aae6466530
    root@host0:/# ll /var/lib/ceph/osd/ceph-0/
    total 68
    drwx------ 2 ceph ceph 4096 Sep 20 04:15 ./
    drwxr-x--- 1 ceph ceph 4096 Sep 29 13:32 ../
    lrwxrwxrwx 1 ceph ceph   24 Sep 20 04:15 block -> /dev/ceph-hdd/osd.0.data
    lrwxrwxrwx 1 ceph ceph   23 Sep 20 04:15 block.db -> /dev/ubuntu-vg/osd.0.db
    -rw------- 1 ceph ceph   37 Sep 20 04:15 ceph_fsid
    -rw------- 1 ceph ceph  387 Jun 21 13:24 config
    -rw------- 1 ceph ceph   37 Sep 20 04:15 fsid
    -rw------- 1 ceph ceph   55 Sep 20 04:15 keyring
    -rw------- 1 ceph ceph    6 Sep 20 04:15 ready
    -rw------- 1 ceph ceph    3 Apr  2 01:46 require_osd_release
    -rw------- 1 ceph ceph   10 Sep 20 04:15 type
    -rw------- 1 ceph ceph   38 Sep 17 14:26 unit.configured
    -rw------- 1 ceph ceph   48 Nov  9  2020 unit.created
    -rw------- 1 ceph ceph   35 Sep 17 14:26 unit.image
    -rw------- 1 ceph ceph  306 Sep 17 14:26 unit.meta
    -rw------- 1 ceph ceph 1317 Sep 17 14:26 unit.poststop
    -rw------- 1 ceph ceph 3021 Sep 17 14:26 unit.run
    -rw------- 1 ceph ceph  142 Sep 17 14:26 unit.stop
    -rw------- 1 ceph ceph    2 Sep 20 04:15 whoami

    From: Eugen Block <eblock@xxxxxx>
    Sent: September 29, 2021 21:29
    To: Igor Fedotov <ifedotov@xxxxxxx>
    Cc: 胡 玮文 <huww98@xxxxxxxxxxx>; Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>; ceph-users@xxxxxxx
    Subject: Re: Re: Re: [ceph-users] Re: is it possible to remove the db+wal from an external device (nvme)

    Hi Igor,

    thanks for your input. I haven't done this in a prod env yet
    either, still playing around in a virtual lab env.
    I tried the symlink suggestion but it's not that easy, because the layout underneath the ceph directory looks different from what ceph-volume expects. These are the services underneath:

    ses7-host1:~ # ll /var/lib/ceph/152fd738-01bc-11ec-a7fd-fa163e672db2/
    insgesamt 48
    drwx------ 3 root       root   4096 16. Sep 16:11 alertmanager.ses7-host1
    drwx------ 3 ceph       ceph   4096 29. Sep 09:03 crash
    drwx------ 2 ceph       ceph   4096 16. Sep 16:39 crash.ses7-host1
    drwx------ 4 messagebus lp     4096 16. Sep 16:23 grafana.ses7-host1
    drw-rw---- 2 root       root   4096 24. Aug 10:00 home
    drwx------ 2 ceph       ceph   4096 16. Sep 16:37 mgr.ses7-host1.wmgyit
    drwx------ 3 ceph       ceph   4096 16. Sep 16:37 mon.ses7-host1
    drwx------ 2 nobody     nobody 4096 16. Sep 16:37 node-exporter.ses7-host1
    drwx------ 2 ceph       ceph   4096 29. Sep 08:43 osd.0
    drwx------ 2 ceph       ceph   4096 29. Sep 15:11 osd.1
    drwx------ 4 root       root   4096 16. Sep 16:12 prometheus.ses7-host1


    While the directory in a non-containerized deployment looks like this:

    nautilus:~ # ll /var/lib/ceph/osd/ceph-0/
    insgesamt 24
    lrwxrwxrwx 1 ceph ceph 93 29. Sep 12:21 block -> /dev/ceph-a6d78a29-637f-494b-a839-76251fcff67e/osd-block-39340a48-54b3-4689-9896-f54d005c535d
    -rw------- 1 ceph ceph 37 29. Sep 12:21 ceph_fsid
    -rw------- 1 ceph ceph 37 29. Sep 12:21 fsid
    -rw------- 1 ceph ceph 55 29. Sep 12:21 keyring
    -rw------- 1 ceph ceph  6 29. Sep 12:21 ready
    -rw------- 1 ceph ceph 10 29. Sep 12:21 type
    -rw------- 1 ceph ceph  2 29. Sep 12:21 whoami


    But even if I create the symlink to the osd directory it fails
    because I only have ceph-volume within the containers where the
    symlink is not visible to cephadm.


    ses7-host1:~ # ll /var/lib/ceph/osd/ceph-1
    lrwxrwxrwx 1 root root 57 29. Sep 15:08 /var/lib/ceph/osd/ceph-1 -> /var/lib/ceph/152fd738-01bc-11ec-a7fd-fa163e672db2/osd.1/

    ses7-host1:~ # cephadm ceph-volume lvm migrate --osd-id 1 --osd-fsid b4c772aa-07f8-483d-ae58-0ab97b8d0cc4 --from db --target ceph-b1ddff4b-95e8-4b91-b451-a3ea35d16ec0/osd-block-b4c772aa-07f8-483d-ae58-0ab97b8d0cc4
    Inferring fsid 152fd738-01bc-11ec-a7fd-fa163e672db2
    [...]
    /usr/bin/podman: stderr --> Migrate to existing, Source: ['--devs-source', '/var/lib/ceph/osd/ceph-1/block.db'] Target: /var/lib/ceph/osd/ceph-1/block
    /usr/bin/podman: stderr  stdout: inferring bluefs devices from bluestore path
    /usr/bin/podman: stderr  stderr: can't migrate /var/lib/ceph/osd/ceph-1/block.db, not a valid bluefs volume
    /usr/bin/podman: stderr --> Failed to migrate device, error code:1
    /usr/bin/podman: stderr --> Undoing lv tag set
    /usr/bin/podman: stderr Failed to migrate to : ceph-b1ddff4b-95e8-4b91-b451-a3ea35d16ec0/osd-block-b4c772aa-07f8-483d-ae58-0ab97b8d0cc4
    Traceback (most recent call last):
      File "/usr/sbin/cephadm", line 6225, in <module>
        r = args.func()
      File "/usr/sbin/cephadm", line 1363, in _infer_fsid
        return func()
      File "/usr/sbin/cephadm", line 1422, in _infer_image
        return func()
      File "/usr/sbin/cephadm", line 3687, in command_ceph_volume
        out, err, code = call_throws(c.run_cmd(), verbosity=CallVerbosity.VERBOSE)
      File "/usr/sbin/cephadm", line 1101, in call_throws
        raise RuntimeError('Failed command: %s' % ' '.join(command))
    [...]


    I could install the ceph-osd package (where ceph-volume is packaged) but it's not available by default (as you can see, this is a SES 7 environment).

    I'm not sure what the design is here, it feels like the ceph-volume
    migrate command is not applicable to containers yet.

    Regards,
    Eugen


    Quoting Igor Fedotov <ifedotov@xxxxxxx>:

    Hi Eugen,

    indeed this looks like an issue related to containerized deployment, "ceph-volume lvm migrate" expects the osd folder to be under /var/lib/ceph/osd:

    stderr: 2021-09-29T06:56:24.787+0000 7fde05b96180 -1
    bluestore(/var/lib/ceph/osd/ceph-1) _lock_fsid failed to lock
    /var/lib/ceph/osd/ceph-1/fsid (is another ceph-osd still
    running?)(11) Resource temporarily unavailable
    As a workaround you might want to try to create a symlink to your
    actual location before issuing the migrate command:
    /var/lib/ceph/osd ->
    /var/lib/ceph/152fd738-01bc-11ec-a7fd-fa163e672db2/
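
    For example, the per-OSD form of that symlink (matching the layout shown elsewhere in this thread) would be something like:

    ln -s /var/lib/ceph/152fd738-01bc-11ec-a7fd-fa163e672db2/osd.1 /var/lib/ceph/osd/ceph-1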

    A more complicated (and more general, IMO) way would be to run the migrate command from within a container deployed similarly to the ceph-osd one (i.e. with all the proper subfolder mappings). Just speculating - I'm not a big expert in containers and have never tried that with a properly deployed production cluster...


    Thanks,

    Igor

    On 9/29/2021 10:07 AM, Eugen Block wrote:
    Hi,

    I just tried 'ceph-volume lvm migrate' in Octopus but it doesn't really work. I'm not sure if I'm missing something here, but I believe it's again the already discussed containers issue. To run the command for an OSD, the OSD has to be offline, but then you don't have access to the block.db because the path is different from outside the container:

    ---snip---
    [ceph: root@host1 /]# ceph-volume lvm migrate --osd-id 1 --osd-fsid b4c772aa-07f8-483d-ae58-0ab97b8d0cc4 --from db --target ceph-b1ddff4b-95e8-4b91-b451-a3ea35d16ec0/osd-block-b4c772aa-07f8-483d-ae58-0ab97b8d0cc4
    --> Migrate to existing, Source: ['--devs-source', '/var/lib/ceph/osd/ceph-1/block.db'] Target: /var/lib/ceph/osd/ceph-1/block
     stdout: inferring bluefs devices from bluestore path
     stderr: /home/abuild/rpmbuild/BUILD/ceph-15.2.14-84-gb6e5642e260/src/os/bluestore/BlueStore.cc: In function 'int BlueStore::_mount_for_bluefs()' thread 7fde05b96180 time 2021-09-29T06:56:24.790161+0000
     stderr: /home/abuild/rpmbuild/BUILD/ceph-15.2.14-84-gb6e5642e260/src/os/bluestore/BlueStore.cc: 6876: FAILED ceph_assert(r == 0)
     stderr: 2021-09-29T06:56:24.787+0000 7fde05b96180 -1 bluestore(/var/lib/ceph/osd/ceph-1) _lock_fsid failed to lock /var/lib/ceph/osd/ceph-1/fsid (is another ceph-osd still running?)(11) Resource temporarily unavailable

    # path outside
    host1:~ # ll /var/lib/ceph/152fd738-01bc-11ec-a7fd-fa163e672db2/osd.1/
    insgesamt 60
    lrwxrwxrwx 1 ceph ceph 93 29. Sep 08:43 block -> /dev/ceph-b1ddff4b-95e8-4b91-b451-a3ea35d16ec0/osd-block-b4c772aa-07f8-483d-ae58-0ab97b8d0cc4
    lrwxrwxrwx 1 ceph ceph 90 29. Sep 08:43 block.db -> /dev/ceph-6f1b8f49-daf2-4631-a2ef-12e9452b01ea/osd-db-69b11aa0-af96-443e-8f03-5afa5272131f
    ---snip---


    But if I shut down the OSD I can't access the block and block.db devices. I'm not even sure how this is supposed to work with cephadm. Maybe I'm misunderstanding, though. Or is there a way to provide the offline block.db path to 'ceph-volume lvm migrate'?



    Quoting 胡 玮文 <huww98@xxxxxxxxxxx>:

    You may need to use `ceph-volume lvm migrate' [1] instead of ceph-bluestore-tool. If I recall correctly, this is a pretty new feature, and I'm not sure whether it is available in your version.

    If you use ceph-bluestore-tool, then you need to modify the LVM tags manually. Please refer to the previous threads, e.g. [2] and some more.

    [1]: https://docs.ceph.com/en/latest/man/8/ceph-volume/#migrate
    [2]: https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/VX23NQ66P3PPEX36T3PYYMHPLBSFLMYA/#JLNDFGXR4ZLY27DHD3RJTTZEDHRZJO4Q
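
    For illustration, a rough sketch of inspecting and hand-editing those LVM tags (the values below are placeholders - check your own `lvs -o lv_tags` output before changing anything):

    lvs -o lv_name,lv_tags                               # shows ceph.db_device, ceph.db_uuid, ...
    lvchange --deltag "ceph.db_device=<old value>" <vg>/<osd-block-lv>
    lvchange --deltag "ceph.db_uuid=<old uuid>" <vg>/<osd-block-lv>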

    From: Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>
    Sent: September 28, 2021 18:20
    To: Eugen Block <eblock@xxxxxx>; ceph-users@xxxxxxx
    Subject: Re: is it possible to remove the db+wal from an external device (nvme)

    Gave it a try, and all 3 OSDs eventually failed :/ Not sure what went wrong.

    Do the normal maintenance things, ceph osd set noout, ceph osd set norebalance, stop the osd and run this command:

    ceph-bluestore-tool bluefs-bdev-migrate --dev-target /var/lib/ceph/osd/ceph-0/block --devs-source /var/lib/ceph/osd/ceph-8/block.db --path /var/lib/ceph/osd/ceph-8/

    Output:
    device removed:1 /var/lib/ceph/osd/ceph-8/block.db
    device added: 1 /dev/dm-2

    When I tried to start it I got this in the log:

    osd.8 0 OSD:init: unable to mount object store
    ** ERROR: osd init failed: (13) Permission denied
    set uid:gid to 167:167 (ceph:ceph)
    ceph version 15.2.13 (c44bc49e7a57a87d84dfff2a077a2058aa2172e2) octopus (stable), process ceph-osd, pid 1512261
    pidfile_write: ignore empty --pid-file

    From the other 2 OSDs the block.db was removed and I could start them back. I zapped the db drive just to remove it from the device completely, and after a machine restart none of these 2 OSDs came back - I guess they are missing the db device.

    Are there any steps missing?
    1. noout + norebalance
    2. Stop the osd
    3. Migrate the block.db to the block with the above command.
    4. Do the same on the other OSDs that share the db device I want to remove.
    5. Zap the db device
    6. Start the OSDs back.
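
    For comparison, a rough sketch of the same flow with ceph-volume lvm migrate (which also updates the LVM tags, unlike a bare ceph-bluestore-tool run); the osd id, fsid and LV names below are placeholders:

    ceph osd set noout && ceph osd set norebalance
    systemctl stop ceph-osd@8
    ceph-volume lvm migrate --osd-id 8 --osd-fsid <osd-8-fsid> --from db --target <vg>/<osd-8-block-lv>
    systemctl start ceph-osd@8
    # repeat for the other OSDs sharing the db device, then zap it:
    ceph-volume lvm zap --destroy /dev/<db-device>
    ceph osd unset noout && ceph osd unset norebalance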

    Istvan Szabo
    Senior Infrastructure Engineer
    ---------------------------------------------------
    Agoda Services Co., Ltd.
    e: istvan.szabo@xxxxxxxxx
    ---------------------------------------------------

    -----Original Message-----
    From: Eugen Block <eblock@xxxxxx>
    Sent: Monday, September 27, 2021 7:42 PM
    To: ceph-users@xxxxxxx
    Subject: Re: is it possible to remove the db+wal from an external device (nvme)


    Hi,

    I think 'ceph-bluestore-tool bluefs-bdev-migrate' could be of
    use here. I haven't tried it in a production environment yet,
    only in virtual labs.

    Regards,
    Eugen


    Quoting "Szabo, Istvan (Agoda)" <Istvan.Szabo@xxxxxxxxx>:

    Hi,

    Seems like in our config the nvme device as wal+db in front of the SSDs is slowing down the SSD OSDs.
    I'd like to avoid rebuilding all the OSDs - is there a way to somehow migrate the wal+db to the "slower device" without a reinstall?

    Ty

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx




