This one is in messages: https://justpaste.it/3x08z

bluefs_buffered_io is turned on by default in 15.2.14 Octopus, FYI.

Istvan Szabo
Senior Infrastructure Engineer
---------------------------------------------------
Agoda Services Co., Ltd.
e: istvan.szabo@xxxxxxxxx
---------------------------------------------------

-----Original Message-----
From: Eugen Block <eblock@xxxxxx>
Sent: Tuesday, October 5, 2021 9:52 PM
To: Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>
Cc: 胡 玮文 <huww98@xxxxxxxxxxx>; Igor Fedotov <ifedotov@xxxxxxx>; ceph-users@xxxxxxx
Subject: Re: Re: is it possible to remove the db+wal from an external device (nvme)

Do you see OOM killers in dmesg on this host? This line indicates it:

"(tcmalloc::allocate_full_cpp_throw_oom(unsigned long)+0x146) [0x7f310b7d8c96]",
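Checking for that is quick (a minimal sketch; osd.48 is the daemon from the crash report quoted below):

# kernel log entries left by the OOM killer
dmesg -T | grep -iE 'out of memory|oom-killer|killed process'

# how much memory each OSD is allowed to use for its caches (default 4 GiB)
ceph config get osd osd_memory_target

# memory pool breakdown of the affected OSD (only works while the daemon is running)
ceph daemon osd.48 dump_mempools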
"ceph-osd", > "stack_sig": > "6a43b6c219adac393b239fbea4a53ff87c4185bcd213724f0d721b452b81ddbf", > "timestamp": "2021-10-05T13:31:28.513463Z", > "utsname_hostname": "server-2s07", > "utsname_machine": "x86_64", > "utsname_release": "4.18.0-305.19.1.el8_4.x86_64", > "utsname_sysname": "Linux", > "utsname_version": "#1 SMP Wed Sep 15 15:39:39 UTC 2021" > } > Istvan Szabo > Senior Infrastructure Engineer > --------------------------------------------------- > Agoda Services Co., Ltd. > e: istvan.szabo@xxxxxxxxx<mailto:istvan.szabo@xxxxxxxxx> > --------------------------------------------------- > > From: 胡 玮文 <huww98@xxxxxxxxxxx> > Sent: Monday, October 4, 2021 12:13 AM > To: Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>; Igor Fedotov > <ifedotov@xxxxxxx> > Cc: ceph-users@xxxxxxx > Subject: 回复: Re: is it possible to remove the db+wal from > an external device (nvme) > > Email received from the internet. If in doubt, don't click any link > nor open any attachment ! > ________________________________ > The stack trace (tcmalloc::allocate_full_cpp_throw_oom) seems > indicating you don’t have enough memory. > > 发件人: Szabo, Istvan (Agoda)<mailto:Istvan.Szabo@xxxxxxxxx> > 发送时间: 2021年10月4日 0:46 > 收件人: Igor Fedotov<mailto:ifedotov@xxxxxxx> > 抄送: ceph-users@xxxxxxx<mailto:ceph-users@xxxxxxx> > 主题: Re: is it possible to remove the db+wal from an > external device (nvme) > > Seems like it cannot start anymore once migrated ☹ > > https://justpaste.it/5hkot > > Istvan Szabo > Senior Infrastructure Engineer > --------------------------------------------------- > Agoda Services Co., Ltd. > e: > istvan.szabo@xxxxxxxxx<mailto:istvan.szabo@xxxxxxxxx<mailto:istvan.sza > bo@xxxxxxxxx%3cmailto:istvan.szabo@xxxxxxxxx>> > --------------------------------------------------- > > From: Igor Fedotov <ifedotov@xxxxxxx<mailto:ifedotov@xxxxxxx>> > Sent: Saturday, October 2, 2021 5:22 AM > To: Szabo, Istvan (Agoda) > <Istvan.Szabo@xxxxxxxxx<mailto:Istvan.Szabo@xxxxxxxxx>> > Cc: ceph-users@xxxxxxx<mailto:ceph-users@xxxxxxx>; Eugen Block > <eblock@xxxxxx<mailto:eblock@xxxxxx>>; Christian Wuerdig > <christian.wuerdig@xxxxxxxxx<mailto:christian.wuerdig@xxxxxxxxx>> > Subject: Re: Re: is it possible to remove the db+wal from > an external device (nvme) > > Email received from the internet. If in doubt, don't click any link > nor open any attachment ! > ________________________________ > > Hi Istvan, > > yeah both db and wal to slow migration are supported. And spillover > state isn't a show stopper for that. > > > On 10/2/2021 1:16 AM, Szabo, Istvan (Agoda) wrote: > Dear Igor, > > Is the ceph-volume lvm migrate command smart enough in octopus > 15.2.14 to be able to remove the db (included the wall) from the nvme > even if it is spilledover? I can’t compact back to normal many disk to > not show spillover warning. > > I think Christian has the truth of the issue, my Nvme with 30k rand > write iops backing 3x ssd with 67k rand write iops each … > > Istvan Szabo > Senior Infrastructure Engineer > --------------------------------------------------- > Agoda Services Co., Ltd. > e: > istvan.szabo@xxxxxxxxx<mailto:istvan.szabo@xxxxxxxxx<mailto:istvan.sza > bo@xxxxxxxxx%3cmailto:istvan.szabo@xxxxxxxxx>> > --------------------------------------------------- > > > On 2021. Oct 1., at 11:47, Szabo, Istvan (Agoda) > <Istvan.Szabo@xxxxxxxxx<mailto:Istvan.Szabo@xxxxxxxxx>><mailto:Istvan. 
>
> On 10/2/2021 1:16 AM, Szabo, Istvan (Agoda) wrote:
>
> Dear Igor,
>
> Is the ceph-volume lvm migrate command smart enough in Octopus
> 15.2.14 to remove the DB (including the WAL) from the NVMe even if it
> has spilled over? I can't compact many disks back to normal so that
> they stop showing the spillover warning.
>
> I think Christian has the truth of the issue: my NVMe with 30k random
> write IOPS is backing 3x SSDs with 67k random write IOPS each …
>
> On 2021. Oct 1., at 11:47, Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx> wrote:
>
> 3x SSD OSDs per NVMe.
>
> -----Original Message-----
> From: Igor Fedotov <ifedotov@xxxxxxx>
> Sent: Friday, October 1, 2021 4:35 PM
> To: ceph-users@xxxxxxx
> Subject: Re: is it possible to remove the db+wal from an external device (nvme)
>
> And how many OSDs per single NVMe do you have?
>
> On 10/1/2021 9:55 AM, Szabo, Istvan (Agoda) wrote:
>
> I have my dashboards and I can see that the DB NVMes are always
> running at 100% utilization (you can monitor it with iostat -x 1), and
> it generates iowait all the time, between 1 and 3.
>
> I'm using NVMe in front of the SSDs.
>
> From: Victor Hooi <victorhooi@xxxxxxxxx>
> Sent: Friday, October 1, 2021 5:30 AM
> To: Eugen Block <eblock@xxxxxx>
> Cc: Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>; 胡 玮文 <huww98@xxxxxxxxxxx>; ceph-users <ceph-users@xxxxxxx>
> Subject: Re: Re: is it possible to remove the db+wal from an external device (nvme)
>
> Hi,
>
> I'm curious - how did you tell that the separate WAL+DB volume was
> slowing things down? I assume you did some benchmarking - is there any
> chance you'd be willing to share the results? (Or anybody else that's
> been in a similar situation.)
>
> What sorts of devices are you using for the WAL+DB, versus the data disks?
>
> We're using NAND SSDs, with Optanes for the WAL+DB, and on some
> systems I am seeing slower than expected behaviour - I need to dive
> deeper into it.
>
> In my case, I was running with 4 or 2 OSDs per Optane volume:
>
> https://www.reddit.com/r/ceph/comments/k2lef1/how_many_waldb_partitions_can_you_run_per_optane/
>
> but I couldn't seem to get the results I'd expected - so I'm curious
> what people are seeing in the real world - and of course, we might
> need to follow the steps here to remove them as well.
>
> Thanks,
> Victor
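One way to answer the "how did you tell" question with data is to watch the shared DB/WAL device next to the data disks while the cluster is under load, which is what the iostat numbers above refer to (a minimal sketch; the OSD id is just an example):

# extended per-device stats; compare %util and await of the NVMe (DB/WAL) device vs. the SSD data devices
iostat -x 1

# OSDs whose RocksDB has spilled from the fast device to the slow one
ceph health detail | grep -i BLUEFS_SPILLOVER

# which devices back the DB for a given OSD
ceph osd metadata 48 | grep -i bluefs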
> On Thu, 30 Sept 2021 at 16:10, Eugen Block <eblock@xxxxxx> wrote:
>
> Yes, I believe for you it should work without containers, although I
> haven't tried the migrate command in a non-containerized cluster yet.
> But I believe this is a general issue for containerized clusters with
> regards to maintenance. I haven't checked yet if there are existing
> tracker issues for this, but maybe it would be worth creating one?
>
> Quoting "Szabo, Istvan (Agoda)" <Istvan.Szabo@xxxxxxxxx>:
>
> Actually I don't have a containerized deployment, mine is a normal
> one. So the lvm migrate should work.
>
> -----Original Message-----
> From: Eugen Block <eblock@xxxxxx>
> Sent: Wednesday, September 29, 2021 8:49 PM
> To: 胡 玮文 <huww98@xxxxxxxxxxx>
> Cc: Igor Fedotov <ifedotov@xxxxxxx>; Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>; ceph-users@xxxxxxx
> Subject: Re: is it possible to remove the db+wal from an external device (nvme)
>
> That's what I did and pasted the results in my previous comments.
>
> Quoting 胡 玮文 <huww98@xxxxxxxxxxx>:
>
> Yes. And the "cephadm shell" command does not depend on the running
> daemon; it will start a new container. So I think it is perfectly fine
> to stop the OSD first, then run the "cephadm shell" command, and run
> ceph-volume in the new shell.
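For a cephadm-managed OSD, the sequence 胡 玮文 describes would roughly be the following (an untested sketch; osd.1 and the target VG/LV are placeholders taken from the examples further down the thread):

ceph osd set noout
ceph orch daemon stop osd.1

# opens a container with this OSD's config, keyring and /var/lib/ceph/osd/ceph-1 mapped in
cephadm shell -n osd.1

# inside the shell:
ceph-volume lvm migrate --osd-id 1 --osd-fsid <osd-fsid> --from db --target <vg>/<osd-block-lv>
exit

ceph orch daemon start osd.1
ceph osd unset noout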
>
> From: Eugen Block <eblock@xxxxxx>
> Sent: September 29, 2021 21:40
> To: 胡 玮文 <huww98@xxxxxxxxxxx>
> Cc: Igor Fedotov <ifedotov@xxxxxxx>; Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>; ceph-users@xxxxxxx
> Subject: Re: is it possible to remove the db+wal from an external device (nvme)
>
> The OSD has to be stopped in order to migrate DB/WAL, it can't be done
> live. ceph-volume requires a lock on the device.
>
> Quoting 胡 玮文 <huww98@xxxxxxxxxxx>:
>
> I've not tried it, but how about:
>
> cephadm shell -n osd.0
>
> then run "ceph-volume" commands in the newly opened shell. The
> directory structure seems fine.
>
> $ sudo cephadm shell -n osd.0
> Inferring fsid e88d509a-f6fc-11ea-b25d-a0423f3ac864
> Inferring config /var/lib/ceph/e88d509a-f6fc-11ea-b25d-a0423f3ac864/osd.0/config
> Using recent ceph image cr.example.com/infra/ceph@sha256:8a0f6f285edcd6488e2c91d3f9fa43534d37d7a9b37db1e0ff6691aae6466530
> root@host0:/# ll /var/lib/ceph/osd/ceph-0/
> total 68
> drwx------ 2 ceph ceph 4096 Sep 20 04:15 ./
> drwxr-x--- 1 ceph ceph 4096 Sep 29 13:32 ../
> lrwxrwxrwx 1 ceph ceph   24 Sep 20 04:15 block -> /dev/ceph-hdd/osd.0.data
> lrwxrwxrwx 1 ceph ceph   23 Sep 20 04:15 block.db -> /dev/ubuntu-vg/osd.0.db
> -rw------- 1 ceph ceph   37 Sep 20 04:15 ceph_fsid
> -rw------- 1 ceph ceph  387 Jun 21 13:24 config
> -rw------- 1 ceph ceph   37 Sep 20 04:15 fsid
> -rw------- 1 ceph ceph   55 Sep 20 04:15 keyring
> -rw------- 1 ceph ceph    6 Sep 20 04:15 ready
> -rw------- 1 ceph ceph    3 Apr  2 01:46 require_osd_release
> -rw------- 1 ceph ceph   10 Sep 20 04:15 type
> -rw------- 1 ceph ceph   38 Sep 17 14:26 unit.configured
> -rw------- 1 ceph ceph   48 Nov  9  2020 unit.created
> -rw------- 1 ceph ceph   35 Sep 17 14:26 unit.image
> -rw------- 1 ceph ceph  306 Sep 17 14:26 unit.meta
> -rw------- 1 ceph ceph 1317 Sep 17 14:26 unit.poststop
> -rw------- 1 ceph ceph 3021 Sep 17 14:26 unit.run
> -rw------- 1 ceph ceph  142 Sep 17 14:26 unit.stop
> -rw------- 1 ceph ceph    2 Sep 20 04:15 whoami
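Before running the migrate, it may be worth confirming which LVs actually back block and block.db for the OSD in question; a small sketch (also works from inside the cephadm shell):

# ceph-volume's own view of the OSD's volumes, including db/wal devices
ceph-volume lvm list

# the same metadata is kept as LVM tags (ceph.block_device, ceph.db_device, ...)
lvs -o lv_name,vg_name,lv_tags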
>
> From: Eugen Block <eblock@xxxxxx>
> Sent: September 29, 2021 21:29
> To: Igor Fedotov <ifedotov@xxxxxxx>
> Cc: 胡 玮文 <huww98@xxxxxxxxxxx>; Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>; ceph-users@xxxxxxx
> Subject: Re: Re: Re: [ceph-users] Re: is it possible to remove the db+wal from an external device (nvme)
>
> Hi Igor,
>
> thanks for your input. I haven't done this in a prod env yet either,
> still playing around in a virtual lab env.
> I tried the symlink suggestion but it's not that easy, because it
> looks different underneath the ceph directory than ceph-volume
> expects. These are the services underneath:
>
> ses7-host1:~ # ll /var/lib/ceph/152fd738-01bc-11ec-a7fd-fa163e672db2/
> total 48
> drwx------ 3 root       root   4096 16. Sep 16:11 alertmanager.ses7-host1
> drwx------ 3 ceph       ceph   4096 29. Sep 09:03 crash
> drwx------ 2 ceph       ceph   4096 16. Sep 16:39 crash.ses7-host1
> drwx------ 4 messagebus lp     4096 16. Sep 16:23 grafana.ses7-host1
> drw-rw---- 2 root       root   4096 24. Aug 10:00 home
> drwx------ 2 ceph       ceph   4096 16. Sep 16:37 mgr.ses7-host1.wmgyit
> drwx------ 3 ceph       ceph   4096 16. Sep 16:37 mon.ses7-host1
> drwx------ 2 nobody     nobody 4096 16. Sep 16:37 node-exporter.ses7-host1
> drwx------ 2 ceph       ceph   4096 29. Sep 08:43 osd.0
> drwx------ 2 ceph       ceph   4096 29. Sep 15:11 osd.1
> drwx------ 4 root       root   4096 16. Sep 16:12 prometheus.ses7-host1
>
> While the directory in a non-containerized deployment looks like this:
>
> nautilus:~ # ll /var/lib/ceph/osd/ceph-0/
> total 24
> lrwxrwxrwx 1 ceph ceph 93 29. Sep 12:21 block -> /dev/ceph-a6d78a29-637f-494b-a839-76251fcff67e/osd-block-39340a48-54b3-4689-9896-f54d005c535d
> -rw------- 1 ceph ceph 37 29. Sep 12:21 ceph_fsid
> -rw------- 1 ceph ceph 37 29. Sep 12:21 fsid
> -rw------- 1 ceph ceph 55 29. Sep 12:21 keyring
> -rw------- 1 ceph ceph  6 29. Sep 12:21 ready
> -rw------- 1 ceph ceph 10 29. Sep 12:21 type
> -rw------- 1 ceph ceph  2 29. Sep 12:21 whoami
>
> But even if I create the symlink to the osd directory it fails,
> because I only have ceph-volume within the containers, where the
> symlink is not visible to cephadm.
>
> ses7-host1:~ # ll /var/lib/ceph/osd/ceph-1
> lrwxrwxrwx 1 root root 57 29. Sep 15:08 /var/lib/ceph/osd/ceph-1 -> /var/lib/ceph/152fd738-01bc-11ec-a7fd-fa163e672db2/osd.1/
>
> ses7-host1:~ # cephadm ceph-volume lvm migrate --osd-id 1 --osd-fsid b4c772aa-07f8-483d-ae58-0ab97b8d0cc4 --from db --target ceph-b1ddff4b-95e8-4b91-b451-a3ea35d16ec0/osd-block-b4c772aa-07f8-483d-ae58-0ab97b8d0cc4
> Inferring fsid 152fd738-01bc-11ec-a7fd-fa163e672db2
> [...]
> /usr/bin/podman: stderr --> Migrate to existing, Source: ['--devs-source', '/var/lib/ceph/osd/ceph-1/block.db'] Target: /var/lib/ceph/osd/ceph-1/block
> /usr/bin/podman: stderr  stdout: inferring bluefs devices from bluestore path
> /usr/bin/podman: stderr  stderr: can't migrate /var/lib/ceph/osd/ceph-1/block.db, not a valid bluefs volume
> /usr/bin/podman: stderr --> Failed to migrate device, error code:1
> /usr/bin/podman: stderr --> Undoing lv tag set
> /usr/bin/podman: stderr Failed to migrate to : ceph-b1ddff4b-95e8-4b91-b451-a3ea35d16ec0/osd-block-b4c772aa-07f8-483d-ae58-0ab97b8d0cc4
> Traceback (most recent call last):
>   File "/usr/sbin/cephadm", line 6225, in <module>
>     r = args.func()
>   File "/usr/sbin/cephadm", line 1363, in _infer_fsid
>     return func()
>   File "/usr/sbin/cephadm", line 1422, in _infer_image
>     return func()
>   File "/usr/sbin/cephadm", line 3687, in command_ceph_volume
>     out, err, code = call_throws(c.run_cmd(), verbosity=CallVerbosity.VERBOSE)
>   File "/usr/sbin/cephadm", line 1101, in call_throws
>     raise RuntimeError('Failed command: %s' % ' '.join(command))
> [...]
>
> I could install the package ceph-osd (where ceph-volume is packaged),
> but it's not available by default (as you can see, this is a SES 7
> environment).
>
> I'm not sure what the design is here; it feels like the ceph-volume
> migrate command is not applicable to containers yet.
>
> Regards,
> Eugen
>
> Quoting Igor Fedotov <ifedotov@xxxxxxx>:
>
> Hi Eugen,
>
> indeed, this looks like an issue related to containerized deployment:
> "ceph-volume lvm migrate" expects the osd folder to be under
> /var/lib/ceph/osd:
>
> stderr: 2021-09-29T06:56:24.787+0000 7fde05b96180 -1 bluestore(/var/lib/ceph/osd/ceph-1) _lock_fsid failed to lock /var/lib/ceph/osd/ceph-1/fsid (is another ceph-osd still running?)(11) Resource temporarily unavailable
>
> As a workaround you might want to try to create a symlink to your
> actual location before issuing the migrate command:
> /var/lib/ceph/osd -> /var/lib/ceph/152fd738-01bc-11ec-a7fd-fa163e672db2/
>
> A more complicated (and more general, IMO) way would be to run the
> migrate command from within a container deployed similarly (i.e. with
> all the proper subfolder mappings) to the ceph-osd one. Just
> speculating - I'm not a big expert in containers and have never tried
> that with a properly deployed production cluster...
>
> Thanks,
>
> Igor
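Since the failure quoted above is a _lock_fsid "is another ceph-osd still running?" error, it may also be worth verifying that nothing still holds the OSD's files before retrying; a sketch using the names from Eugen's environment:

# is the containerized OSD really stopped?
ceph orch ps | grep osd.1
systemctl status ceph-152fd738-01bc-11ec-a7fd-fa163e672db2@osd.1.service

# does any process still hold the fsid file open?
fuser -v /var/lib/ceph/152fd738-01bc-11ec-a7fd-fa163e672db2/osd.1/fsid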
> On 9/29/2021 10:07 AM, Eugen Block wrote:
>
> Hi,
>
> I just tried with 'ceph-volume lvm migrate' in Octopus but it doesn't
> really work. I'm not sure if I'm missing something here, but I believe
> it's again the already discussed containers issue.
> To be able to run the command for an OSD, the OSD has to be offline,
> but then you don't have access to the block.db because the path is
> different from outside the container:
>
> ---snip---
> [ceph: root@host1 /]# ceph-volume lvm migrate --osd-id 1 --osd-fsid b4c772aa-07f8-483d-ae58-0ab97b8d0cc4 --from db --target ceph-b1ddff4b-95e8-4b91-b451-a3ea35d16ec0/osd-block-b4c772aa-07f8-483d-ae58-0ab97b8d0cc4
> --> Migrate to existing, Source: ['--devs-source', '/var/lib/ceph/osd/ceph-1/block.db'] Target: /var/lib/ceph/osd/ceph-1/block
>  stdout: inferring bluefs devices from bluestore path
>  stderr: /home/abuild/rpmbuild/BUILD/ceph-15.2.14-84-gb6e5642e260/src/os/bluestore/BlueStore.cc: In function 'int BlueStore::_mount_for_bluefs()' thread 7fde05b96180 time 2021-09-29T06:56:24.790161+0000
>  stderr: /home/abuild/rpmbuild/BUILD/ceph-15.2.14-84-gb6e5642e260/src/os/bluestore/BlueStore.cc: 6876: FAILED ceph_assert(r == 0)
>  stderr: 2021-09-29T06:56:24.787+0000 7fde05b96180 -1 bluestore(/var/lib/ceph/osd/ceph-1) _lock_fsid failed to lock /var/lib/ceph/osd/ceph-1/fsid (is another ceph-osd still running?)(11) Resource temporarily unavailable
>
> # path outside
> host1:~ # ll /var/lib/ceph/152fd738-01bc-11ec-a7fd-fa163e672db2/osd.1/
> total 60
> lrwxrwxrwx 1 ceph ceph 93 29. Sep 08:43 block -> /dev/ceph-b1ddff4b-95e8-4b91-b451-a3ea35d16ec0/osd-block-b4c772aa-07f8-483d-ae58-0ab97b8d0cc4
> lrwxrwxrwx 1 ceph ceph 90 29. Sep 08:43 block.db -> /dev/ceph-6f1b8f49-daf2-4631-a2ef-12e9452b01ea/osd-db-69b11aa0-af96-443e-8f03-5afa5272131f
> ---snip---
>
> But if I shut down the OSD I can't access the block and block.db
> devices. I'm not even sure how this is supposed to work with cephadm.
> Maybe I'm misunderstanding, though. Or is there a way to provide the
> offline block.db path to 'ceph-volume lvm migrate'?
>
> Quoting 胡 玮文 <huww98@xxxxxxxxxxx>:
>
> You may need to use `ceph-volume lvm migrate' [1] instead of
> ceph-bluestore-tool. If I recall correctly, this is a pretty new
> feature, so I'm not sure whether it is available in your version.
>
> If you use ceph-bluestore-tool, then you need to modify the LVM tags
> manually. Please refer to previous threads, e.g. [2] and some more.
>
> [1]: https://docs.ceph.com/en/latest/man/8/ceph-volume/#migrate
> [2]: https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/VX23NQ66P3PPEX36T3PYYMHPLBSFLMYA/#JLNDFGXR4ZLY27DHD3RJTTZEDHRZJO4Q
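For the ceph-bluestore-tool route, the manual LVM tag clean-up mentioned above would look roughly like this (a sketch only; the exact tag values must be read from your own volumes first, and the LV names below are placeholders):

# inspect the ceph.* tags ceph-volume stored on the OSD's block LV
lvs -o lv_name,vg_name,lv_tags | grep osd-block

# after the DB has been folded into the block device, remove the stale DB references
lvchange --deltag "ceph.db_device=<old db device path>" <vg>/<osd-block-lv>
lvchange --deltag "ceph.db_uuid=<old db uuid>" <vg>/<osd-block-lv>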
>
> From: Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>
> Sent: September 28, 2021 18:20
> To: Eugen Block <eblock@xxxxxx>; ceph-users@xxxxxxx
> Subject: Re: is it possible to remove the db+wal from an external device (nvme)
>
> Gave it a try, and all 3 OSDs finally failed :/ Not sure what went wrong.
>
> I did the normal maintenance things (ceph osd set noout, ceph osd set
> norebalance), stopped the OSD and ran this command:
>
> ceph-bluestore-tool bluefs-bdev-migrate --dev-target /var/lib/ceph/osd/ceph-0/block --devs-source /var/lib/ceph/osd/ceph-8/block.db --path /var/lib/ceph/osd/ceph-8/
>
> Output:
> device removed:1 /var/lib/ceph/osd/ceph-8/block.db
> device added: 1 /dev/dm-2
>
> When I tried to start it, I got this in the log:
>
> osd.8 0 OSD:init: unable to mount object store
> ** ERROR: osd init failed: (13) Permission denied
> set uid:gid to 167:167 (ceph:ceph)
> ceph version 15.2.13 (c44bc49e7a57a87d84dfff2a077a2058aa2172e2) octopus (stable), process ceph-osd, pid 1512261
> pidfile_write: ignore empty --pid-file
>
> From the other 2 OSDs the block.db was removed and I could start them
> back up. I zapped the db drive just to remove it from the device
> completely, and after a machine restart neither of these 2 OSDs came
> back; I guess they are missing the db device.
>
> Are any steps missing?
> 1. noout + norebalance
> 2. Stop the OSD.
> 3. Migrate the block.db to the block device with the above command.
> 4. Do the same on the other OSDs that share the db device I want to remove.
> 5. Zap the db device.
> 6. Start the OSDs back up.
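Two hedged observations on the attempt above, not a verified diagnosis: the quoted command mixes ceph-0 and ceph-8 paths (all three arguments should refer to the same OSD), and a "(13) Permission denied" on start is frequently just ownership of the new block target, so checking that is cheap. The two OSDs that did not come back after the db device was zapped may also be hitting the stale LVM tags mentioned earlier in the thread. A consistent single-OSD invocation would look roughly like:

# everything refers to osd.8
ceph-bluestore-tool bluefs-bdev-migrate \
    --path /var/lib/ceph/osd/ceph-8 \
    --devs-source /var/lib/ceph/osd/ceph-8/block.db \
    --dev-target /var/lib/ceph/osd/ceph-8/block

# verify the block symlink and its target are owned by ceph:ceph before starting the OSD
ls -lL /var/lib/ceph/osd/ceph-8/block
chown -h ceph:ceph /var/lib/ceph/osd/ceph-8/block
chown ceph:ceph "$(readlink -f /var/lib/ceph/osd/ceph-8/block)"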
>
> -----Original Message-----
> From: Eugen Block <eblock@xxxxxx>
> Sent: Monday, September 27, 2021 7:42 PM
> To: ceph-users@xxxxxxx
> Subject: Re: is it possible to remove the db+wal from an external device (nvme)
>
> Hi,
>
> I think 'ceph-bluestore-tool bluefs-bdev-migrate' could be of use
> here. I haven't tried it in a production environment yet, only in
> virtual labs.
>
> Regards,
> Eugen
>
> Quoting "Szabo, Istvan (Agoda)" <Istvan.Szabo@xxxxxxxxx>:
>
> Hi,
>
> Seems like in our config the NVMe device serving as WAL+DB in front of
> the SSDs is slowing down the SSD OSDs.
> I'd like to avoid rebuilding all the OSDs - is there a way to somehow
> migrate the WAL+DB to the "slower device" without a reinstall?
>
> Ty
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx