Re: crashing OSD: ceph_assert(is_valid_io(off, len))

When I try
ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-$OSD fsck
I get:

2020-06-08 16:05:39.393 7fc589500d80 1 bluestore(/var/lib/ceph/osd/ceph-244) _mount path /var/lib/ceph/osd/ceph-244
2020-06-08 16:05:39.393 7fc589500d80 1 bdev create path /var/lib/ceph/osd/ceph-244/block type kernel
2020-06-08 16:05:39.393 7fc589500d80 1 bdev(0x559556ae8700 /var/lib/ceph/osd/ceph-244/block) open path /var/lib/ceph/osd/ceph-244/block
2020-06-08 16:05:39.393 7fc589500d80 1 bdev(0x559556ae8700 /var/lib/ceph/osd/ceph-244/block) open size 4000783007744 (0x3a381400000, 3.6 TiB) block_size 4096 (4 KiB) rotational discard not supported
2020-06-08 16:05:39.397 7fc589500d80 1 bluestore(/var/lib/ceph/osd/ceph-244) _set_cache_sizes cache_size 1073741824 meta 0.4 kv 0.4 data 0.2
2020-06-08 16:05:39.397 7fc589500d80 1 bdev create path /var/lib/ceph/osd/ceph-244/block.db type kernel
2020-06-08 16:05:39.397 7fc589500d80 1 bdev(0x559556ae8e00 /var/lib/ceph/osd/ceph-244/block.db) open path /var/lib/ceph/osd/ceph-244/block.db
2020-06-08 16:05:39.397 7fc589500d80 1 bdev(0x559556ae8e00 /var/lib/ceph/osd/ceph-244/block.db) open size 19855835136 (0x49f800000, 18 GiB) block_size 4096 (4 KiB) non-rotational discard supported
2020-06-08 16:05:39.397 7fc589500d80 1 bluefs add_block_device bdev 1 path /var/lib/ceph/osd/ceph-244/block.db size 18 GiB
2020-06-08 16:05:39.397 7fc589500d80 1 bdev create path /var/lib/ceph/osd/ceph-244/block type kernel
2020-06-08 16:05:39.397 7fc589500d80 1 bdev(0x559556ae9180 /var/lib/ceph/osd/ceph-244/block) open path /var/lib/ceph/osd/ceph-244/block
2020-06-08 16:05:39.397 7fc589500d80 1 bdev(0x559556ae9180 /var/lib/ceph/osd/ceph-244/block) open size 4000783007744 (0x3a381400000, 3.6 TiB) block_size 4096 (4 KiB) rotational discard not supported
2020-06-08 16:05:39.397 7fc589500d80 1 bluefs add_block_device bdev 2 path /var/lib/ceph/osd/ceph-244/block size 3.6 TiB
2020-06-08 16:05:39.397 7fc589500d80  1 bluefs mount
2020-06-08 16:05:39.397 7fc589500d80 1 bluefs _init_alloc id 1 alloc_size 0x100000 size 0x49f800000
2020-06-08 16:05:39.397 7fc589500d80 1 bluefs _init_alloc id 2 alloc_size 0x10000 size 0x3a381400000
*** Caught signal (Segmentation fault) **
 in thread 7fc589500d80 thread_name:ceph-kvstore-to
ceph version 14.2.8 (2d095e947a02261ce61424021bb43bd3022d35cb) nautilus (stable)
 1: (()+0x12890) [0x7fc57f0be890]
2: (BlueFS::_read(BlueFS::FileReader*, BlueFS::FileReaderBuffer*, unsigned long, unsigned long, ceph::buffer::v14_2_0::list*, char*)+0x4e5) [0x559554cbe885]
 3: (BlueFS::_replay(bool, bool)+0x489) [0x559554cbf879]
 4: (BlueFS::mount()+0x219) [0x559554cd26a9]
 5: (BlueStore::_open_bluefs(bool)+0x41) [0x559554b2b521]
 6: (BlueStore::_open_db(bool, bool, bool)+0x88c) [0x559554b2c71c]
 7: (BlueStore::_open_db_and_around(bool)+0x44) [0x559554b433d4]
 8: (BlueStore::_mount(bool, bool)+0x584) [0x559554b93aa4]
 9: (StoreTool::load_bluestore(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool)+0x43) [0x559554af3fc3]
 10: (StoreTool::StoreTool(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, bool)+0x236) [0x559554af54e6]
 11: (main()+0x261) [0x559554acfb71]
 12: (__libc_start_main()+0xe7) [0x7fc57df91b97]
 13: (_start()+0x2a) [0x559554af3d5a]
2020-06-08 16:05:39.453 7fc589500d80 -1 *** Caught signal (Segmentation fault) **
 in thread 7fc589500d80 thread_name:ceph-kvstore-to

ceph version 14.2.8 (2d095e947a02261ce61424021bb43bd3022d35cb) nautilus (stable)
 1: (()+0x12890) [0x7fc57f0be890]
2: (BlueFS::_read(BlueFS::FileReader*, BlueFS::FileReaderBuffer*, unsigned long, unsigned long, ceph::buffer::v14_2_0::list*, char*)+0x4e5) [0x559554cbe885]
 3: (BlueFS::_replay(bool, bool)+0x489) [0x559554cbf879]
 4: (BlueFS::mount()+0x219) [0x559554cd26a9]
 5: (BlueStore::_open_bluefs(bool)+0x41) [0x559554b2b521]
 6: (BlueStore::_open_db(bool, bool, bool)+0x88c) [0x559554b2c71c]
 7: (BlueStore::_open_db_and_around(bool)+0x44) [0x559554b433d4]
 8: (BlueStore::_mount(bool, bool)+0x584) [0x559554b93aa4]
 9: (StoreTool::load_bluestore(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool)+0x43) [0x559554af3fc3]
 10: (StoreTool::StoreTool(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, bool)+0x236) [0x559554af54e6]
 11: (main()+0x261) [0x559554acfb71]
 12: (__libc_start_main()+0xe7) [0x7fc57df91b97]
 13: (_start()+0x2a) [0x559554af3d5a]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
   -43> 2020-06-08 16:05:39.385 7fc589500d80 5 asok(0x559555eae000) register_command assert hook 0x559555e0c1b0
   -42> 2020-06-08 16:05:39.385 7fc589500d80 5 asok(0x559555eae000) register_command abort hook 0x559555e0c1b0
   -41> 2020-06-08 16:05:39.385 7fc589500d80 5 asok(0x559555eae000) register_command perfcounters_dump hook 0x559555e0c1b0
   -40> 2020-06-08 16:05:39.385 7fc589500d80 5 asok(0x559555eae000) register_command 1 hook 0x559555e0c1b0
   -39> 2020-06-08 16:05:39.385 7fc589500d80 5 asok(0x559555eae000) register_command perf dump hook 0x559555e0c1b0
   -38> 2020-06-08 16:05:39.385 7fc589500d80 5 asok(0x559555eae000) register_command perfcounters_schema hook 0x559555e0c1b0
   -37> 2020-06-08 16:05:39.385 7fc589500d80 5 asok(0x559555eae000) register_command perf histogram dump hook 0x559555e0c1b0
   -36> 2020-06-08 16:05:39.385 7fc589500d80 5 asok(0x559555eae000) register_command 2 hook 0x559555e0c1b0
   -35> 2020-06-08 16:05:39.385 7fc589500d80 5 asok(0x559555eae000) register_command perf schema hook 0x559555e0c1b0
   -34> 2020-06-08 16:05:39.385 7fc589500d80 5 asok(0x559555eae000) register_command perf histogram schema hook 0x559555e0c1b0
   -33> 2020-06-08 16:05:39.385 7fc589500d80 5 asok(0x559555eae000) register_command perf reset hook 0x559555e0c1b0
   -32> 2020-06-08 16:05:39.385 7fc589500d80 5 asok(0x559555eae000) register_command config show hook 0x559555e0c1b0
   -31> 2020-06-08 16:05:39.385 7fc589500d80 5 asok(0x559555eae000) register_command config help hook 0x559555e0c1b0
   -30> 2020-06-08 16:05:39.385 7fc589500d80 5 asok(0x559555eae000) register_command config set hook 0x559555e0c1b0
   -29> 2020-06-08 16:05:39.385 7fc589500d80 5 asok(0x559555eae000) register_command config unset hook 0x559555e0c1b0
   -28> 2020-06-08 16:05:39.385 7fc589500d80 5 asok(0x559555eae000) register_command config get hook 0x559555e0c1b0
   -27> 2020-06-08 16:05:39.385 7fc589500d80 5 asok(0x559555eae000) register_command config diff hook 0x559555e0c1b0
   -26> 2020-06-08 16:05:39.385 7fc589500d80 5 asok(0x559555eae000) register_command config diff get hook 0x559555e0c1b0
   -25> 2020-06-08 16:05:39.385 7fc589500d80 5 asok(0x559555eae000) register_command log flush hook 0x559555e0c1b0
   -24> 2020-06-08 16:05:39.385 7fc589500d80 5 asok(0x559555eae000) register_command log dump hook 0x559555e0c1b0
   -23> 2020-06-08 16:05:39.385 7fc589500d80 5 asok(0x559555eae000) register_command log reopen hook 0x559555e0c1b0
   -22> 2020-06-08 16:05:39.385 7fc589500d80 5 asok(0x559555eae000) register_command dump_mempools hook 0x559556aae068
   -21> 2020-06-08 16:05:39.393 7fc589500d80 1 bluestore(/var/lib/ceph/osd/ceph-244) _mount path /var/lib/ceph/osd/ceph-244
   -20> 2020-06-08 16:05:39.393 7fc589500d80 1 bdev create path /var/lib/ceph/osd/ceph-244/block type kernel
   -19> 2020-06-08 16:05:39.393 7fc589500d80 1 bdev(0x559556ae8700 /var/lib/ceph/osd/ceph-244/block) open path /var/lib/ceph/osd/ceph-244/block
   -18> 2020-06-08 16:05:39.393 7fc589500d80 1 bdev(0x559556ae8700 /var/lib/ceph/osd/ceph-244/block) open size 4000783007744 (0x3a381400000, 3.6 TiB) block_size 4096 (4 KiB) rotational discard not supported
   -17> 2020-06-08 16:05:39.397 7fc589500d80 1 bluestore(/var/lib/ceph/osd/ceph-244) _set_cache_sizes cache_size 1073741824 meta 0.4 kv 0.4 data 0.2
   -16> 2020-06-08 16:05:39.397 7fc589500d80 5 asok(0x559555eae000) register_command bluestore bluefs available hook 0x559555e0c530
   -15> 2020-06-08 16:05:39.397 7fc589500d80 1 bdev create path /var/lib/ceph/osd/ceph-244/block.db type kernel
   -14> 2020-06-08 16:05:39.397 7fc589500d80 1 bdev(0x559556ae8e00 /var/lib/ceph/osd/ceph-244/block.db) open path /var/lib/ceph/osd/ceph-244/block.db
   -13> 2020-06-08 16:05:39.397 7fc589500d80 1 bdev(0x559556ae8e00 /var/lib/ceph/osd/ceph-244/block.db) open size 19855835136 (0x49f800000, 18 GiB) block_size 4096 (4 KiB) non-rotational discard supported
   -12> 2020-06-08 16:05:39.397 7fc589500d80 1 bluefs add_block_device bdev 1 path /var/lib/ceph/osd/ceph-244/block.db size 18 GiB
   -11> 2020-06-08 16:05:39.397 7fc589500d80 1 bdev create path /var/lib/ceph/osd/ceph-244/block type kernel
   -10> 2020-06-08 16:05:39.397 7fc589500d80 1 bdev(0x559556ae9180 /var/lib/ceph/osd/ceph-244/block) open path /var/lib/ceph/osd/ceph-244/block
    -9> 2020-06-08 16:05:39.397 7fc589500d80 1 bdev(0x559556ae9180 /var/lib/ceph/osd/ceph-244/block) open size 4000783007744 (0x3a381400000, 3.6 TiB) block_size 4096 (4 KiB) rotational discard not supported
    -8> 2020-06-08 16:05:39.397 7fc589500d80 1 bluefs add_block_device bdev 2 path /var/lib/ceph/osd/ceph-244/block size 3.6 TiB
    -7> 2020-06-08 16:05:39.397 7fc589500d80  1 bluefs mount
    -6> 2020-06-08 16:05:39.397 7fc589500d80 1 bluefs _init_alloc id 1 alloc_size 0x100000 size 0x49f800000
    -5> 2020-06-08 16:05:39.397 7fc589500d80 5 asok(0x559555eae000) register_command bluestore allocator dump bluefs-db hook 0x559555ef0cc0
    -4> 2020-06-08 16:05:39.397 7fc589500d80 5 asok(0x559555eae000) register_command bluestore allocator score bluefs-db hook 0x559555ef0cc0
    -3> 2020-06-08 16:05:39.397 7fc589500d80 1 bluefs _init_alloc id 2 alloc_size 0x10000 size 0x3a381400000
    -2> 2020-06-08 16:05:39.397 7fc589500d80 5 asok(0x559555eae000) register_command bluestore allocator dump bluefs-slow hook 0x559555ef0c90
    -1> 2020-06-08 16:05:39.397 7fc589500d80 5 asok(0x559555eae000) register_command bluestore allocator score bluefs-slow hook 0x559555ef0c90
[...]

On 08.06.20 15:48, Harald Staub wrote:
This is again about our bad cluster, with far too many objects. Now another OSD crashes immediately at startup:

/build/ceph-14.2.8/src/os/bluestore/KernelDevice.cc: 944: FAILED ceph_assert(is_valid_io(off, len))
  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x5601938e0e92]
  2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x5601938e106d]
  3: (KernelDevice::read(unsigned long, unsigned long, ceph::buffer::v14_2_0::list*, IOContext*, bool)+0x8e0) [0x560193f2ae90]
  4: (BlueFS::_read(BlueFS::FileReader*, BlueFS::FileReaderBuffer*, unsigned long, unsigned long, ceph::buffer::v14_2_0::list*, char*)+0x4f6) [0x560193ee0506]
  5: (BlueFS::_replay(bool, bool)+0x489) [0x560193ee14e9]
  6: (BlueFS::mount()+0x219) [0x560193ef4319]
  7: (BlueStore::_open_bluefs(bool)+0x41) [0x560193de2281]
  8: (BlueStore::_open_db(bool, bool, bool)+0x88c) [0x560193de347c]
  9: (BlueStore::_open_db_and_around(bool)+0x44) [0x560193dfa134]
  10: (BlueStore::_mount(bool, bool)+0x584) [0x560193e4a804]
  11: (OSD::init()+0x3b7) [0x56019398f957]
  12: (main()+0x3cdb) [0x5601938e85cb]
  13: (__libc_start_main()+0xe7) [0x7f54cdf1bb97]
  14: (_start()+0x2a) [0x56019391b08a]

2020-06-08 13:44:52.063 7f54d169ec00 -1 *** Caught signal (Aborted) **
  in thread 7f54d169ec00 thread_name:ceph-osd

 ceph version 14.2.8 (2d095e947a02261ce61424021bb43bd3022d35cb) nautilus (stable)
  1: (()+0x12890) [0x7f54cf286890]
  2: (gsignal()+0xc7) [0x7f54cdf38e97]
  3: (abort()+0x141) [0x7f54cdf3a801]
  4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a3) [0x5601938e0ee3]
  5: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x5601938e106d]
  6: (KernelDevice::read(unsigned long, unsigned long, ceph::buffer::v14_2_0::list*, IOContext*, bool)+0x8e0) [0x560193f2ae90]
  7: (BlueFS::_read(BlueFS::FileReader*, BlueFS::FileReaderBuffer*, unsigned long, unsigned long, ceph::buffer::v14_2_0::list*, char*)+0x4f6) [0x560193ee0506]
  8: (BlueFS::_replay(bool, bool)+0x489) [0x560193ee14e9]
  9: (BlueFS::mount()+0x219) [0x560193ef4319]
  10: (BlueStore::_open_bluefs(bool)+0x41) [0x560193de2281]
  11: (BlueStore::_open_db(bool, bool, bool)+0x88c) [0x560193de347c]
  12: (BlueStore::_open_db_and_around(bool)+0x44) [0x560193dfa134]
  13: (BlueStore::_mount(bool, bool)+0x584) [0x560193e4a804]
  14: (OSD::init()+0x3b7) [0x56019398f957]
  15: (main()+0x3cdb) [0x5601938e85cb]
  16: (__libc_start_main()+0xe7) [0x7f54cdf1bb97]
  17: (_start()+0x2a) [0x56019391b08a]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
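
For reference, the failed assertion is a sanity check on the offset and length of a block-device read. The sketch below paraphrases that check in Python. It assumes the Nautilus-era logic (aligned to the block size, non-empty, inside the device) and is not the Ceph source; the authoritative version should be in src/os/bluestore/BlockDevice.h. The block size and device size are taken from the ceph-244 bdev log lines in this thread, and the sample requests are made up for illustration.

```python
# Hedged paraphrase of the check behind ceph_assert(is_valid_io(off, len)).
# Assumption: mirrors the 14.2.x logic; this is NOT copied from Ceph.

BLOCK_SIZE = 4096              # "block_size 4096 (4 KiB)" in the bdev log lines
DEVICE_SIZE = 4000783007744    # size logged for ceph-244/block (0x3a381400000)


def is_valid_io(off, length, block_size=BLOCK_SIZE, size=DEVICE_SIZE):
    """True if [off, off + length) is aligned, non-empty and inside the device."""
    return (
        off % block_size == 0          # offset aligned to the block size
        and length % block_size == 0   # length aligned to the block size
        and length > 0                 # non-empty request
        and off < size                 # starts inside the device
        and off + length <= size       # ends inside the device
    )


# Made-up requests, purely for illustration:
print(is_valid_io(0, 4096))            # True: aligned and in range
print(is_valid_io(DEVICE_SIZE, 4096))  # False: starts at the end of the device
print(is_valid_io(8192, 100))          # False: length not block-aligned
```

If the real check has this shape, a failure inside BlueFS::_replay would mean the BlueFS log asked for a read with an unaligned, empty, or out-of-range extent. That would be consistent with damaged BlueFS metadata, but the backtrace alone does not prove it.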

There was a preceding assert (which I wrote about earlier):

/build/ceph-14.2.8/src/os/bluestore/BlueFS.cc: 2261: FAILED ceph_assert(h->file->fnode.ino != 1)

Any ideas that I could try to save this OSD?

Cheers
  Harry