Hello,

On my side, the logs at the point of the VM crash are below. At the moment my librbd debug level is 10; I will raise it to 20 for full debug output. These crashes are random and so far happen on very busy VMs. After downgrading the clients on the host to Nautilus, the crashes disappear. QEMU is not shutting down in general, because the other VMs on the same host keep working.
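For anyone who wants to reproduce the same level of logging: what I am using to raise librbd debugging is roughly the following in the [client] section of ceph.conf on the hypervisor (the log path is only an example and must be writable by the QEMU process):

    [client]
        # client-side debug logging for the QEMU/librbd client
        debug rbd = 20
        debug rados = 20
        log file = /var/log/ceph/qemu-rbd.$pid.log
        admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok

With the admin socket in place the level can also be bumped on an already running VM, e.g. "ceph daemon /var/run/ceph/<socket>.asok config set debug_rbd 20", without restarting QEMU.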
2020-05-07T13:02:12.121+0300 7f88d57fa700 10 librbd::io::ReadResult: 0x7f88c80bfbf0 finish: got {} for [0,24576] bl 24576
2020-05-07T13:02:12.193+0300 7f88d57fa700 10 librbd::io::ReadResult: 0x7f88c80f9330 finish: C_ObjectReadRequest: r=0
2020-05-07T13:02:12.193+0300 7f88d57fa700 10 librbd::io::ReadResult: 0x7f88c80f9330 finish: got {} for [0,16384] bl 16384
2020-05-07T13:02:28.694+0300 7f890ba90500 10 librbd::ImageState: 0x5569b5da9bb0 0x5569b5da9bb0 send_close_unlock
2020-05-07T13:02:28.694+0300 7f890ba90500 10 librbd::ImageState: 0x5569b5da9bb0 0x5569b5da9bb0 send_close_unlock
2020-05-07T13:02:28.694+0300 7f890ba90500 10 librbd::image::CloseRequest: 0x7f88c8175fd0 send_block_image_watcher
2020-05-07T13:02:28.694+0300 7f890ba90500 10 librbd::ImageWatcher: 0x7f88c400dfe0 block_notifies
2020-05-07T13:02:28.694+0300 7f890ba90500 5 librbd::Watcher: 0x7f88c400dfe0 block_notifies: blocked_count=1
2020-05-07T13:02:28.694+0300 7f890ba90500 10 librbd::image::CloseRequest: 0x7f88c8175fd0 handle_block_image_watcher: r=0
2020-05-07T13:02:28.694+0300 7f890ba90500 10 librbd::image::CloseRequest: 0x7f88c8175fd0 send_shut_down_update_watchers
2020-05-07T13:02:28.694+0300 7f88d4ff9700 10 librbd::image::CloseRequest: 0x7f88c8175fd0 handle_shut_down_update_watchers: r=0
2020-05-07T13:02:28.694+0300 7f88d4ff9700 10 librbd::image::CloseRequest: 0x7f88c8175fd0 send_shut_down_io_queue
2020-05-07T13:02:28.694+0300 7f88d4ff9700 5 librbd::io::ImageRequestWQ: 0x7f88e8001570 shut_down: shut_down: in_flight=0
2020-05-07T13:02:28.694+0300 7f88d4ff9700 10 librbd::image::CloseRequest: 0x7f88c8175fd0 handle_shut_down_io_queue: r=0
2020-05-07T13:02:28.694+0300 7f88d4ff9700 10 librbd::image::CloseRequest: 0x7f88c8175fd0 send_shut_down_exclusive_lock
2020-05-07T13:02:28.694+0300 7f88d4ff9700 10 librbd::ExclusiveLock: 0x7f88c4011ba0 shut_down
2020-05-07T13:02:28.694+0300 7f88d4ff9700 10 librbd::ManagedLock: 0x7f88c4011bb8 shut_down:
2020-05-07T13:02:28.694+0300 7f88d4ff9700 10 librbd::ManagedLock: 0x7f88c4011bb8 send_shutdown:
2020-05-07T13:02:28.694+0300 7f88d4ff9700 10 librbd::ManagedLock: 0x7f88c4011bb8 send_shutdown_release:
2020-05-07T13:02:28.694+0300 7f88d4ff9700 10 librbd::ExclusiveLock: 0x7f88c4011ba0 pre_release_lock_handler
2020-05-07T13:02:28.694+0300 7f88d4ff9700 10 librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020 send_cancel_op_requests:
2020-05-07T13:02:28.694+0300 7f88d4ff9700 10 librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020 handle_cancel_op_requests: r=0
2020-05-07T13:02:28.694+0300 7f88d4ff9700 10 librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020 send_block_writes:
2020-05-07T13:02:28.698+0300 7f88d4ff9700 5 librbd::io::ImageRequestWQ: 0x7f88e8001570 block_writes: 0x5569b5e1ffd0, num=1
2020-05-07T13:02:28.698+0300 7f88d4ff9700 10 librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020 handle_block_writes: r=0
2020-05-07T13:02:28.698+0300 7f88d4ff9700 10 librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020 send_wait_for_ops:
2020-05-07T13:02:28.698+0300 7f88d4ff9700 10 librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020 handle_wait_for_ops:
2020-05-07T13:02:28.698+0300 7f88d4ff9700 10 librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020 send_invalidate_cache:
2020-05-07T13:02:28.698+0300 7f88d4ff9700 5 librbd::io::ObjectDispatcher: 0x5569b5dab700 invalidate_cache:
2020-05-07T13:02:28.698+0300 7f88d4ff9700 10 librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020 handle_invalidate_cache: r=0
2020-05-07T13:02:28.698+0300 7f88d4ff9700 10 librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020 send_flush_notifies:
2020-05-07T13:02:28.698+0300 7f88d4ff9700 10 librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020 handle_flush_notifies:
2020-05-07T13:02:28.698+0300 7f88d4ff9700 10 librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020 send_close_object_map:
2020-05-07T13:02:28.698+0300 7f88d4ff9700 10 librbd::object_map::UnlockRequest: 0x7f88c807a450 send_unlock: oid=rbd_object_map.2f18f2a67fad72
2020-05-07T13:02:28.702+0300 7f88d57fa700 10 librbd::object_map::UnlockRequest: 0x7f88c807a450 handle_unlock: r=0
2020-05-07T13:02:28.702+0300 7f88d57fa700 10 librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020 handle_close_object_map: r=0
2020-05-07T13:02:28.702+0300 7f88d57fa700 10 librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020 send_unlock:
2020-05-07T13:02:28.702+0300 7f88d4ff9700 10 librbd::ManagedLock: 0x7f88c4011bb8 handle_shutdown_pre_release: r=0
2020-05-07T13:02:28.702+0300 7f88d4ff9700 10 librbd::managed_lock::ReleaseRequest: 0x7f88c80b68a0 send_unlock: entity=client.58292796, cookie=auto 140225447738256
2020-05-07T13:02:28.702+0300 7f88d57fa700 10 librbd::managed_lock::ReleaseRequest: 0x7f88c80b68a0 handle_unlock: r=0
2020-05-07T13:02:28.702+0300 7f88d4ff9700 10 librbd::ExclusiveLock: 0x7f88c4011ba0 post_release_lock_handler: r=0 shutting_down=1
2020-05-07T13:02:28.702+0300 7f88d4ff9700 5 librbd::io::ImageRequestWQ: 0x7f88e8001570 unblock_writes: 0x5569b5e1ffd0, num=0
2020-05-07T13:02:28.702+0300 7f88d4ff9700 10 librbd::ImageWatcher: 0x7f88c400dfe0 notify released lock
2020-05-07T13:02:28.702+0300 7f88d4ff9700 10 librbd::ImageWatcher: 0x7f88c400dfe0 current lock owner: [0,0]
2020-05-07T13:02:28.702+0300 7f88d4ff9700 10 librbd::ManagedLock: 0x7f88c4011bb8 handle_shutdown_post_release: r=0
2020-05-07T13:02:28.702+0300 7f88d4ff9700 10 librbd::ManagedLock: 0x7f88c4011bb8 wait_for_tracked_ops: r=0
2020-05-07T13:02:28.702+0300 7f88d4ff9700 10 librbd::ManagedLock: 0x7f88c4011bb8 complete_shutdown: r=0
2020-05-07T13:02:28.702+0300 7f88d4ff9700 10 librbd::image::CloseRequest: 0x7f88c8175fd0 handle_shut_down_exclusive_lock: r=0
2020-05-07T13:02:28.702+0300 7f88d4ff9700 10 librbd::image::CloseRequest: 0x7f88c8175fd0 send_unregister_image_watcher
2020-05-07T13:02:28.702+0300 7f88d4ff9700 10 librbd::ImageWatcher: 0x7f88c400dfe0 unregistering image watcher
2020-05-07T13:02:28.702+0300 7f88d4ff9700 10 librbd::Watcher: 0x7f88c400dfe0 unregister_watch:
2020-05-07T13:02:28.702+0300 7f88d57fa700 5 librbd::Watcher: 0x7f88c400dfe0 notifications_blocked: blocked=1
2020-05-07T13:02:28.706+0300 7f88ceffd700 10 librbd::image::CloseRequest: 0x7f88c8175fd0 handle_unregister_image_watcher: r=0
2020-05-07T13:02:28.706+0300 7f88ceffd700 10 librbd::image::CloseRequest: 0x7f88c8175fd0 send_flush_readahead
2020-05-07T13:02:28.706+0300 7f88d4ff9700 10 librbd::image::CloseRequest: 0x7f88c8175fd0 handle_flush_readahead: r=0
2020-05-07T13:02:28.706+0300 7f88d4ff9700 10 librbd::image::CloseRequest: 0x7f88c8175fd0 send_shut_down_object_dispatcher
2020-05-07T13:02:28.706+0300 7f88d4ff9700 5 librbd::io::ObjectDispatcher: 0x5569b5dab700 shut_down:
2020-05-07T13:02:28.706+0300 7f88d4ff9700 5 librbd::io::ObjectDispatch: 0x5569b5ee8360 shut_down:
2020-05-07T13:02:28.706+0300 7f88d4ff9700 5 librbd::io::SimpleSchedulerObjectDispatch: 0x7f88c4013ce0 shut_down:
2020-05-07T13:02:28.706+0300 7f88d4ff9700 5 librbd::cache::WriteAroundObjectDispatch: 0x7f88c8003780 shut_down:
2020-05-07T13:02:28.706+0300 7f88d4ff9700 10 librbd::image::CloseRequest: 0x7f88c8175fd0 handle_shut_down_object_dispatcher: r=0
2020-05-07T13:02:28.706+0300 7f88d4ff9700 10 librbd::image::CloseRequest: 0x7f88c8175fd0 send_flush_op_work_queue
2020-05-07T13:02:28.706+0300 7f88d4ff9700 10 librbd::image::CloseRequest: 0x7f88c8175fd0 handle_flush_op_work_queue: r=0
2020-05-07T13:02:28.706+0300 7f88d4ff9700 10 librbd::image::CloseRequest: 0x7f88c8175fd0 handle_flush_image_watcher: r=0
2020-05-07T13:02:28.706+0300 7f88d4ff9700 10 librbd::ImageState: 0x5569b5da9bb0 0x5569b5da9bb0 handle_close: r=0

On Fri, May 8, 2020 at 12:40 AM Brad Hubbard <bhubbard@xxxxxxxxxx> wrote:
> On Fri, May 8, 2020 at 3:42 AM Erwin Lubbers <erwin@xxxxxxxxxxx> wrote:
> >
> > Hi,
> >
> > Did anyone find a way to resolve the problem? I'm seeing the same on a clean Octopus Ceph installation on Ubuntu 18 with an Octopus compiled KVM server running on CentOS 7.8. The KVM machine shows:
> >
> > [ 7682.233684] fn-radosclient[6060]: segfault at 2b19 ip 00007f8165cc0a50 sp 00007f81397f6490 error 4 in librbd.so.1.12.0[7f8165ab4000+537000]
>
> Are you able to either capture a backtrace from a coredump or set up logging and hopefully capture a backtrace that way?
>
> > Ceph is healthy and stable for a few weeks and I did not get these messages while running on KVM compiled with Luminous libraries.
> >
> > Regards,
> > Erwin
>
> --
> Cheers,
> Brad
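Regarding the backtrace question above: if I manage to catch a core dump from the QEMU process, the rough plan is something like the following (binary path, core location and package names are only examples for a CentOS-style host):

    # with systemd-coredump in use:
    coredumpctl list qemu
    coredumpctl gdb <pid>
    (gdb) thread apply all bt

    # or directly against a core file:
    gdb /usr/libexec/qemu-kvm /path/to/core
    (gdb) thread apply all bt

    # librbd debug symbols make the trace readable, e.g.:
    debuginfo-install librbd1 ceph-common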