Only features enabled are layering and deep-flatten:

root@cephproxy01:~# rbd -p vms info c9c5db8e-7502-4acc-b670-af18bdf89886_disk
rbd image 'c9c5db8e-7502-4acc-b670-af18bdf89886_disk':
        size 20480 MB in 5120 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.f4e4a42ae8944a
        format: 2
        features: layering, deep-flatten
        flags:

I have debug logs. Should I open an RBD tracker ticket at http://tracker.ceph.com/projects/rbd/issues for this?

--
Eric

On 6/23/17, 6:58 AM, "Jason Dillaman" <jdillama@xxxxxxxxxx> wrote:

Yes, I'd say they aren't related. Since you can repeat this issue after a fresh VM boot, can you enable debug-level logging for said VM (add "debug rbd = 20" to your ceph.conf) and recreate the issue? Just to confirm, this VM doesn't have any features enabled besides (perhaps) layering?

On Fri, Jun 23, 2017 at 1:46 AM, Hall, Eric <eric.hall@xxxxxxxxxxxxxx> wrote:
> The problem seems to be reliably reproducible after a fresh reboot of the VM…
>
> With this knowledge, I can cause the hung IO condition while having noscrub and nodeepscrub set.
>
> Does this confirm this is not related to http://tracker.ceph.com/issues/20041 ?
>
> --
> Eric
>
> On 6/22/17, 11:23 AM, "Hall, Eric" <eric.hall@xxxxxxxxxxxxxx> wrote:
>
> After some testing (doing heavy IO on an rbd-based VM with hung_task_timeout_secs=1 while manually requesting deep-scrubs on the underlying pgs (as determined via rados ls->osdmaptool)), I don’t think scrubbing is the cause.
>
> At least, I can’t make it happen this way… although I can’t *always* make it happen whileeither. I will continue testing as above, but suggestions on improved test methodology are welcome.
>
> We occasionally see blocked requests in a running log (ceph -w > log), but not correlated with hung VM IO. Scrubbing doesn’t seem correlated either.
>
> --
> Eric
>
> On 6/21/17, 2:55 PM, "Jason Dillaman" <jdillama@xxxxxxxxxx> wrote:
>
> Do your VMs or OSDs show blocked requests?
> If you disable scrub or
> restart the blocked OSD, does the issue go away? If yes, it most
> likely is this issue [1].
>
> [1] http://tracker.ceph.com/issues/20041
>
> On Wed, Jun 21, 2017 at 3:33 PM, Hall, Eric <eric.hall@xxxxxxxxxxxxxx> wrote:
> > The VMs are using stock Ubuntu14/16 images so yes, there is the default “/sbin/fstrim --all” in /etc/cron.weekly/fstrim.
> >
> > --
> > Eric
> >
> > On 6/21/17, 1:58 PM, "Jason Dillaman" <jdillama@xxxxxxxxxx> wrote:
> >
> > Are some or many of your VMs issuing periodic fstrims to discard
> > unused extents?
> >
> > On Wed, Jun 21, 2017 at 2:36 PM, Hall, Eric <eric.hall@xxxxxxxxxxxxxx> wrote:
> > > After following/changing all suggested items (turning off exclusive-lock
> > > (and associated object-map and fast-diff), changing host cache behavior,
> > > etc.) this is still a blocking issue for many uses of our OpenStack/Ceph
> > > installation.
> > >
> > > We have upgraded Ceph to 10.2.7, are running 4.4.0-62 or later kernels on
> > > all storage, compute hosts, and VMs, with libvirt 1.3.1 on compute hosts.
> > > Have also learned quite a bit about producing debug logs. ;)
> > >
> > > I’ve followed the related threads since March with bated breath, but still
> > > find no resolution.
> > >
> > > Previously, I got pulled away before I could produce/report discussed debug
> > > info, but am back on the case now. Please let me know how I can help
> > > diagnose and resolve this problem.
> > >
> > > Any assistance appreciated,
> > >
> > > --
> > >
> > > Eric
> > >
> > > On 3/28/17, 3:05 AM, "Marius Vaitiekunas" <mariusvaitiekunas@xxxxxxxxx>
> > > wrote:
> > >
> > > On Mon, Mar 27, 2017 at 11:17 PM, Peter Maloney
> > > <peter.maloney@xxxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > I can't guarantee it's the same as my issue, but from that it sounds the
> > > same.
> > >
> > > Jewel 10.2.4, 10.2.5 tested
> > > hypervisors are proxmox qemu-kvm, using librbd
> > > 3 ceph nodes with mon+osd on each
> > >
> > > -faster journals, more disks, bcache, rbd_cache, fewer VMs on ceph, iops
> > > and bw limits on client side, jumbo frames, etc. all improve/smooth out
> > > performance and mitigate the hangs, but don't prevent them.
> > > -hangs are usually associated with blocked requests (I set the complaint
> > > time to 5s to see them)
> > > -hangs are very easily caused by rbd snapshot + rbd export-diff to do
> > > incremental backup (one snap persistent, plus one more during backup)
> > > -when qemu VM io hangs, I have to kill -9 the qemu process for it to
> > > stop. Some broken VMs don't appear to be hung until I try to live
> > > migrate them (live migrating all VMs helped test solutions)
> > >
> > > Finally I have a workaround... disable exclusive-lock, object-map, and
> > > fast-diff rbd features (and restart clients via live migrate).
> > > (object-map and fast-diff appear to have no effect on diff or export-diff
> > > ... so I don't miss them). I'll file a bug at some point (after I move
> > > all VMs back and see if it is still stable). And one other user on IRC
> > > said this solved the same problem (also using rbd snapshots).
> > >
> > > And strangely, they don't seem to hang if I put back those features,
> > > until a few days later (making testing much less easy...but now I'm very
> > > sure removing them prevents the issue)
> > >
> > > I hope this works for you (and maybe gets some attention from devs too),
> > > so you don't waste months like me.
> > >
> > > On 03/27/17 19:31, Hall, Eric wrote:
> > >> In an OpenStack (mitaka) cloud, backed by a ceph cluster (10.2.6 jewel),
> > >> using libvirt/qemu (1.3.1/2.5) hypervisors on Ubuntu 14.04.5 compute and
> > >> ceph hosts, we occasionally see hung processes (usually during boot, but
> > >> otherwise as well), with errors reported in the instance logs as shown
> > >> below. Configuration is vanilla, based on openstack/ceph docs.
> > >>
> > >> Neither the compute hosts nor the ceph hosts appear to be overloaded in
> > >> terms of memory or network bandwidth, none of the 67 osds are over 80% full,
> > >> nor do any of them appear to be overwhelmed in terms of IO. Compute hosts
> > >> and ceph cluster are connected via a relatively quiet 1Gb network, with an
> > >> IBoE net between the ceph nodes. Neither network appears overloaded.
> > >>
> > >> I don’t see any related (to my eye) errors in client or server logs, even
> > >> with 20/20 logging from various components (rbd, rados, client,
> > >> objectcacher, etc.) I’ve increased the qemu file descriptor limit
> > >> (currently 64k... overkill for sure.)
> > >>
> > >> It “feels” like a performance problem, but I can’t find any capacity issues
> > >> or constraining bottlenecks.
> > >>
> > >> Any suggestions or insights into this situation are appreciated. Thank
> > >> you for your time,
> > >> --
> > >> Eric
> > >>
> > >>
> > >> [Fri Mar 24 20:30:40 2017] INFO: task jbd2/vda1-8:226 blocked for more
> > >> than 120 seconds.
> > >> [Fri Mar 24 20:30:40 2017] Not tainted 3.13.0-52-generic #85-Ubuntu
> > >> [Fri Mar 24 20:30:40 2017] "echo 0 >
> > >> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > >> [Fri Mar 24 20:30:40 2017] jbd2/vda1-8 D ffff88043fd13180 0 226
> > >> 2 0x00000000
> > >> [Fri Mar 24 20:30:40 2017] ffff88003728bbd8 0000000000000046
> > >> ffff880426900000 ffff88003728bfd8
> > >> [Fri Mar 24 20:30:40 2017] 0000000000013180 0000000000013180
> > >> ffff880426900000 ffff88043fd13a18
> > >> [Fri Mar 24 20:30:40 2017] ffff88043ffb9478 0000000000000002
> > >> ffffffff811ef7c0 ffff88003728bc50
> > >> [Fri Mar 24 20:30:40 2017] Call Trace:
> > >> [Fri Mar 24 20:30:40 2017] [<ffffffff811ef7c0>] ?
> > >> generic_block_bmap+0x50/0x50
> > >> [Fri Mar 24 20:30:40 2017] [<ffffffff81726d2d>] io_schedule+0x9d/0x140
> > >> [Fri Mar 24 20:30:40 2017] [<ffffffff811ef7ce>] sleep_on_buffer+0xe/0x20
> > >> [Fri Mar 24 20:30:40 2017] [<ffffffff817271b2>] __wait_on_bit+0x62/0x90
> > >> [Fri Mar 24 20:30:40 2017] [<ffffffff811ef7c0>] ?
> > >> generic_block_bmap+0x50/0x50
> > >> [Fri Mar 24 20:30:40 2017] [<ffffffff81727257>]
> > >> out_of_line_wait_on_bit+0x77/0x90
> > >> [Fri Mar 24 20:30:40 2017] [<ffffffff810ab180>] ?
> > >> autoremove_wake_function+0x40/0x40
> > >> [Fri Mar 24 20:30:40 2017] [<ffffffff811f0afa>]
> > >> __wait_on_buffer+0x2a/0x30
> > >> [Fri Mar 24 20:30:40 2017] [<ffffffff8128bb4d>]
> > >> jbd2_journal_commit_transaction+0x185d/0x1ab0
> > >> [Fri Mar 24 20:30:40 2017] [<ffffffff810755df>] ?
> > >> try_to_del_timer_sync+0x4f/0x70
> > >> [Fri Mar 24 20:30:40 2017] [<ffffffff8128fe7d>] kjournald2+0xbd/0x250
> > >> [Fri Mar 24 20:30:40 2017] [<ffffffff810ab140>] ?
> > >> prepare_to_wait_event+0x100/0x100
> > >> [Fri Mar 24 20:30:40 2017] [<ffffffff8128fdc0>] ?
> > >> commit_timeout+0x10/0x10
> > >> [Fri Mar 24 20:30:40 2017] [<ffffffff8108b5d2>] kthread+0xd2/0xf0
> > >> [Fri Mar 24 20:30:40 2017] [<ffffffff8108b500>] ?
> > >> kthread_create_on_node+0x1c0/0x1c0
> > >> [Fri Mar 24 20:30:40 2017] [<ffffffff8173304c>] ret_from_fork+0x7c/0xb0
> > >> [Fri Mar 24 20:30:40 2017] [<ffffffff8108b500>] ?
> > >> kthread_create_on_node+0x1c0/0x1c0
> > >>
> > >> _______________________________________________
> > >> ceph-users mailing list
> > >> ceph-users@xxxxxxxxxxxxxx
> > >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > >
> > > Hi,
> > >
> > > We are using these settings on hypervisors in openstack:
> > >
> > > vm.dirty_ratio = 40
> > > vm.dirty_background_ratio = 5
> > >
> > > And these on vms:
> > >
> > > vm.dirty_ratio = 10
> > > vm.dirty_background_ratio = 5
> > >
> > > In our case it prevents vms from crashing.
> > >
> > > --
> > > Marius Vaitiekūnas
> >
> > --
> > Jason
>
> --
> Jason

--
Jason
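
For anyone checking their own images against the workaround discussed in this thread (disabling exclusive-lock, object-map and fast-diff), here is a minimal sketch that reads the text printed by `rbd info` and reports which of those three features are still enabled. The function names and the SUSPECT_FEATURES set are hypothetical and exist only for this illustration; the sample text is the `rbd info` output quoted at the top of the thread. It only inspects text, it does not talk to a cluster, and it is not a diagnosis of the hang.

```python
# Features that the workaround described in this thread turns off.
# This set is an assumption drawn from the thread, not an official list.
SUSPECT_FEATURES = {"exclusive-lock", "object-map", "fast-diff"}


def parse_features(rbd_info_text):
    """Return the feature names found on the 'features:' line of `rbd info` text."""
    for line in rbd_info_text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key == "features":
            return {name.strip() for name in value.split(",") if name.strip()}
    return set()


def suspect_features_enabled(rbd_info_text):
    """Return a sorted list of the suspect features that are still enabled."""
    return sorted(parse_features(rbd_info_text) & SUSPECT_FEATURES)


# Output of `rbd -p vms info ...` as posted at the top of this thread.
SAMPLE = """rbd image 'c9c5db8e-7502-4acc-b670-af18bdf89886_disk':
        size 20480 MB in 5120 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.f4e4a42ae8944a
        format: 2
        features: layering, deep-flatten
        flags:
"""

if __name__ == "__main__":
    enabled = suspect_features_enabled(SAMPLE)
    if enabled:
        print("still enabled:", ", ".join(enabled))
    else:
        print("none of the suspect features are enabled")
```

For the image in this thread the check finds none of the three features, which matches Eric's statement that only layering and deep-flatten are enabled while the hang still reproduces.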