On Wed, 20 Jun 2012, Mandell Degerness wrote:
> The prior thread seems to refer to something fixed in 3.0.X, we are
> running 3.2.18.  Also, in answer to the previous question, we see the
> error on systems running at 82% full and systems running at 5% full
> disks.
>
> Anyone have any ideas about how to resolve the deadlock?  Do we have
> to configure mysql differently?

I'm not sure that this is related to the previous problems.  I would
start from square one to diagnose:

 - When you see the hang, are there blocked osd I/O operations?

     cat /sys/kernel/debug/ceph/*/osdc

 - Maybe this is a memory deadlock?  I'm guessing not (if you can log
   into the system), but it's worth checking.  I think you should see
   another blocked task in that case.

 - This might be an XFS thing unrelated to the fact that we're running
   on RBD.  If it's reproducible, can you try on ext4 instead?

sage

>
> -Mandell
>
> On Mon, Jun 18, 2012 at 4:34 PM, Dan Mick <dan.mick@xxxxxxxxxxx> wrote:
> > I don't know enough to know if there's a connection, but I do note this
> > prior thread that sounds kinda similar:
> >
> > http://thread.gmane.org/gmane.comp.file-systems.ceph.devel/6574
> >
> > On 06/18/2012 04:08 PM, Mandell Degerness wrote:
> >> None of the OSDs seem to be more than 82% full.  I didn't think we were
> >> running quite that close to the margin, but it is still far from
> >> actually full.
> >>
> >> On Mon, Jun 18, 2012 at 3:57 PM, Dan Mick <dan.mick@xxxxxxxxxxx> wrote:
> >>> Does the xfs on the OSD have plenty of free space left, or could this
> >>> be an allocation deadlock?
> >>>
> >>> On 06/18/2012 03:17 PM, Mandell Degerness wrote:
> >>>> Here is, perhaps, a more useful traceback from a different run of
> >>>> tests that we just ran into:
> >>>>
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.680815] INFO: task flush-254:0:29582 blocked for more than 120 seconds.
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.681040] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.681458] flush-254:0   D ffff880bd9ca2fc0     0 29582      2 0x00000000
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.681740]  ffff88006e51d160 0000000000000046 0000000000000002 ffff88061b362040
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.682173]  ffff88006e51d160 00000000000120c0 00000000000120c0 00000000000120c0
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.682659]  ffff88006e51dfd8 00000000000120c0 00000000000120c0 ffff88006e51dfd8
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.683088] Call Trace:
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.683302]  [<ffffffff81520132>] schedule+0x5a/0x5c
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.683514]  [<ffffffff815203e7>] schedule_timeout+0x36/0xe3
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.683784]  [<ffffffff8101e0b2>] ? physflat_send_IPI_mask+0xe/0x10
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.683999]  [<ffffffff8101a237>] ? native_smp_send_reschedule+0x46/0x48
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.684219]  [<ffffffff811e0071>] ? list_move_tail+0x27/0x2c
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.684432]  [<ffffffff81520d13>] __down_common+0x90/0xd4
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.684708]  [<ffffffff811e1120>] ? _xfs_buf_find+0x17f/0x210
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.684925]  [<ffffffff81520dca>] __down+0x1d/0x1f
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.685139]  [<ffffffff8105db4e>] down+0x2d/0x3d
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.685350]  [<ffffffff811e0f68>] xfs_buf_lock+0x76/0xaf
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.685565]  [<ffffffff811e1120>] _xfs_buf_find+0x17f/0x210
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.685836]  [<ffffffff811e13b6>] xfs_buf_get+0x2a/0x177
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.686052]  [<ffffffff811e19f6>] xfs_buf_read+0x1f/0xca
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.686270]  [<ffffffff8122a0b7>] xfs_trans_read_buf+0x205/0x308
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.686490]  [<ffffffff81205e01>] xfs_btree_read_buf_block.clone.22+0x4f/0xa7
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.687015]  [<ffffffff8122a3ee>] ? xfs_trans_log_buf+0xb2/0xc1
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.687232]  [<ffffffff81205edd>] xfs_btree_lookup_get_block+0x84/0xac
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.687449]  [<ffffffff81208e83>] xfs_btree_lookup+0x12b/0x3dc
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.687721]  [<ffffffff811f6bb2>] ? xfs_alloc_vextent+0x447/0x469
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.687939]  [<ffffffff811fd171>] xfs_bmbt_lookup_eq+0x1f/0x21
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.688156]  [<ffffffff811ffa88>] xfs_bmap_add_extent_delay_real+0x5b5/0xfec
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.688378]  [<ffffffff810f155b>] ? kmem_cache_alloc+0x87/0xf3
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.688650]  [<ffffffff81204c40>] ? xfs_bmbt_init_cursor+0x3f/0x107
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.688867]  [<ffffffff81201160>] xfs_bmapi_allocate+0x1f6/0x23a
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.689084]  [<ffffffff812185bd>] ? xfs_iext_bno_to_irec+0x95/0xb9
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.689301]  [<ffffffff81203414>] xfs_bmapi_write+0x32d/0x5a2
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.689519]  [<ffffffff811e99e4>] xfs_iomap_write_allocate+0x1a5/0x29f
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.689797]  [<ffffffff811df12a>] xfs_map_blocks+0x13e/0x1dd
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.690016]  [<ffffffff811dfbff>] xfs_vm_writepage+0x24e/0x410
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.690233]  [<ffffffff810bde1e>] __writepage+0x17/0x30
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.690446]  [<ffffffff810be6ed>] write_cache_pages+0x276/0x3c8
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.690693]  [<ffffffff810bde07>] ? set_page_dirty+0x60/0x60
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.690908]  [<ffffffff810be884>] generic_writepages+0x45/0x5c
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.691123]  [<ffffffff811defcb>] xfs_vm_writepages+0x4d/0x54
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.691337]  [<ffffffff810bf832>] do_writepages+0x21/0x2a
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.691552]  [<ffffffff811218f5>] writeback_single_inode+0x12a/0x2cc
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.691800]  [<ffffffff81121d92>] writeback_sb_inodes+0x174/0x215
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.692016]  [<ffffffff81122185>] __writeback_inodes_wb+0x78/0xb9
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.692231]  [<ffffffff811224b5>] wb_writeback+0x136/0x22a
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.692444]  [<ffffffff810becd1>] ? determine_dirtyable_memory+0x1d/0x26
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.692692]  [<ffffffff81122d1e>] wb_do_writeback+0x19c/0x1b7
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.692907]  [<ffffffff81122dc5>] bdi_writeback_thread+0x8c/0x20f
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.693122]  [<ffffffff81122d39>] ? wb_do_writeback+0x1b7/0x1b7
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.693336]  [<ffffffff81122d39>] ? wb_do_writeback+0x1b7/0x1b7
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.693553]  [<ffffffff8105911d>] kthread+0x82/0x8a
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.693803]  [<ffffffff81523c34>] kernel_thread_helper+0x4/0x10
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.694018]  [<ffffffff8105909b>] ? kthread_worker_fn+0x13b/0x13b
> >>>> Jun 18 17:58:51 node-172-29-0-15 kernel: [242522.694232]  [<ffffffff81523c30>] ? gs_change+0xb/0xb
> >>>>
> >>>> On Mon, Jun 18, 2012 at 11:37 AM, Mandell Degerness
> >>>> <mandell@xxxxxxxxxxxxxxx> wrote:
> >>>>> We've been seeing random issues of apparent deadlocks.  We are running
> >>>>> ceph 0.47 on kernel 3.2.18.  OSDs are running on XFS file system.
> >>>>> mysqld (which ran into the particular problems in the attached kernel
> >>>>> log) is running on an RBD with XFS (mounted on a system which includes
> >>>>> OSDs).  We have sync_fs, and gcc ver 4.5.3-r2.  The mysqld process in
> >>>>> both instances returned an error to the calling process.
> >>>>>
> >>>>> Regards,
> >>>>> Mandell Degerness
> >>>>
> >>>> --
> >>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
> >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
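
For reference, a minimal sketch of the first two checks Sage suggests
above.  It assumes debugfs is mounted at /sys/kernel/debug, that the
kernel rbd/ceph client has registered a per-client directory under
/sys/kernel/debug/ceph/, and that sysrq is enabled; exact paths and
output formats can vary by kernel version.

    # 1. Look for in-flight requests stuck in the osd client.
    for d in /sys/kernel/debug/ceph/*/; do
        echo "== $d =="
        cat "${d}osdc"        # long-lived entries here suggest blocked osd I/O
    done

    # 2. Look for other blocked (D-state) tasks, e.g. a memory deadlock.
    dmesg | grep -i 'blocked for more than'
    echo w > /proc/sysrq-trigger   # dump all uninterruptible tasks to the kernel log
    dmesg | tail -n 100

If osdc shows requests that never complete, the hang is likely on the
RBD/OSD side; if only XFS tasks are blocked, retesting on ext4 (as
suggested above) should help narrow it down.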