2011/7/13 Josef Bacik <josef@xxxxxxxxxx>:
> On 07/12/2011 11:20 AM, Christian Brunner wrote:
>> 2011/6/7 Josef Bacik <josef@xxxxxxxxxx>:
>>> On 06/06/2011 09:39 PM, Miao Xie wrote:
>>>> On Fri, 03 Jun 2011 14:46:10 -0400, Josef Bacik wrote:
>>>>> I got a lot of these when running stress.sh on my test box
>>>>>
>>>>>
>>>>>
>>>>> This is because use_block_rsv() is having to do a
>>>>> reserve_metadata_bytes(), which shouldn't happen, as we should have
>>>>> reserved enough space for those operations to complete. This happens
>>>>> because use_block_rsv() calls get_block_rsv(), which, if
>>>>> root->ref_cows is set (the case on all fs roots), returns
>>>>> trans->block_rsv, and that only holds what the current transaction
>>>>> starter had reserved.
>>>>>
>>>>> What needs to be done instead is to have a dedicated block reserve:
>>>>> any reservation done at create time for these inodes is migrated
>>>>> into that reserve, and then when you run the delayed inode items
>>>>> you set trans->block_rsv to that reserve so the accounting is all
>>>>> done properly.
>>>>>
>>>>> This is just off the top of my head; there may be a better way to
>>>>> do it, as I've not actually looked at the delayed inode code at all.
>>>>>
>>>>> I would do this myself but I have an ever-increasing list of shit
>>>>> to do, so will somebody pick this up and fix it please? Thanks,
>>>>
>>>> Sorry, it's my miss.
>>>> I forgot to set trans->block_rsv to global_block_rsv, since we have
>>>> migrated the space from trans_block_rsv to global_block_rsv.
>>>>
>>>> I'll fix it soon.
>>>>
>>>
>>> There is another problem: we're failing xfstest 204. I tried making
>>> reserve_metadata_bytes commit the transaction regardless of whether
>>> or not there were pinned bytes, but the test just hung there. Usually
>>> it takes 7 seconds to run, and I ctrl+c'ed it after a couple of
>>> minutes. 204 just creates a crap ton of files, which is what is
>>> killing us.
>>> There needs to be a way to start flushing delayed inode items so we
>>> can reclaim the space they are holding onto, so we don't get ENOSPC,
>>> and it needs to be better than just committing the transaction,
>>> because that is dog slow. Thanks,
>>>
>>> Josef
>>
>> Is there a solution for this?
>>
>> I'm running a 2.6.38.8 kernel with all the btrfs patches from 3.0rc7
>> (except the plugging). When starting a ceph rebuild on the btrfs
>> volumes I get a lot of warnings from block_rsv_use_bytes in
>> use_block_rsv:
>>
>
> Ok, I think I've got this nailed down. Will you run with this patch and
> make sure the warnings go away? Thanks,

I'm sorry, I'm still getting a lot of warnings like the one below. I've
also noticed that I'm not getting these messages when the
free_space_cache is disabled.

Christian

[  697.398097] ------------[ cut here ]------------
[  697.398109] WARNING: at fs/btrfs/extent-tree.c:5693 btrfs_alloc_free_block+0x1f8/0x360 [btrfs]()
[  697.398111] Hardware name: ProLiant DL180 G6
[  697.398112] Modules linked in: btrfs zlib_deflate libcrc32c bonding ipv6 serio_raw pcspkr ghes hed iTCO_wdt iTCO_vendor_support i7core_edac edac_core ixgbe dca mdio iomemory_vsl(P) hpsa squashfs usb_storage [last unloaded: scsi_wait_scan]
[  697.398122] Pid: 6591, comm: btrfs-freespace Tainted: P        W   3.0.0-1.fits.1.el6.x86_64 #1
[  697.398124] Call Trace:
[  697.398128]  [<ffffffff810630af>] warn_slowpath_common+0x7f/0xc0
[  697.398131]  [<ffffffff8106310a>] warn_slowpath_null+0x1a/0x20
[  697.398142]  [<ffffffffa022cb88>] btrfs_alloc_free_block+0x1f8/0x360 [btrfs]
[  697.398156]  [<ffffffffa025ae08>] ? read_extent_buffer+0xd8/0x1d0 [btrfs]
[  697.398316]  [<ffffffffa021d112>] split_leaf+0x142/0x8c0 [btrfs]
[  697.398325]  [<ffffffffa021629b>] ? generic_bin_search+0x19b/0x210 [btrfs]
[  697.398334]  [<ffffffffa0218a1a>] ? btrfs_leaf_free_space+0x8a/0xe0 [btrfs]
[  697.398344]  [<ffffffffa021df63>] btrfs_search_slot+0x6d3/0x7a0 [btrfs]
[  697.398355]  [<ffffffffa0230942>] btrfs_csum_file_blocks+0x632/0x830 [btrfs]
[  697.398369]  [<ffffffffa025c03a>] ? clear_extent_bit+0x17a/0x440 [btrfs]
[  697.398382]  [<ffffffffa023c009>] add_pending_csums+0x49/0x70 [btrfs]
[  697.398395]  [<ffffffffa023ef5d>] btrfs_finish_ordered_io+0x22d/0x360 [btrfs]
[  697.398408]  [<ffffffffa023f0dc>] btrfs_writepage_end_io_hook+0x4c/0xa0 [btrfs]
[  697.398422]  [<ffffffffa025c4fb>] end_bio_extent_writepage+0x13b/0x180 [btrfs]
[  697.398425]  [<ffffffff81558b5b>] ? schedule_timeout+0x17b/0x2e0
[  697.398436]  [<ffffffffa02336d9>] ? end_workqueue_fn+0xe9/0x130 [btrfs]
[  697.398439]  [<ffffffff8118f24d>] bio_endio+0x1d/0x40
[  697.398451]  [<ffffffffa02336e4>] end_workqueue_fn+0xf4/0x130 [btrfs]
[  697.398464]  [<ffffffffa02671de>] worker_loop+0x13e/0x540 [btrfs]
[  697.398477]  [<ffffffffa02670a0>] ? btrfs_queue_worker+0x2d0/0x2d0 [btrfs]
[  697.398490]  [<ffffffffa02670a0>] ? btrfs_queue_worker+0x2d0/0x2d0 [btrfs]
[  697.398493]  [<ffffffff81085896>] kthread+0x96/0xa0
[  697.398496]  [<ffffffff81563844>] kernel_thread_helper+0x4/0x10
[  697.398499]  [<ffffffff81085800>] ? kthread_worker_fn+0x1a0/0x1a0
[  697.398502]  [<ffffffff81563840>] ? gs_change+0x13/0x13
[  697.398503] ---[ end trace 8c77269b0de3f0fb ]---
[  697.432225] ------------[ cut here ]------------
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html