Re: [RFC PATCH] xfs: Fix "use after free" of intent items

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jan 11, 2018 at 06:22:23PM +0530, Chandan Rajendra wrote:
> generic/388 can cause the following "use after free" error to occur,
> 
>  =============================================================================
>  BUG xfs_efi_item (Not tainted): Poison overwritten
>  -----------------------------------------------------------------------------
> 
>  Disabling lock debugging due to kernel taint
>  INFO: 0x00000000292c4bd4-0x00000000292c4bd4. First byte 0x6a instead of 0x6b
>  INFO: Allocated in .kmem_zone_alloc+0xcc/0x190 age=79 cpu=0 pid=12436
>  	.__slab_alloc+0x54/0x80
>  	.kmem_cache_alloc+0x124/0x350
>  	.kmem_zone_alloc+0xcc/0x190
>  	.xfs_efi_init+0x48/0xf0
>  	.xfs_extent_free_create_intent+0x40/0x130
>  	.xfs_defer_intake_work+0x74/0x1e0
>  	.xfs_defer_finish+0xac/0x5c0
>  	.xfs_itruncate_extents+0x170/0x590
>  	.xfs_inactive_truncate+0xcc/0x170
>  	.xfs_inactive+0x1d8/0x2f0
>  	.xfs_fs_destroy_inode+0xe4/0x3d0
>  	.destroy_inode+0x68/0xb0
>  	.do_unlinkat+0x1e8/0x390
>  	system_call+0x58/0x6c
>  INFO: Freed in .xfs_efi_item_free+0x44/0x80 age=79 cpu=0 pid=12436
>  	.kmem_cache_free+0x120/0x2b0
>  	.xfs_efi_item_free+0x44/0x80
>  	.xfs_trans_free_items+0xd4/0x130
>  	.__xfs_trans_commit+0xd0/0x350
>  	.xfs_trans_roll+0x4c/0x90
>  	.xfs_defer_trans_roll+0xa4/0x2b0
>  	.xfs_defer_finish+0xb8/0x5c0
>  	.xfs_itruncate_extents+0x170/0x590
>  	.xfs_inactive_truncate+0xcc/0x170
>  	.xfs_inactive+0x1d8/0x2f0
>  	.xfs_fs_destroy_inode+0xe4/0x3d0
>  	.destroy_inode+0x68/0xb0
>  	.do_unlinkat+0x1e8/0x390
>  	system_call+0x58/0x6c
> 
> This happens due to the following interaction,
> 1. xfs_defer_finish() creates "extent free" intent item and adds it to the
>    list at xfs_trans->t_items.
> 2. xfs_defer_trans_roll() invokes __xfs_trans_commit(). Here, if the
>    XFS_MOUNT_FS_SHUTDOWN flag is set, we invoke io_unlock() operation of the
>    corresponding log item. For "extent free" log items xfs_efi_item_unlock()
>    gets invoked which frees the xfs_efi_log_item.
> 3. xfs_defer_trans_roll() then invokes xfs_defer_trans_abort(). Since the
>    xfs_defer_pending->dfp_intent is still set to the "extent free" intent
>    item, we invoke xfs_extent_free_abort_intent(). This accesses the
>    previously freed xfs_efi_log_item to decrement the ref count.
> 
> This commit fixes the bug by changing xfs_efi_item_unlock() to invoke
> xfs_efi_release() instead of xfs_efi_item_free(). The xfs_efi_log_item gets
> freed when xfs_extent_free_abort_intent() invokes xfs_efi_release() once again
> i.e. its refcount becomes zero and hence xfs_efi_item_free() is invoked.
> 

Interesting.. trying to refamiliarize myself with the expected reference
counting rules here, I see that this goes back to the old
xfs_bmap_finish() mechanism. That guy would grab an EFI, log it and then
attempt to roll/commit the transaction. IIRC, the direct release covers
the case where that tx commit fails. That is essentially equivalent to a
tx cancel in this example, so the tx is freed and xfs_bmap_finish() is
done (returns with an error). The ->iop_unlock() frees the EFI in this
case explicitly because 1.) the refcount is initialized to 2, one for
the log and one for the EFD and 2.) the EFD is never created in this
situation.

Given that, I suspect this code was correct with the bmap_finish() code
and became a problem sometime later. Have you done any investigation to
confirm/deny whether this is a regression since v4.7 or so?

Moving along to the current code... xfs_defer_finish() creates the EFI,
logs it, etc. and moves on to xfs_trans_roll(). If the tx commit fails,
we're in a similar abort/cancel situation as before. The deferred ops code
tracks the EFI within the dfops data structures, however, so it calls
into the ->abort_intent() callout to drop the other reference and we hit
this problem.

AFAICT, the abort handler covers the only situation where an intent
would be left hanging around with an extra reference for the done item
but without the done item in existence yet. Given that, I think it's
safe to change the intent log state machine handling as this patch does.
The caveat is that with this change it is now required that the
committer of a failed transaction is always responsible for releasing
the reference on intent items for the associated done items that are not
going to be created (where it wasn't required before). IOW, I suspect we
could just as safely remove the abort handling code to resolve the
problem.  Since dfops seems to handle the situation correctly, I don't
really feel strongly either way.

I do think that this patch should at least:

- document the historical context in the commit log, provided the above
  can be verified (i.e., why/since when is this a problem?)
- update the comments in the intent item unlock handlers and probably
  the dfops code to make sure the reference counting rules are clear
- include the same updates for other applicable items in the same patch

... and finally, this should be tested against a frequent shutdown
workload if not done so already. E.g., run 'while [ true ]; do fsstress;
umount; mount; done' in one thread and 'while [ true ]; do sleep N;
xfs_io -xc "shutdown -f" ...; done' in another and verify we don't
explode/hang.

Brian

> Reported-by: Christoph Hellwig <hch@xxxxxx>
> Signed-off-by: Chandan Rajendra <chandan@xxxxxxxxxxxxxxxxxx>
> ---
> 
> The above mentioned "use after free" occurs with other log items.
> (e.g. xfs_rui_item). If the below fix is found to be correct, I will
> apply it to other affected log items as well and post a new patch.
> 
>  fs/xfs/xfs_extfree_item.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
> index 64da906..2119441 100644
> --- a/fs/xfs/xfs_extfree_item.c
> +++ b/fs/xfs/xfs_extfree_item.c
> @@ -151,7 +151,7 @@ xfs_efi_item_unlock(
>  	struct xfs_log_item	*lip)
>  {
>  	if (lip->li_flags & XFS_LI_ABORTED)
> -		xfs_efi_item_free(EFI_ITEM(lip));
> +		xfs_efi_release(EFI_ITEM(lip));
>  }
>  
>  /*
> -- 
> 2.9.5
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux