I tried changing the locking in:
File: xfs_sync.c
Function: int xfs_quiesce_data(struct xfs_mount *mp)

	/* write superblock and hoover up shutdown errors */
-	error = xfs_sync_fsdata(mp, SYNC_WAIT);
+	error = xfs_sync_fsdata(mp, SYNC_TRYLOCK);

I made this change purely out of curiosity. I am trying to reproduce the hang with it applied, but have not observed one over many iterations.
I am also looking into possible side effects of this change. Please let me know if you are aware of any.
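
To make the trade-off behind that change concrete, here is a minimal user-space sketch of a "try the lock, otherwise skip" sync step. It is only an analogy: toy_buf, toy_buf_trylock() and toy_sync_trylock() are hypothetical names invented for illustration and are not XFS code, and the lock is a C11 atomic_flag rather than the kernel semaphore seen in the trace.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a locked buffer; NOT the kernel's xfs_buf_t. */
struct toy_buf {
    atomic_flag locked;
};

static void toy_buf_init(struct toy_buf *bp)
{
    assert(bp != NULL);
    atomic_flag_clear(&bp->locked);
}

/* Non-blocking acquire: returns true only if the lock was free. */
static bool toy_buf_trylock(struct toy_buf *bp)
{
    return !atomic_flag_test_and_set(&bp->locked);
}

static void toy_buf_unlock(struct toy_buf *bp)
{
    atomic_flag_clear(&bp->locked);
}

/*
 * Sync step in "trylock" style: if someone else still holds the buffer,
 * give up with -EAGAIN instead of sleeping. The caller never hangs, but
 * the write is silently skipped in that case.
 */
static int toy_sync_trylock(struct toy_buf *sb, int *writes_done)
{
    if (!toy_buf_trylock(sb))
        return -EAGAIN;
    (*writes_done)++;          /* the real write would happen here */
    toy_buf_unlock(sb);
    return 0;
}
```

In this model, a buffer whose holder never unlocks it makes every toy_sync_trylock() call return -EAGAIN without performing the write. That is the kind of side effect to look for: the hang may disappear simply because the wait is skipped, while the underlying lock is still never released.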
In addition, the code area I suspect is:
File: fs/xfs/xfs_buf_item.c
Function: void xfs_buf_iodone_callbacks(xfs_buf_t *bp)
In this function, XFS_BUF_SET_BRELSE_FUNC(bp, xfs_buf_error_relse) registers xfs_buf_error_relse as a callback that should release the lock being held, but I doubt whether the callback is actually being called. I am still analyzing this code area.
Please let me know if this is the right direction.
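
The suspicion above can be illustrated with a small user-space model of a release callback. This is a sketch under assumptions, not the XFS implementation: demo_buf, demo_iodone(), demo_error_relse() and demo_release() are hypothetical names, and the model only shows that registering a release hook does not by itself drop the lock.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an I/O buffer; field names are illustrative only. */
struct demo_buf {
    bool locked;                         /* lock taken by the submitter */
    int error;                           /* non-zero after a failed write */
    void (*relse)(struct demo_buf *bp);  /* release hook, may be NULL */
};

/* Release hook: unregisters itself and drops the lock. */
static void demo_error_relse(struct demo_buf *bp)
{
    bp->relse = NULL;
    bp->locked = false;
}

/* Completion path: on error, only REGISTER the hook. The lock stays held. */
static void demo_iodone(struct demo_buf *bp, int error)
{
    assert(bp != NULL);
    bp->error = error;
    if (error)
        bp->relse = demo_error_relse;
}

/* Later release step: the lock is dropped only if this actually runs. */
static bool demo_release(struct demo_buf *bp)
{
    if (bp->relse == NULL)
        return false;
    bp->relse(bp);
    return !bp->locked;
}
```

In this model, calling demo_iodone(&bp, EIO) on a locked buffer leaves bp.locked set. Only a subsequent demo_release(&bp) clears it, so any path that never reaches the release step leaves later lockers waiting forever.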
Thanks & Regards,
Amit Sahrawat
On Wed, Dec 22, 2010 at 12:11 PM, Amit Sahrawat <amit.sahrawat83@xxxxxxxxx> wrote:
Extremely sorry for the inconvenience; I will take care to post complete details in future.

Test case:
Copy a complex directory structure (large number of files and directories) to my XFS-formatted partition:
cp -ar /LibExe /usb/sda2
Unplug the USB device while the copy is in progress.

Storage: USB Flash, USB HDD (both)
Kernel: 2.6.34
Target: MIPS

Logs:
usb 2-1: USB disconnect, address 7
Device sda2, XFS metadata write error block 0x0 in sda2
xfs_force_shutdown(sda2,0x1) called from line 1004 of file fs/xfs/linux-2.6/xfs_buf.c. Return address = 0x801cc294
Filesystem "sda2": I/O Error Detected. Shutting down filesystem: sda2
Please umount the filesystem, and rectify the problem(s)
Plug in USB Port1
sd 7:0:0:0: [sdb] Attached SCSI disk
Filesystem "sda2": xfs_log_force: error 5 returned.
Filesystem "sda2": xfs_log_force: error 5 returned.
Filesystem "sda2": xfs_log_force: error 5 returned.
Filesystem "sda2": xfs_log_force: error 5 returned.
INFO: task usb_mount:1858 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
usb_mount D [84a42440] 8032d62c 0 1858 1816 (user thread)
Stack : 00000107 00000000 85e7be80 00030002 84a425c8 8032d62c 7fffffff 84a42440
00000002 8496e200 00000001 00000000 85e7bf00 85e7bef8 7fa2f2e0 8032d62c
00000001 801d69a8 85e7bd40 801d6b34 85e7bd4c 8032dc6c 00000000 801dbc80
85e7be80 864315a8 8662c980 00000001 00000742 00000000 00000000 84b85800
85e7bd90 801d6cc0 7fffffff 84a42440 00000002 8032ee74 00000081 804158a0
...
Call Trace:
[<8032d574>] __schedule+0x618/0x6b8 from[<8032d62c>] schedule+0x18/0x3c
[<8032d62c>] schedule+0x18/0x3c from[<8032dc6c>] schedule_timeout+0x2c/0x1c0
[<8032dc6c>] schedule_timeout+0x2c/0x1c0 from[<8032ee74>] __down+0x8c/0xdc
[<8032ee74>] __down+0x8c/0xdc from[<8004500c>] down+0x40/0x88
[<8004500c>] down+0x40/0x88 from[<801ca838>] xfs_buf_lock+0xcc/0x15c
[<801ca838>] xfs_buf_lock+0xcc/0x15c from[<801b71a0>] xfs_getsb+0x38/0x54
[<801b71a0>] xfs_getsb+0x38/0x54 from[<801d64a8>] xfs_sync_fsdata+0x7c/0x154
[<801d64a8>] xfs_sync_fsdata+0x7c/0x154 from[<801d7284>] xfs_quiesce_data+0x34/0x60
[<801d7284>] xfs_quiesce_data+0x34/0x60 from[<801d3514>] xfs_fs_sync_fs+0x30/0xec
[<801d3514>] xfs_fs_sync_fs+0x30/0xec from[<800ba09c>] __fsync_super+0xa4/0xc8
[<800ba09c>] __fsync_super+0xa4/0xc8 from[<800ba0d4>] fsync_super+0x14/0x28
[<800ba0d4>] fsync_super+0x14/0x28 from[<800ba4a0>] generic_shutdown_super+0x34/0x190
[<800ba4a0>] generic_shutdown_super+0x34/0x190 from[<800ba654>] kill_block_super+0x58/0x80
[<800ba654>] kill_block_super+0x58/0x80 from[<800bac6c>] deactivate_super+0x7c/0x110
[<800bac6c>] deactivate_super+0x7c/0x110 from[<800d2bbc>] sys_umount+0x310/0x358
[<800d2bbc>] sys_umount+0x310/0x358 from[<8000ff44>] stack_done+0x20/0x3c
-------------------------------------------------------------------------------------
Filesystem "sda2": xfs_log_force: error 5 returned.
Please let me know in case more information is needed.

Thanks & Regards,
Amit Sahrawat

On Wed, Dec 22, 2010 at 11:32 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
On Wed, Dec 22, 2010 at 11:05:26AM +0530, Amit Sahrawat wrote:
> Hi,
> I am encountering hang of XFS filesystem, please find the logs as given
> below:
> INFO: task usb_mount:1858 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> usb_mount D [84a42440] 8032d62c 0 1858
> 1816 (user thread)
> Stack : 00000107 00000000 85e7be80 00030002 84a425c8 8032d62c 7fffffff
> 84a42440
> 00000002 8496e200 00000001 00000000 85e7bf00 85e7bef8 7fa2f2e0
> 8032d62c
> 00000001 801d69a8 85e7bd40 801d6b34 85e7bd4c 8032dc6c 00000000
> 801dbc80
> 85e7be80 864315a8 8662c980 00000001 00000742 00000000 00000000
> 84b85800
> 85e7bd90 801d6cc0 7fffffff 84a42440 00000002 8032ee74 00000081
> 804158a0
> ...
> Call Trace:
> [<8032d574>] __schedule+0x618/0x6b8 from[<8032d62c>] schedule+0x18/0x3c
> [<8032d62c>] schedule+0x18/0x3c from[<8032dc6c>] schedule_timeout+0x2c/0x1c0
> [<8032dc6c>] schedule_timeout+0x2c/0x1c0 from[<8032ee74>] __down+0x8c/0xdc
> [<8032ee74>] __down+0x8c/0xdc from[<8004500c>] down+0x40/0x88
> [<8004500c>] down+0x40/0x88 from[<801ca838>] xfs_buf_lock+0xcc/0x15c
> [<801ca838>] xfs_buf_lock+0xcc/0x15c from[<801b71a0>] xfs_getsb+0x38/0x54
> [<801b71a0>] xfs_getsb+0x38/0x54 from[<801d64a8>] xfs_sync_fsdata+0x7c/0x154
> [<801d64a8>] xfs_sync_fsdata+0x7c/0x154 from[<801d7284>]
> xfs_quiesce_data+0x34/0x60
> [<801d7284>] xfs_quiesce_data+0x34/0x60 from[<801d3514>]
> xfs_fs_sync_fs+0x30/0xec
> [<801d3514>] xfs_fs_sync_fs+0x30/0xec from[<800ba09c>]
> __fsync_super+0xa4/0xc8
> [<800ba09c>] __fsync_super+0xa4/0xc8 from[<800ba0d4>] fsync_super+0x14/0x28
> [<800ba0d4>] fsync_super+0x14/0x28 from[<800ba4a0>]
> generic_shutdown_super+0x34/0x190
> [<800ba4a0>] generic_shutdown_super+0x34/0x190 from[<800ba654>]
> kill_block_super+0x58/0x80
> [<800ba654>] kill_block_super+0x58/0x80 from[<800bac6c>]
> deactivate_super+0x7c/0x110
> [<800bac6c>] deactivate_super+0x7c/0x110 from[<800d2bbc>]
> sys_umount+0x310/0x358
> [<800d2bbc>] sys_umount+0x310/0x358 from[<8000ff44>] stack_done+0x20/0x3c

Please make sure you paste stack traces cleanly in your emails so we
can read them easily.

> After reboot it works fine, but during this state XFS does not works no
> operation.

What kernel? What did you do to produce the error? What is the output
of "echo w > /proc/sysrq-trigger"? Do you have a repeatable test
case? What sort of storage are you using? Were there any IO errors
before the hang? etc, etc, etc....
--
For future reference, when you are reporting a problem you need to
be specific about what you were doing to cause the problem you are
reporting. Describe your kernel, your storage, your test case, any
errors that occurred before the problem you are reporting, etc.
We need this information to make any sense of your bug report, but
I'm getting tired of having to ask for it every time you report a
problem. The more information you put in your bug report, the more
likely we are to be able to help you. We don't have unlimited
amounts of time (or patience) to drag all the basic details of your
problem out of you over 3 or 4 emails, so including it up front will
help a lot....
Cheers,
Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs