Hi! I'm trying to come up with a series of ramblings that may or may
not be useful in a mailing-list context, with the idea that one bug
report might be good, the next might be me thinking aloud with data in
hand because I know something's wrong but can't put my finger on it. An
ex-girlfriend who saw the movie "Rain Man" years ago pointed to the
screen and said, "Do you see that guy? That's you!" If only I could be
so smart...or act as well as Dustin Hoffman. The noisy thinking is
there, just not the brilliant insights...
This report is to pass on a kernel lock detector message that might be
reproducible under a certain family of tests. generic/230 may not be at
fault; it's just where the detector went off.
In the few times the detector has gone off lately, it seems to do so at
the same instant that I'm doing some very boring operation on a
different partition, such as reloading a file in vi or piping something
to less to read it. Some folks have been working on tty stuff lately for
the 3.8 kernels at least--making great improvements overall--but there
seem to be no tty hints in this message.
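In case it helps anyone triaging, here is a minimal Python sketch for
pulling the lock classes and acquiring functions out of a lockdep report
like the one further down. Nothing here is official: the regexes and the
summarize() helper are my own assumptions about the text format as it
appears in this report, and the sample is just a trimmed copy of that
report. It only lists which classes are held and whether any of them
look tty-related.

```python
import re

# One lock entry, e.g.
#   (sb_internal){++++.?}, at: [<c11e8846>] xfs_trans_alloc+0x26/0x50
# The pattern is an assumption based on this report, not on kernel docs.
ENTRY = r"\((?P<cls>[^)]+)\)\{[^}]*\}, at: \[<[0-9a-f]+>\] (?P<fn>[\w.]+)\+0x"
ACQUIRE_RE = re.compile(r"trying to acquire lock: " + ENTRY)
HELD_RE = re.compile(r"#\d+: " + ENTRY)


def summarize(report):
    """Return (wanted, held): the lock being acquired and the locks held."""
    flat = " ".join(report.split())  # undo line wrapping added by mail clients
    m = ACQUIRE_RE.search(flat)
    wanted = (m.group("cls"), m.group("fn")) if m else None
    held = [(h.group("cls"), h.group("fn")) for h in HELD_RE.finditer(flat)]
    return wanted, held


# Trimmed copy of the report in this mail, wrapping included.
SAMPLE = """
setquota/28368 is trying to acquire lock:
(sb_internal){++++.?}, at: [<c11e8846>] xfs_trans_alloc+0x26/0x50
3 locks held by setquota/28368:
#0: (&type->s_umount_key#20){++++.+}, at: [<c10c660a>]
get_super+0x7a/0xc0
#1: (sb_internal){++++.?}, at: [<c11e8846>] xfs_trans_alloc+0x26/0x50
#2: (&qinf->qi_quotaofflock){+.+...}, at: [<c11fa44a>]
xfs_qm_scall_setqlim+0x9a/0x690
"""

if __name__ == "__main__":
    wanted, held = summarize(SAMPLE)
    print("wants:", wanted)
    for cls, fn in held:
        same = wanted is not None and cls == wanted[0]
        note = "  <-- same class as the lock being acquired" if same else ""
        print("holds:", cls, "in", fn + note)
    print("tty-related:", any("tty" in cls or "tty" in fn for cls, fn in held))
```

Saving a report to a file and passing its contents to summarize() should
work the same way, as long as the report keeps the "trying to acquire
lock:" and "#N:" lines.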
The kernel is, AFAIK, git Linux at v3.9.0 plus this weekend's xfs-oss
checked out, with the following patches applied:
[PATCH v2] xfs: fix assertion failure in xfs_vm_write_failed()
[PATCH] xfs: fix s_max_bytes to MAX_LFS_FILESIZE if needed
[PATCH] xfs: don't return 0 if generic_segment_checks() find nothing
[PATCH 1/2] xfs: fix sub-page blocksize data integrity writes
[PATCH 2/2] xfs: fix rounding in xfs_free_file_space
[PATCH v3 1/2] xfs: Remove XFS_MOUNT_RETERR
[PATCH v3 2/2] xfs: Don't keep silent if sunit/swidth can not be changed via mount
There shouldn't be a need to apply these patches right away. I'm just
providing context.
The computer is a Pentium 733 with memory lowered to 160 MB for
low-memory testing. It uses the standard VGA console, which can
contribute to such issues, though not as much as a DRM framebuffer
console does.
Thanks!
Michael
[Earlier tests are shown only to provide sequence.]
FSTYP -- xfs (debug)
PLATFORM -- Linux/i686 oldsvrhw 3.9.0+
MKFS_OPTIONS -- -f -llogdev=/dev/sda7 -bsize=4096 /dev/sdb6
MOUNT_OPTIONS -- -ologdev=/dev/sda7 /dev/sdb6 /mnt/xfstests-scratch
xfs/168 [not run] Assuming DMAPI modules are not loaded
generic/053 10s
xfs/043 [not run] No dump tape specified
generic/099 [not run] not suitable for this OS: Linux
xfs/170 47s
xfs/116 3s
generic/020 29s
xfs/175 [not run] Assuming DMAPI modules are not loaded
xfs/066 8s
xfs/037 [not run] No dump tape specified
xfs/292 - output mismatch (see /var/lib/xfstests/results/xfs/292.out.bad)
--- tests/xfs/292.out 2013-05-08 12:40:14.635752692 -0400
+++ /var/lib/xfstests/results/xfs/292.out.bad 2013-05-08 16:35:33.894218930 -0400
@@ -1,5 +1,5 @@
QA output created by 292
mkfs.xfs without geometry
-meta-data=FILENAME isize=256 agcount=4, agsize=16777216 blks
+meta-data=FILENAME isize=256 agcount=4, agsize=16777216 blks
mkfs.xfs with cmdline geometry
-meta-data=FILENAME isize=256 agcount=16, agsize=4194304 blks
+meta-data=FILENAME isize=256 agcount=16, agsize=4194304 blks
...
(Run 'diff -u tests/xfs/292.out /var/lib/xfstests/results/xfs/292.out.bad' to see the entire diff)
xfs/086 195s
xfs/293 16s
generic/308 2s
xfs/095 [not run] not suitable for this OS: Linux
xfs/096 28s
xfs/022 [not run] No dump tape specified
generic/260 [not run] FITRIM not supported on /dev/sdb6
generic/247 101s
generic/235 - output mismatch (see /var/lib/xfstests/results/generic/235.out.bad)
--- tests/generic/235.out 2013-05-08 12:39:55.017626952 -0400
+++ /var/lib/xfstests/results/generic/235.out.bad 2013-05-08 16:42:10.527639188 -0400
@@ -15,7 +15,7 @@
fsgqa -- 0 0 0 1 0 0
-touch: cannot touch `SCRATCH_MNT/failed': Read-only file system
+touch: cannot touch 'SCRATCH_MNT/failed': Read-only file system
*** Report for user quotas on device SCRATCH_DEV
Block grace time: 7days; Inode grace time: 7days
...
(Run 'diff -u tests/generic/235.out /var/lib/xfstests/results/generic/235.out.bad' to see the entire diff)
xfs/072 7s
xfs/180 441s
xfs/283 25s
xfs/048 1s
generic/076 8s
generic/236 3s
generic/230
=============================================
[ INFO: possible recursive locking detected ]
3.9.0+ #3 Not tainted
---------------------------------------------
setquota/28368 is trying to acquire lock:
(sb_internal){++++.?}, at: [<c11e8846>] xfs_trans_alloc+0x26/0x50
but task is already holding lock:
(sb_internal){++++.?}, at: [<c11e8846>] xfs_trans_alloc+0x26/0x50
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(sb_internal);
lock(sb_internal);
*** DEADLOCK ***
May be due to missing lock nesting notation
3 locks held by setquota/28368:
#0: (&type->s_umount_key#20){++++.+}, at: [<c10c660a>] get_super+0x7a/0xc0
#1: (sb_internal){++++.?}, at: [<c11e8846>] xfs_trans_alloc+0x26/0x50
#2: (&qinf->qi_quotaofflock){+.+...}, at: [<c11fa44a>] xfs_qm_scall_setqlim+0x9a/0x690
stack backtrace:
CPU: 0 PID: 28368 Comm: setquota Not tainted 3.9.0+ #3
Hardware name: Dell Computer Corporation L733r /CA810E , BIOS A14 09/05/2001
c6456ca0 c6456ca0 c8f83cc8 c13fe5bd c8f83d40 c1060ee0 c14d241d c6456ad4
00006ed0 000003eb c196a618 c6456cf0 00000004 00000000 0001f60c c177c801
c19b033d 00000000 f089e33c 00000000 c6456930 4596f1d4 000003eb 00000000
Call Trace:
[<c13fe5bd>] dump_stack+0x16/0x18
[<c1060ee0>] __lock_acquire+0x17b0/0x17f0
[<c105dfae>] ? trace_hardirqs_off_caller+0x1e/0xc0
[<c104f795>] ? sched_clock_cpu+0xa5/0x100
[<c1061580>] lock_acquire+0x80/0x100
[<c11e8846>] ? xfs_trans_alloc+0x26/0x50
[<c10c737d>] __sb_start_write+0xad/0x1b0
[<c11e8846>] ? xfs_trans_alloc+0x26/0x50
[<c11e8846>] ? xfs_trans_alloc+0x26/0x50
[<c105df8b>] ? trace_hardirqs_on+0xb/0x10
[<c11e8846>] xfs_trans_alloc+0x26/0x50
[<c11f75ad>] xfs_qm_dqread+0xcd/0x360
[<c11f7b82>] xfs_qm_dqget+0x342/0x520
[<c11fa469>] xfs_qm_scall_setqlim+0xb9/0x690
[<c10b45ea>] ? might_fault+0x4a/0xa0
[<c10b4634>] ? might_fault+0x94/0xa0
[<c11ff8b4>] xfs_fs_set_dqblk+0x54/0xa0
[<c110fbf6>] quota_setxquota+0x76/0xc0
[<c1110233>] SyS_quotactl+0x513/0x5a0
[<c10c8834>] ? SyS_stat64+0x34/0x40
[<c1403df2>] ? sysenter_exit+0xf/0x1d
[<c105deb4>] ? trace_hardirqs_on_caller+0xf4/0x1c0
[<c1403dbf>] sysenter_do_call+0x12/0x36
XFS (sdb6): Mounting Filesystem
XFS (sdb6): Ending clean mount
XFS (sdb6): Mounting Filesystem
XFS (sdb6): Ending clean mount
XFS (sdb6): Quotacheck needed: Please wait.
XFS (sdb6): Quotacheck: Done.
- output mismatch (see /var/lib/xfstests/results/generic/230.out.bad)
--- tests/generic/230.out 2013-05-08 12:39:54.827612822 -0400
+++ /var/lib/xfstests/results/generic/230.out.bad 2013-05-08 16:51:08.063301955 -0400
@@ -12,9 +12,9 @@
pwrite64: Disk quota exceeded
Touch 3+4
Touch 5+6
-touch: cannot touch `SCRATCH_MNT/file6': Disk quota exceeded
+touch: cannot touch 'SCRATCH_MNT/file6': Disk quota exceeded
Touch 5
-touch: cannot touch `SCRATCH_MNT/file5': Disk quota exceeded
...
(Run 'diff -u tests/generic/230.out /var/lib/xfstests/results/generic/230.out.bad' to see the entire diff)
XFS (sdb5): Mounting Filesystem
XFS (sdb5): Ending clean mount
xfs/155