On 03/27/2019 06:24 PM, Alex Williamson wrote:
On Wed, 27 Mar 2019 09:38:47 -0400
Farhan Ali <alifm@xxxxxxxxxxxxx> wrote:
Hi Alex,
I am seeing a warning in the vfio driver with the 5.0 linux kernel on an
s390 system:
[ 4050.264464] ------------[ cut here ]------------
[ 4050.264508] do not call blocking ops when !TASK_RUNNING; state=1 set
at [<00000000b33c00e2>] prepare_to_wait_event+0x14a/0x188
[ 4050.264529] WARNING: CPU: 12 PID: 35924 at kernel/sched/core.c:6112
__might_sleep+0x76/0x90
....
[ 4050.264636] CPU: 12 PID: 35924 Comm: sh Tainted: G W
5.0.0-00016-g4ef69cfc84b1-dirty #12
[ 4050.264646] Hardware name: IBM 2964 NE1 749 (LPAR)
[ 4050.264654] Krnl PSW : 0704d00180000000 000000000017bbae
(__might_sleep+0x76/0x90)
[ 4050.264668] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1
PM:0 RI:0 EA:3
[ 4050.264678] Krnl GPRS: 00000000361d853f 0000000080000000
0000000000000072 00000000001d05e2
[ 4050.264691] 0000000000000007 0000000000bb3c20
000003ff800d7abe 000003e013823b58
[ 4050.264700] 0000000000000002 000003ff800d7abe
0000000000000000 000000000000038c
[ 4050.264708] 0000000000dd96e4 0000000800000000
000000000017bbaa 000003e013823910
[ 4050.264724] Krnl Code: 000000000017bb9e: c0200062dc5d larl
%r2,dd7458
000000000017bba4: c0e5fffe5012 brasl
%r14,145bc8
#000000000017bbaa: a7f40001 brc
15,17bbac
>000000000017bbae: b904004a lgr
%r4,%r10
000000000017bbb2: b904003b lgr
%r3,%r11
000000000017bbb6: b904002c lgr
%r2,%r12
000000000017bbba: ebaff0a00004 lmg
%r10,%r15,160(%r15)
000000000017bbc0: c0f4fffffe8c brcl
15,17b8d8
[ 4050.264756] Call Trace:
[ 4050.264765] ([<000000000017bbaa>] __might_sleep+0x72/0x90)
[ 4050.264774] [<0000000000b97edc>] __mutex_lock+0x44/0x8c0
[ 4050.264782] [<0000000000b9878a>] mutex_lock_nested+0x32/0x40
[ 4050.264793] [<000003ff800d7abe>] vfio_group_get_device+0x36/0xa8 [vfio]
[ 4050.264803] [<000003ff800d87c0>] vfio_del_group_dev+0x238/0x378 [vfio]
[ 4050.264813] [<000003ff8015f67c>] mdev_remove+0x3c/0x68 [mdev]
[ 4050.264825] [<00000000008e01b0>]
device_release_driver_internal+0x168/0x268
[ 4050.264834] [<00000000008de692>] bus_remove_device+0x162/0x190
[ 4050.264843] [<00000000008daf42>] device_del+0x1e2/0x368
[ 4050.264851] [<00000000008db12c>] device_unregister+0x64/0x88
[ 4050.264862] [<000003ff8015ed84>] mdev_device_remove+0xec/0x130 [mdev]
[ 4050.264872] [<000003ff8015f074>] remove_store+0x6c/0xa8 [mdev]
[ 4050.264881] [<000000000046f494>] kernfs_fop_write+0x14c/0x1f8
[ 4050.264890] [<00000000003c1530>] __vfs_write+0x38/0x1a8
[ 4050.264899] [<00000000003c187c>] vfs_write+0xb4/0x198
[ 4050.264908] [<00000000003c1af2>] ksys_write+0x5a/0xb0
[ 4050.264916] [<0000000000b9e270>] system_call+0xdc/0x2d8
[ 4050.264925] 4 locks held by sh/35924:
[ 4050.264933] #0: 000000001ef90325 (sb_writers#4){.+.+}, at:
vfs_write+0x9e/0x198
[ 4050.264948] #1: 000000005c1ab0b3 (&of->mutex){+.+.}, at:
kernfs_fop_write+0x1cc/0x1f8
[ 4050.264963] #2: 0000000034831ab8 (kn->count#297){++++}, at:
kernfs_remove_self+0x12e/0x150
[ 4050.264979] #3: 00000000e152484f (&dev->mutex){....}, at:
device_release_driver_internal+0x5c/0x268
[ 4050.264993] Last Breaking-Event-Address:
[ 4050.265002] [<000000000017bbaa>] __might_sleep+0x72/0x90
[ 4050.265010] irq event stamp: 7039
[ 4050.265020] hardirqs last enabled at (7047): [<00000000001cee7a>]
console_unlock+0x6d2/0x740
[ 4050.265029] hardirqs last disabled at (7054): [<00000000001ce87e>]
console_unlock+0xd6/0x740
[ 4050.265040] softirqs last enabled at (6416): [<0000000000b8fe26>]
__udelay+0xb6/0x100
[ 4050.265049] softirqs last disabled at (6415): [<0000000000b8fe06>]
__udelay+0x96/0x100
[ 4050.265057] ---[ end trace d04a07d39d99a9f9 ]---
The problem seems to be the function vfio_dev_present which is the
condition to wait_event_interruptible_timeout.
wait_event_interruptible() sets the state of the current task to
TASK_INTERRUPTIBLE, then checks the condition. And in vfio_dev_present
we call vfio_group_get_device which then tries to acquire the mutex
group->device_lock.
I am not too familiar with vfio core driver code yet, and I am not sure
if the solution to this problem will be something like described in
https://lwn.net/Articles/628628/. If that is the solution we would want
I could take a crack at writing a patch.
Hmm, strange I've never seen this. In any case, yes, thanks for
reporting it and it does seem like the issue discussed there. If
you're interested in tackling it, that would be much appreciated. Also
note the recent thread by Parav working on fixes to the mdev-core
removal and creation of devices. If this issue is somehow unique to
mdev devices (maybe why I haven't seen it in vfio-pci?) then that
series might bring mdev more in line with vfio-pci semantics. Thanks,
Alex
I have only noticed it with mediated devices, more specifically vfio-ccw
device. And I have briefly looked at Parav's series, but I don't think
his patches tackle this issue.
I will submit a patch for this, for review soon.
Thanks
Farhan