On Tue, 9 Apr 2019 09:13:48 -0400 Farhan Ali <alifm@xxxxxxxxxxxxx> wrote: > On 04/03/2019 02:22 PM, Farhan Ali wrote: > > vfio_dev_present() which is the condition to > > wait_event_interruptible_timeout(), will call vfio_group_get_device > > and try to acquire the mutex group->device_lock. > > > > wait_event_interruptible_timeout() will set the state of the current > > task to TASK_INTERRUPTIBLE, before doing the condition check. This > > means that we will try to accquire the mutex while already in a > > sleeping state. The scheduler warns us by giving the following > > warning: > > > > [ 4050.264464] ------------[ cut here ]------------ > > [ 4050.264508] do not call blocking ops when !TASK_RUNNING; state=1 set at [<00000000b33c00e2>] prepare_to_wait_event+0x14a/0x188 > > [ 4050.264529] WARNING: CPU: 12 PID: 35924 at kernel/sched/core.c:6112 __might_sleep+0x76/0x90 > > .... > > > > 4050.264756] Call Trace: > > [ 4050.264765] ([<000000000017bbaa>] __might_sleep+0x72/0x90) > > [ 4050.264774] [<0000000000b97edc>] __mutex_lock+0x44/0x8c0 > > [ 4050.264782] [<0000000000b9878a>] mutex_lock_nested+0x32/0x40 > > [ 4050.264793] [<000003ff800d7abe>] vfio_group_get_device+0x36/0xa8 [vfio] > > [ 4050.264803] [<000003ff800d87c0>] vfio_del_group_dev+0x238/0x378 [vfio] > > [ 4050.264813] [<000003ff8015f67c>] mdev_remove+0x3c/0x68 [mdev] > > [ 4050.264825] [<00000000008e01b0>] device_release_driver_internal+0x168/0x268 > > [ 4050.264834] [<00000000008de692>] bus_remove_device+0x162/0x190 > > [ 4050.264843] [<00000000008daf42>] device_del+0x1e2/0x368 > > [ 4050.264851] [<00000000008db12c>] device_unregister+0x64/0x88 > > [ 4050.264862] [<000003ff8015ed84>] mdev_device_remove+0xec/0x130 [mdev] > > [ 4050.264872] [<000003ff8015f074>] remove_store+0x6c/0xa8 [mdev] > > [ 4050.264881] [<000000000046f494>] kernfs_fop_write+0x14c/0x1f8 > > [ 4050.264890] [<00000000003c1530>] __vfs_write+0x38/0x1a8 > > [ 4050.264899] [<00000000003c187c>] vfs_write+0xb4/0x198 > > [ 4050.264908] [<00000000003c1af2>] ksys_write+0x5a/0xb0 > > [ 4050.264916] [<0000000000b9e270>] system_call+0xdc/0x2d8 > > [ 4050.264925] 4 locks held by sh/35924: > > [ 4050.264933] #0: 000000001ef90325 (sb_writers#4){.+.+}, at: vfs_write+0x9e/0x198 > > [ 4050.264948] #1: 000000005c1ab0b3 (&of->mutex){+.+.}, at: kernfs_fop_write+0x1cc/0x1f8 > > [ 4050.264963] #2: 0000000034831ab8 (kn->count#297){++++}, at: kernfs_remove_self+0x12e/0x150 > > [ 4050.264979] #3: 00000000e152484f (&dev->mutex){....}, at: device_release_driver_internal+0x5c/0x268 > > [ 4050.264993] Last Breaking-Event-Address: > > [ 4050.265002] [<000000000017bbaa>] __might_sleep+0x72/0x90 > > [ 4050.265010] irq event stamp: 7039 > > [ 4050.265020] hardirqs last enabled at (7047): [<00000000001cee7a>] console_unlock+0x6d2/0x740 > > [ 4050.265029] hardirqs last disabled at (7054): [<00000000001ce87e>] console_unlock+0xd6/0x740 > > [ 4050.265040] softirqs last enabled at (6416): [<0000000000b8fe26>] __udelay+0xb6/0x100 > > [ 4050.265049] softirqs last disabled at (6415): [<0000000000b8fe06>] __udelay+0x96/0x100 > > [ 4050.265057] ---[ end trace d04a07d39d99a9f9 ]--- > > > > Let's fix this as described in the articlehttps://lwn.net/Articles/628628/. > > > > Signed-off-by: Farhan Ali<alifm@xxxxxxxxxxxxx> > > --- > > > > I have already tested in my environment and the warning goes > > away for me with this patch. But appreciate further testing > > and review feedback on the patch. > > > > Thanks > > Farhan > > > > > > ChangeLog > > --------- > > v1 -> v2 > > - Keep the same behavior as before, so the task goes into > > TASK_UNINTERRUPTIBLE state after being interrupted once > > > > --- > > > > Hi Alex, > > Polite ping :) > > Do you have any other feedback regarding the patch? Hi Farhan, It looks fine to me, it's pending integration and further testing on my end. I'll target this for v5.2. Thanks, Alex