Re: [5.0+ CIFS regression] fsstress hang on CIFS

On Thu, Mar 14, 2019 at 09:14, ronnie sahlberg <ronniesahlberg@xxxxxxxxx> wrote:
>
> On Fri, Mar 15, 2019 at 12:25 AM Murphy Zhou <jencce.kernel@xxxxxxxxx> wrote:
> >
> > Hi,
> >
> > xfstests generic/013 on CIFS hangs on Linus tree now.
> >
> > Bisect points to
> >
> > commit 7091bcaba9f34c83e1e6f418b6de5c6d987571da (HEAD, refs/bisect/bad)
> > Author: Pavel Shilovsky <pshilov@xxxxxxxxxxxxx>
> > Date:   Wed Jan 30 16:58:09 2019 -0800
> >
> >     CIFS: Try to acquire credits at once for compound requests
> >
> > as the first bad commit.
> >
> > It should be easy to reproduce.
>
> I built a kernel at 7091bcaba9f34c83e1e6
> and it passes generic/013 against a windows2016 server for me.
>
> It could be a configuration difference.  What are the mount options
> you use and what server do you use?
>
> The code in the commit you find has been majorly re-worked in the
> for-next branch: https://github.com/smfrench/smb3-kernel
> Can you try cherry-picking these patches and see if the problem remains ?
>
> https://github.com/smfrench/smb3-kernel/commit/4b62ba8c7e9380f7274e3b4c90d41ab5e329ba08
> https://github.com/smfrench/smb3-kernel/commit/e2d79dccc7929257f7a2c824b397092de596b5c1
> https://github.com/smfrench/smb3-kernel/commit/15736c7f36cd55ef8c4e5df65c1a97ecd4b6e44f
> https://github.com/smfrench/smb3-kernel/commit/1bb731eaf331cec630a0371db2d0b79d5344d1e3
> https://github.com/smfrench/smb3-kernel/commit/a32fd3e57c3b15b456bbf39235d6555a88e20600
> https://github.com/smfrench/smb3-kernel/commit/9ed2d4dce312973e2f998d520f8c976c899e0dd9
>
>
>
> regards
> ronnie sahlberg
>
>
>
> >
> > Thanks,
> > M
> >
> > dmesg:
> > run fstests generic/013 at 2019-03-14 09:51:56
> > INFO: task kworker/3:1:104 blocked for more than 120 seconds.
> >       Not tainted 5.0.0-bisect-7091bcaba9f3+ #15
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > kworker/3:1     D    0   104      2 0x80000000
> > Workqueue: cifsiod cifs_uncached_writev_complete [cifs]
> > Call Trace:
> >  ? __schedule+0x24e/0x860
> >  schedule+0x28/0x70
> >  schedule_preempt_disabled+0xa/0x10
> >  __mutex_lock.isra.8+0x2d0/0x4b0
> >  cifs_uncached_writev_complete+0xcd/0x570 [cifs]
> >  ? __switch_to_asm+0x40/0x70
> >  ? __switch_to_asm+0x40/0x70
> >  ? __switch_to_asm+0x34/0x70
> >  ? __switch_to_asm+0x40/0x70
> >  process_one_work+0x1a1/0x3a0
> >  worker_thread+0x30/0x380
> >  ? mod_delayed_work_on+0x90/0x90
> >  kthread+0x112/0x130
> >  ? __kthread_parkme+0x70/0x70
> >  ret_from_fork+0x35/0x40
> > INFO: task kworker/5:1:106 blocked for more than 120 seconds.
> >       Not tainted 5.0.0-bisect-7091bcaba9f3+ #15
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > kworker/5:1     D    0   106      2 0x80000000
> > Workqueue: cifsiod cifs_uncached_writev_complete [cifs]
> > Call Trace:
> >  ? __schedule+0x24e/0x860
> >  ? ttwu_do_wakeup+0x19/0x140
> >  schedule+0x28/0x70
> >  schedule_preempt_disabled+0xa/0x10
> >  __mutex_lock.isra.8+0x2d0/0x4b0
> >  cifs_uncached_writev_complete+0xcd/0x570 [cifs]
> >  ? __switch_to_asm+0x40/0x70
> >  ? __switch_to_asm+0x34/0x70
> >  ? __switch_to_asm+0x40/0x70
> >  process_one_work+0x1a1/0x3a0
> >  worker_thread+0x30/0x380
> >  ? mod_delayed_work_on+0x90/0x90
> >  kthread+0x112/0x130
> >  ? __kthread_parkme+0x70/0x70
> >  ret_from_fork+0x35/0x40
> > INFO: task kworker/7:1:108 blocked for more than 120 seconds.
> >       Not tainted 5.0.0-bisect-7091bcaba9f3+ #15
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > kworker/7:1     D    0   108      2 0x80000000
> > Workqueue: cifsiod cifs_uncached_writev_complete [cifs]
> > Call Trace:
> >  ? __schedule+0x24e/0x860
> >  ? ttwu_do_wakeup+0x19/0x140
> >  schedule+0x28/0x70
> >  schedule_preempt_disabled+0xa/0x10
> >  __mutex_lock.isra.8+0x2d0/0x4b0
> >  cifs_uncached_writev_complete+0xcd/0x570 [cifs]
> >  ? __switch_to_asm+0x40/0x70
> >  ? __switch_to_asm+0x40/0x70
> >  ? __switch_to_asm+0x34/0x70
> >  ? __switch_to_asm+0x40/0x70
> >  process_one_work+0x1a1/0x3a0
> >  worker_thread+0x30/0x380
> >  ? mod_delayed_work_on+0x90/0x90
> >  kthread+0x112/0x130
> >  ? __kthread_parkme+0x70/0x70
> >  ret_from_fork+0x35/0x40
> > INFO: task kworker/7:3:391 blocked for more than 120 seconds.
> >       Not tainted 5.0.0-bisect-7091bcaba9f3+ #15
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > kworker/7:3     D    0   391      2 0x80000000
> > Workqueue: cifsiod cifs_uncached_writev_complete [cifs]
> > Call Trace:
> >  ? __schedule+0x24e/0x860
> >  ? ttwu_do_wakeup+0x19/0x140
> >  schedule+0x28/0x70
> >  schedule_preempt_disabled+0xa/0x10
> >  __mutex_lock.isra.8+0x2d0/0x4b0
> >  cifs_uncached_writev_complete+0xcd/0x570 [cifs]
> >  ? __switch_to_asm+0x40/0x70
> >  ? __switch_to_asm+0x40/0x70
> >  ? __switch_to_asm+0x34/0x70
> >  ? __switch_to_asm+0x40/0x70
> >  process_one_work+0x1a1/0x3a0
> >  worker_thread+0x30/0x380
> >  ? mod_delayed_work_on+0x90/0x90
> >  kthread+0x112/0x130
> >  ? __kthread_parkme+0x70/0x70
> >  ret_from_fork+0x35/0x40
> > INFO: task kworker/1:3:807 blocked for more than 120 seconds.
> >       Not tainted 5.0.0-bisect-7091bcaba9f3+ #15
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > kworker/1:3     D    0   807      2 0x80000000
> > Workqueue: cifsiod cifs_uncached_writev_complete [cifs]
> > Call Trace:
> >  ? __schedule+0x24e/0x860
> >  schedule+0x28/0x70
> >  schedule_preempt_disabled+0xa/0x10
> >  __mutex_lock.isra.8+0x2d0/0x4b0
> >  cifs_uncached_writev_complete+0xcd/0x570 [cifs]
> >  ? __switch_to_asm+0x40/0x70
> >  ? __switch_to_asm+0x40/0x70
> >  ? __switch_to_asm+0x34/0x70
> >  ? __switch_to_asm+0x40/0x70
> >  process_one_work+0x1a1/0x3a0
> >  worker_thread+0x30/0x380
> >  ? mod_delayed_work_on+0x90/0x90
> >  kthread+0x112/0x130
> >  ? __kthread_parkme+0x70/0x70
> >  ret_from_fork+0x35/0x40
> > INFO: task kworker/13:2:1077 blocked for more than 120 seconds.
> >       Not tainted 5.0.0-bisect-7091bcaba9f3+ #15
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > kworker/13:2    D    0  1077      2 0x80000080
> > Workqueue: cifsiod cifs_uncached_readv_complete [cifs]
> > Call Trace:
> >  ? __schedule+0x24e/0x860
> >  schedule+0x28/0x70
> >  schedule_preempt_disabled+0xa/0x10
> >  __mutex_lock.isra.8+0x2d0/0x4b0
> >  ? __switch_to_asm+0x40/0x70
> >  cifs_uncached_readv_complete+0x78/0x4d0 [cifs]
> >  ? __switch_to_asm+0x34/0x70
> >  process_one_work+0x1a1/0x3a0
> >  worker_thread+0x30/0x380
> >  ? mod_delayed_work_on+0x90/0x90
> >  kthread+0x112/0x130
> >  ? __kthread_parkme+0x70/0x70
> >  ret_from_fork+0x35/0x40
> > INFO: task kworker/3:0:1910 blocked for more than 120 seconds.
> >       Not tainted 5.0.0-bisect-7091bcaba9f3+ #15
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > kworker/3:0     D    0  1910      2 0x80000080
> > Workqueue: cifsiod cifs_uncached_writev_complete [cifs]
> > Call Trace:
> >  ? __schedule+0x24e/0x860
> >  schedule+0x28/0x70
> >  schedule_preempt_disabled+0xa/0x10
> >  __mutex_lock.isra.8+0x2d0/0x4b0
> >  cifs_uncached_writev_complete+0xcd/0x570 [cifs]
> >  ? __switch_to_asm+0x40/0x70
> >  ? __switch_to_asm+0x40/0x70
> >  ? __switch_to_asm+0x34/0x70
> >  ? __switch_to_asm+0x40/0x70
> >  process_one_work+0x1a1/0x3a0
> >  worker_thread+0x30/0x380
> >  ? mod_delayed_work_on+0x90/0x90
> >  kthread+0x112/0x130
> >  ? __kthread_parkme+0x70/0x70
> >  ret_from_fork+0x35/0x40
> > INFO: task kworker/5:5:3930 blocked for more than 120 seconds.
> >       Not tainted 5.0.0-bisect-7091bcaba9f3+ #15
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > kworker/5:5     D    0  3930      2 0x80000080
> > Workqueue: cifsiod cifs_uncached_writev_complete [cifs]
> > Call Trace:
> >  ? __schedule+0x24e/0x860
> >  schedule+0x28/0x70
> >  schedule_preempt_disabled+0xa/0x10
> >  __mutex_lock.isra.8+0x2d0/0x4b0
> >  cifs_uncached_writev_complete+0xcd/0x570 [cifs]
> >  ? __switch_to_asm+0x40/0x70
> >  ? __switch_to_asm+0x40/0x70
> >  ? __switch_to_asm+0x34/0x70
> >  ? __switch_to_asm+0x40/0x70
> >  process_one_work+0x1a1/0x3a0
> >  worker_thread+0x30/0x380
> >  ? mod_delayed_work_on+0x90/0x90
> >  kthread+0x112/0x130
> >  ? __kthread_parkme+0x70/0x70
> >  ret_from_fork+0x35/0x40
> > INFO: task kworker/1:2:3932 blocked for more than 120 seconds.
> >       Not tainted 5.0.0-bisect-7091bcaba9f3+ #15
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > kworker/1:2     D    0  3932      2 0x80000080
> > Workqueue: cifsiod cifs_uncached_readv_complete [cifs]
> > Call Trace:
> >  ? __schedule+0x24e/0x860
> >  schedule+0x28/0x70
> >  schedule_preempt_disabled+0xa/0x10
> >  __mutex_lock.isra.8+0x2d0/0x4b0
> >  ? __switch_to_asm+0x40/0x70
> >  cifs_uncached_readv_complete+0x78/0x4d0 [cifs]
> >  ? __switch_to_asm+0x34/0x70
> >  process_one_work+0x1a1/0x3a0
> >  worker_thread+0x30/0x380
> >  ? mod_delayed_work_on+0x90/0x90
> >  kthread+0x112/0x130
> >  ? __kthread_parkme+0x70/0x70
> >  ret_from_fork+0x35/0x40
> > INFO: task kworker/1:5:3934 blocked for more than 120 seconds.
> >       Not tainted 5.0.0-bisect-7091bcaba9f3+ #15
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > kworker/1:5     D    0  3934      2 0x80000080
> > Workqueue: cifsiod cifs_uncached_readv_complete [cifs]
> > Call Trace:
> >  ? __schedule+0x24e/0x860
> >  schedule+0x28/0x70
> >  schedule_preempt_disabled+0xa/0x10
> >  __mutex_lock.isra.8+0x2d0/0x4b0
> >  ? __switch_to_asm+0x40/0x70
> >  cifs_uncached_readv_complete+0x78/0x4d0 [cifs]
> >  ? __switch_to_asm+0x34/0x70
> >  process_one_work+0x1a1/0x3a0
> >  worker_thread+0x30/0x380
> >  ? mod_delayed_work_on+0x90/0x90
> >  kthread+0x112/0x130
> >  ? __kthread_parkme+0x70/0x70
> >  ret_from_fork+0x35/0x40

(cc'ing Long)

I think this is one more occurrence of the known bug in the direct IO
resend code path: the process is most likely stuck in
cifs_resend_wdata, which prevents other processes from acquiring the
mutex. We have two patches fixing it that are currently being re-worked:

https://patchwork.kernel.org/patch/10836349/
https://patchwork.kernel.org/patch/10836355/

But you can try them out to see if they fix your issue.
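
To illustrate the suspected pattern, here is a minimal userspace sketch
in Python. It is an analogy only, not the cifs code: one thread stands in
for a worker that keeps retrying a resend while holding a lock, and the
other threads stand in for the completion workers that block on that same
lock in the traces above. The names simulate, resend_worker,
completion_worker and stop_resend are hypothetical, and the timeout
exists only so that the demonstration terminates.

```python
import threading


def simulate(num_waiters=3, timeout=0.2):
    """Return how many completion workers failed to get the lock in time."""
    lock = threading.Lock()              # stands in for the mutex in the traces
    holder_has_lock = threading.Event()
    stop_resend = threading.Event()      # hypothetical: "resend finally succeeded"
    blocked = []

    def resend_worker():
        # Analogue of a worker stuck retrying a resend while holding the lock.
        with lock:
            holder_has_lock.set()
            while not stop_resend.is_set():
                stop_resend.wait(0.01)   # keep retrying; the lock stays held

    def completion_worker(idx):
        # Analogue of the other completion workers: they need the same lock.
        if lock.acquire(timeout=timeout):
            lock.release()
        else:
            blocked.append(idx)          # gave up waiting, i.e. "hung"

    holder = threading.Thread(target=resend_worker)
    holder.start()
    holder_has_lock.wait()               # make sure the holder owns the lock first

    waiters = [
        threading.Thread(target=completion_worker, args=(i,))
        for i in range(num_waiters)
    ]
    for t in waiters:
        t.start()
    for t in waiters:
        t.join()

    # End the simulation; in the reported hang the waiters simply stay blocked.
    stop_resend.set()
    holder.join()
    return len(blocked)


if __name__ == "__main__":
    print("workers that could not take the lock:", simulate())
```

In this sketch every waiter times out as long as the lock holder keeps
retrying, which is the behaviour the hung task messages would be
consistent with if the resend loop never completes.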

--
Best regards,
Pavel Shilovsky



