Re: [5.0-rc2 regression] aio-dio append read-write race starts to block

This and three other xfstests (which unfortunately were not in the
regression test bucket, but now will be) failed as a side effect of
Pavel's credit changes. Pavel posted fixes for these in
https://github.com/piastry/linux/commits/credit-reconnect8. I
verified that the first six are the important ones, and those are the
ones we are focusing on for review and testing to get in ASAP, so
any additional review or testing would be appreciated. I am running
these six on top of for-next (which has three minor fixes), and that
combination passes the attached test set in my runs, but additional
data would be useful.

ffcb47505156872a4e7aa7592a6260998e7da912 (HEAD -> pavel-reconnect8-mini-6-patch) CIFS: Fix mounts if the client is low on credits
dd98b1fba91454eed2a7562f99534265c886c2dc CIFS: Do not assume one credit for async responses
40f4e13994691cf3c6659cc70a5e0cf5b668ad1c CIFS: Fix credit calculations in compound mid callback
016ca04f33c325b9f488eca558b9def487a9e19e CIFS: Fix credit calculation for encrypted reads with errors
ffdca6db5432ce21ba4428dd822b3d7116b7fefc CIFS: Fix credits calculations for reads with errors
4c219bb001c4564496bccf8a195b60cac703b7ef CIFS: Do not reconnect TCP session in add_credits()


for-next branch currently includes these three:
(for-next) smb3: Cleanup license mess
1b81bef5e216c07fa012efc0f9c44bc7c357f4b2 CIFS: Fix possible hang during async MTU reads and writes
4881ad509be629c3e0ed6a58dbef3e16696586df cifs: fix memory leak of an allocated cifs_ntsd structure



On Wed, Jan 23, 2019 at 2:55 AM Murphy Zhou <jencce.kernel@xxxxxxxxx> wrote:
>
> Hi,
>
> xfstests generic/465 on CIFS has been blocking since:
>
> commit 8544f4aa9dd19a04d1244dae10feecc813ccf175
> Author: Pavel Shilovsky <pshilov@xxxxxxxxxxxxx>
> Date:   Sat Dec 22 12:40:05 2018 -0800
>
>    CIFS: Fix credit computation for compounded requests
>
> It's easy to reproduce with v3.11 and v3.0 on the latest Linus tree.
>
> Ctrl-c cannot interrupt the tests. Ctrl-z + kill -9 %1 cleans up some of
> the test processes, but unmounting the cifs mountpoint never returns.
>
> Attaching script to reproduce on rpm/yum based distros:
> # single -f cifs -v 3.11 -t generic/465
>
> Thanks,
> Murphy
>
> # options:
>
> FSTYP         -- cifs
> PLATFORM      -- Linux/x86_64 8u 5.0.0-rc3-v5.0-rc3-27-g48b1619
> MKFS_OPTIONS  -- //localhost/scratch
> MOUNT_OPTIONS -- -o vers=3.11,username=root,password=redhat,sfu,mfsymlinks -o context=system_u:object_r:root_t:s0 //localhost/scratch /cifssch
>
> # ps output:
>
>  5705  5725  5725  5705 ttyS0     5725 S+       0   0:00  \_ /bin/bash /root/bin/single -f cifs -v 3.11 -t generic/465
>  5725  5840  5725  5705 ttyS0     5725 S+       0   0:00      \_ /bin/bash ./check -T generic/465
>  5840  6077  5725  5705 ttyS0     5725 S+       0   0:00          \_ /bin/bash ./tests/generic/465
>  6077  6378  5725  5705 ttyS0     5725 D+       0   0:00              \_ src/aio-dio-regress/aio-dio-append-write-read-race -a 4096 /cifsmnt/465.6077.4096
>  6077  6379  5725  5705 ttyS0     5725 S+       0   0:00              \_ tee -a /root/xfstests-dev/results//generic/465.full
>
> # sysrq output:
>
> [93691.957955] sysrq: SysRq : Show Blocked State
> [93691.960158]   task                        PC stack   pid father
> [93691.963380] kworker/u64:0   D    0  5383      2 0x80000080
> [93691.967752] Workqueue: writeback wb_workfn (flush-cifs-6)
> [93691.971435] Call Trace:
> [93691.973692]  ? __schedule+0x24e/0x860
> [93691.976240]  schedule+0x28/0x70
> [93691.978696]  smb2_wait_mtu_credits+0x97/0x180 [cifs]
> [93691.982179]  ? finish_wait+0x80/0x80
> [93691.984764]  cifs_writepages+0x108/0xac0 [cifs]
> [93691.987732]  ? select_task_rq_fair+0x335/0xec0
> [93691.990519]  do_writepages+0x41/0xd0
> [93691.992826]  ? __percpu_counter_sum+0x56/0x60
> [93691.995505]  __writeback_single_inode+0x3d/0x350
> [93691.998423]  writeback_sb_inodes+0x1e5/0x480
> [93692.001032]  __writeback_inodes_wb+0x5d/0xb0
> [93692.003624]  wb_writeback+0x25f/0x2f0
> [93692.005906]  ? bpf_lru_populate+0x30/0x1b0
> [93692.008491]  ? cpumask_next+0x17/0x20
> [93692.010535]  wb_workfn+0x342/0x400
> [93692.012471]  ? __switch_to_asm+0x40/0x70
> [93692.014666]  process_one_work+0x1a1/0x3a0
> [93692.016938]  worker_thread+0x30/0x380
> [93692.018984]  ? mod_delayed_work_on+0x90/0x90
> [93692.021377]  kthread+0x112/0x130
> [93692.023258]  ? __kthread_parkme+0x70/0x70
> [93692.025502]  ret_from_fork+0x35/0x40
> [93692.027580] aio-dio-append- D    0  6378   6077 0x00000080
> [93692.030308] Call Trace:
> [93692.031602]  ? __schedule+0x24e/0x860
> [93692.033480]  ? kmem_cache_alloc+0x14d/0x1b0
> [93692.035595]  schedule+0x28/0x70
> [93692.037234]  wait_for_free_request+0xc3/0x190 [cifs]
> [93692.039817]  ? finish_wait+0x80/0x80
> [93692.041677]  compound_send_recv+0x109/0x690 [cifs]
> [93692.044136]  ? smb2_plain_req_init+0x11d/0x260 [cifs]
> [93692.046713]  smb2_compound_op+0x723/0x870 [cifs]
> [93692.049066]  ? mem_cgroup_try_charge+0x86/0x190
> [93692.051397]  ? smb2_query_path_info+0x8c/0x110 [cifs]
> [93692.053935]  smb2_query_path_info+0x8c/0x110 [cifs]
> [93692.056415]  cifs_get_inode_info+0x248/0xa80 [cifs]
> [93692.058869]  ? _cond_resched+0x15/0x30
> [93692.060771]  ? __kmalloc+0x164/0x200
> [93692.062614]  ? build_path_from_dentry_optional_prefix+0xc9/0x400 [cifs]
> [93692.065949]  ? build_path_from_dentry_optional_prefix+0xeb/0x400 [cifs]
> [93692.069140]  cifs_revalidate_dentry_attr+0xd6/0x390 [cifs]
> [93692.071685]  cifs_revalidate_dentry+0xf/0x20 [cifs]
> [93692.073966]  cifs_d_revalidate+0x20/0xa0 [cifs]
> [93692.076074]  path_openat+0x80e/0x16b0
> [93692.077897]  ? filemap_map_pages+0x1b3/0x390
> [93692.079865]  do_filp_open+0x93/0x100
> [93692.081569]  ? __check_object_size+0x15d/0x189
> [93692.083630]  do_sys_open+0x186/0x220
> [93692.085368]  do_syscall_64+0x55/0x1a0
> [93692.087102]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [93692.089238] RIP: 0033:0x7fd4362062af
> [93692.090764] Code: Bad RIP value.
> [93692.092174] RSP: 002b:00007ffe4e91fc40 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
> [93692.095309] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fd4362062af
> [93692.098409] RDX: 0000000000004241 RSI: 00007ffe4e922049 RDI: 00000000ffffff9c
> [93692.101394] RBP: 00007ffe4e91fe98 R08: 0000000000000000 R09: 0000000000000000
> [93692.104396] R10: 00000000000001a4 R11: 0000000000000246 R12: 00007ffe4e922049
> [93692.110203] R13: 0000000000001000 R14: 00007ffe4e922022 R15: 0000000000000000
>
>
> bisect log:
>
> git bisect start
> # good: [e1706720408e72fb883f6b151c2b3b23d8e7e5b2] phy: fix build breakage: add PHY_MODE_SATA
> git bisect good e1706720408e72fb883f6b151c2b3b23d8e7e5b2
> # bad: [6b529fb0a3eabf9c4cc3e94c11477250379ce6d8] Merge tag 'for-5.0-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
> git bisect bad 6b529fb0a3eabf9c4cc3e94c11477250379ce6d8
> # bad: [1dd8a3f6c619723ab442d6a27247d2f2153f3b11] Merge tag 'usb-5.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
> git bisect bad 1dd8a3f6c619723ab442d6a27247d2f2153f3b11
> # bad: [8a26f0f781f56d3016b34a2217e346973d067e7b] CIFS: Fix credits calculation for cancelled requests
> git bisect bad 8a26f0f781f56d3016b34a2217e346973d067e7b
> # bad: [8544f4aa9dd19a04d1244dae10feecc813ccf175] CIFS: Fix credit computation for compounded requests
> git bisect bad 8544f4aa9dd19a04d1244dae10feecc813ccf175
> # good: [c715f89c4dab76317c773df2611af2dac4dea2b7] cifs: Fix a tiny potential memory leak
> git bisect good c715f89c4dab76317c773df2611af2dac4dea2b7
> # good: [33fa5c8b8a7dbe6353a56eaa654b790348890d42] CIFS: Do not set credits to 1 if the server didn't grant anything
> git bisect good 33fa5c8b8a7dbe6353a56eaa654b790348890d42
> # first bad commit: [8544f4aa9dd19a04d1244dae10feecc813ccf175] CIFS: Fix credit computation for compounded requests
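
For anyone collecting additional data from their own runs, the blocked
tasks in a "Show Blocked State" dump like the one quoted above can be
summarized with a small script. This is only a minimal sketch, assuming
Python 3 and the exact log layout shown in the quote; the helper name
blocked_cifs_frames and both regular expressions are hypothetical
illustrations, not part of xfstests or the kernel, and task names
containing spaces are not handled.

```python
import re

# Task header line, e.g.
# "[93691.963380] kworker/u64:0   D    0  5383      2 0x80000080"
TASK_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(\S+)\s+D\s+\d+\s+(\d+)\s+\d+\s+0x[0-9a-f]+$"
)
# Stack frame line, e.g. " ? finish_wait+0x80/0x80" or
# " smb2_wait_mtu_credits+0x97/0x180 [cifs]". A leading "? " marks a
# frame the unwinder considers unreliable.
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(\? )?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(\w+)\])?$"
)


def blocked_cifs_frames(log_text):
    """Map (comm, pid) of each D-state task to its first reliable [cifs] frame.

    The value is None when a task has no reliable frame in the cifs module.
    """
    result = {}
    current = None
    for raw in log_text.splitlines():
        # Drop mail quoting ("> ") and trailing whitespace.
        line = raw.lstrip("> ").rstrip()
        task = TASK_RE.match(line)
        if task:
            current = (task.group(1), int(task.group(2)))
            result[current] = None
            continue
        frame = FRAME_RE.match(line)
        if frame and current is not None:
            unreliable, func, module = frame.groups()
            if not unreliable and module == "cifs" and result[current] is None:
                result[current] = func
    return result
```

Fed the sysrq output quoted above, this reports smb2_wait_mtu_credits
for kworker/u64:0 (pid 5383) and wait_for_free_request for
aio-dio-append- (pid 6378), which is consistent with both tasks being
stuck waiting for credits.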



--
Thanks,

Steve

Attachment: all-pass-bigger
Description: Binary data

