Re: [bug report] task hang while testing xfstests generic/323

On Thu, Feb 28, 2019 at 5:11 AM Jiufei Xue <jiufei.xue@xxxxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> when I tested xfstests generic/323 with NFSv4.1 and v4.2, the task
> occasionally became a zombie while one of its threads hung with the
> following stack:
>
> [<0>] rpc_wait_bit_killable+0x1e/0xa0 [sunrpc]
> [<0>] nfs4_do_close+0x21b/0x2c0 [nfsv4]
> [<0>] __put_nfs_open_context+0xa2/0x110 [nfs]
> [<0>] nfs_file_release+0x35/0x50 [nfs]
> [<0>] __fput+0xa2/0x1c0
> [<0>] task_work_run+0x82/0xa0
> [<0>] do_exit+0x2ac/0xc20
> [<0>] do_group_exit+0x39/0xa0
> [<0>] get_signal+0x1ce/0x5d0
> [<0>] do_signal+0x36/0x620
> [<0>] exit_to_usermode_loop+0x5e/0xc2
> [<0>] do_syscall_64+0x16c/0x190
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [<0>] 0xffffffffffffffff
>
> Since commit 12f275cdd163 ("NFSv4: Retry CLOSE and DELEGRETURN on
> NFS4ERR_OLD_STATEID"), the client retries the CLOSE when the stateid
> generation number on the client is lower than the one on the server.
>
> The original intention of this commit was to retry the operation when
> racing with an OPEN. However, in this case the stateid generation stays
> mismatched forever.
>
> Any suggestions?

Can you include a network trace of the failure? Is it possible that
the server has crashed on reply to the close and that's why the task
is hung? What server are you testing against?

I have seen traces where the CLOSE gets NFS4ERR_OLD_STATEID and the
client keeps retrying with the same open stateid until it receives the
reply to the OPEN that changed the state. Once that reply arrives, the
client retries the CLOSE with the updated stateid.
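
To make the retry behaviour described above concrete, here is a toy
model in C. It is only a sketch, not kernel code: every name in it
(server_close, close_with_retry, open_reply_after, max_attempts) is
made up for illustration, seqid wraparound is ignored, and the
max_attempts bound exists only so the model terminates. The hang in
the report would correspond to the case where the OPEN reply never
updates the client's stateid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; none of these names exist in the kernel. */
enum close_result { CLOSE_OK, CLOSE_OLD_STATEID };

/* Server side: CLOSE is rejected if the client's seqid is behind.
 * Wraparound of the 32-bit seqid is deliberately ignored here. */
static enum close_result server_close(uint32_t client_seqid,
                                      uint32_t server_seqid)
{
    return client_seqid < server_seqid ? CLOSE_OLD_STATEID : CLOSE_OK;
}

/*
 * Client side: resend CLOSE while the server answers "old stateid".
 *
 * open_reply_after: the attempt number after which the racing OPEN
 *                   reply is processed and the client's seqid catches
 *                   up; a negative value means the reply never arrives.
 * max_attempts:     artificial bound so the model terminates.
 *
 * Returns the number of CLOSE attempts sent, or -1 if the seqid never
 * caught up within max_attempts (the "hung task" case in this model).
 */
static int close_with_retry(uint32_t client_seqid, uint32_t server_seqid,
                            int open_reply_after, int max_attempts)
{
    assert(max_attempts > 0);

    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (server_close(client_seqid, server_seqid) == CLOSE_OK)
            return attempt;

        /* The OPEN reply, if it ever arrives, bumps the client seqid. */
        if (open_reply_after >= 0 && attempt >= open_reply_after)
            client_seqid = server_seqid;
    }
    return -1;
}
```

With these assumptions, close_with_retry(1, 2, 2, 10) succeeds on the
third attempt, while close_with_retry(1, 2, -1, 10) never succeeds.
A network trace would show which of the two cases the failure matches.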


