Re: [aarch64] refcount_t: use-after-free in NFS with 64k pages

Hi Cristian,

Thanks for digging into this.

Cristian Marussi <cristian.marussi@xxxxxxx> writes:

> Hi all,
>
> I've recently been chasing a bug that frequently appears during our
> internal LTP test runs on aarch64 HW (Juno) systems with an
> NFS-mounted root.
>
> The failure is NOT tied to any specific LTP test case, and it has been
> observed only when the kernel is configured to use 64KB pages.
> (On the latest LTP Sept18 TAG test suite a kernel crash happened in 4
> out of 5 test runs, each time on a different random test case.)
>
> I'm testing on Linus' branch at 4.19-rc6 (but I can see it up to
> 4.19-rc8 and also on next), and it has been reported since at least
> 4.17 (not sure about this...anyway it was NOT happening)

The stacktrace suggests it's the same issue that I'd reported earlier -

    https://lkml.org/lkml/2018/6/29/209

though without the analysis below.

[...]

> diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
> index bb5476a6d264..171813f9a291 100644
> --- a/fs/nfs/pagelist.c
> +++ b/fs/nfs/pagelist.c
> @@ -432,6 +432,15 @@ void nfs_free_request(struct nfs_page *req)
>
>  void nfs_release_request(struct nfs_page *req)
>  {
> +       /* WORKAROUND */
> +       if ((kref_read(&req->wb_kref) == 1) &&
> +           (req->wb_list.prev != &req->wb_list ||
> +            req->wb_list.next != &req->wb_list)) {

Are the last two conditions just checking that wb_list is not empty?
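
To make the question concrete, here is a minimal userspace sketch, assuming the usual circular list_head semantics. The struct and helpers are re-declared locally to mirror the kernel's, and wb_list_check() is a hypothetical name for the two pointer tests in the patch. For a consistently linked node, the check appears to be the same as !list_empty():

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in mirroring the kernel's struct list_head (assumption). */
struct list_head {
	struct list_head *next, *prev;
};

void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Same test the kernel's list_empty() performs: next points back at us. */
bool list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Insert 'node' right after 'head', as list_add() does. */
void list_add_node(struct list_head *node, struct list_head *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

/* Hypothetical helper: the two pointer tests from the workaround. */
bool wb_list_check(const struct list_head *h)
{
	return h->prev != h || h->next != h;
}

/* Self-check: both predicates agree before and after linking. */
void check_equivalence(void)
{
	struct list_head head, node;

	init_list_head(&head);
	init_list_head(&node);
	assert(wb_list_check(&node) == !list_is_empty(&node));

	list_add_node(&node, &head);
	assert(wb_list_check(&node) == !list_is_empty(&node));
}
```

The two forms would only differ if prev and next were inconsistent with each other, which is not expected for a node manipulated solely through the list helpers.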

Thanks for looking at this.

Punit

> +               pr_warn("%s -- Forcing REFCOUNT++ on dirty req[%u]:%px ->prev:%px  ->next:%px\n",
> +                       __func__, kref_read(&req->wb_kref), req,
> +                       req->wb_list.prev, req->wb_list.next);
> +               kref_get(&req->wb_kref);
> +       }
>         kref_put(&req->wb_kref, nfs_page_group_destroy);
>  }
>  EXPORT_SYMBOL_GPL(nfs_release_request);
>
> I still have to figure out WHY this is happening when the system is
> loaded AND only with 64kb pages. (so basically the root cause...:<)
>
> What I could see is that the refcount mis-accounting seems to
> originate during the early phase of nfs_page allocation:
>
> - OK: nfs_create_request creates an nfs_page with wb_kref at +2
> straight away
>
> - OK: nfs_create_request creates an nfs_page with wb_kref at +1, and
> wb_kref is then immediately incremented to +2 by
> nfs_inode_add_request() before the request is moved across wb_list
>
> - FAIL: nfs_create_request creates an nfs_page with wb_kref at +1, and
> it stays at +1 until it starts being moved across lists.
>
> Any ideas or suggestions to triage why this condition is happening?
> (I'm not really an NFS guy...:D)
>
> Thanks
>
> Cristian
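
For anyone following along, the effect of the quoted workaround can be modelled in userspace. This is only a sketch under stated assumptions: struct fake_req is a hypothetical stand-in for struct nfs_page that keeps just a plain counter (modelling wb_kref) and a list node (modelling wb_list), and the "freed" flag marks the point where nfs_page_group_destroy would run. None of these names exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

struct list_node {
	struct list_node *next, *prev;
};

/* Hypothetical stand-in for struct nfs_page (assumption, not kernel code). */
struct fake_req {
	unsigned int refs;        /* models wb_kref */
	struct list_node wb_list; /* models wb_list */
	bool freed;               /* set where the destroy callback would run */
};

void fake_req_init(struct fake_req *r, unsigned int refs)
{
	r->refs = refs;
	r->wb_list.next = &r->wb_list;
	r->wb_list.prev = &r->wb_list;
	r->freed = false;
}

/* Link the request right after 'head', as list_add() would. */
void fake_req_link(struct fake_req *r, struct list_node *head)
{
	r->wb_list.next = head->next;
	r->wb_list.prev = head;
	head->next->prev = &r->wb_list;
	head->next = &r->wb_list;
}

/* Plain release: drop one reference, "free" on the last one. */
void fake_release(struct fake_req *r)
{
	if (--r->refs == 0)
		r->freed = true;
}

/* Release with the workaround: take an extra reference first if this
 * would be the final put while the request is still on a list. */
void fake_release_workaround(struct fake_req *r)
{
	if (r->refs == 1 &&
	    (r->wb_list.prev != &r->wb_list ||
	     r->wb_list.next != &r->wb_list))
		r->refs++;
	fake_release(r);
}
```

In this model, a request that starts at +2 survives one release, a request that starts at +1 and is still linked gets freed by a plain release (the FAIL case), and the workaround leaves that same request alive with one reference. It hides the symptom and does not explain where the missing reference went.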


