Re: kernel 6.10

On 7/28/2024 1:33 AM, Dan Aloni wrote:
On 2024-07-28 02:57:42, Hristo Venev wrote:
On Sun, 2024-07-28 at 02:34 +0200, Hristo Venev wrote:
On Sun, 2024-07-21 at 16:40 +0000, Trond Myklebust wrote:
On Sun, 2024-07-21 at 14:03 +0300, Dan Aloni wrote:
On 2024-07-16 16:09:54, Trond Myklebust wrote:
[..]
	gdb -batch -quiet -ex 'list *(nfs_folio_find_private_request+0x3c)' -ex quit nfs.ko


I suspect this will show that the problem is occurring inside the
function folio_get_private(), but I'd like to be sure that is the
case.
I would suspect that `->private_data` gets corrupted somehow. Maybe
the folio_test_private() call needs to be protected by either the
&mapping->i_private_lock, or folio lock?

If the problem is indeed happening in "folio_get_private()", then the
dereferenced address value of 00000000000003a6 would seem to indicate
that the pointer value of 'folio' itself is screwed up, doesn't it?
The NULL dereference appears to be at the
`WARN_ON_ONCE(req->wb_head != req);` check.

On my kernel the offset inside `nfs_folio_find_private_request` is
+0x3f, but the address is again 0x3a6, meaning that `req` is for some
reason set to 0x356 (the crash is on `cmp %rbp,0x50(%rbp)`).
... and 0x356 happens to be NETFS_FOLIO_COPY_TO_CACHE. Maybe the
NETFS_RREQ_USE_PGPRIV2 flag is lost somehow?
Seems NETFS_FOLIO_COPY_TO_CACHE relates to fscache use, you are
activating that, right?
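
The address arithmetic quoted above can be checked with a small sketch. It
assumes only what the thread states: the faulting instruction is
`cmp %rbp,0x50(%rbp)` and the reported address is 0x3a6. The helper name
`base_from_fault` is made up for illustration and is not a kernel function.

```c
#include <stdint.h>

/*
 * Illustrative only: recover the base register value from a faulting
 * address and the displacement encoded in the instruction.
 * For `cmp %rbp,0x50(%rbp)` the CPU accesses %rbp + 0x50, so
 * %rbp = fault address - 0x50.
 */
static inline uintptr_t base_from_fault(uintptr_t fault_addr, uintptr_t disp)
{
	return fault_addr - disp;
}
```

With a fault address of 0x3a6 and a displacement of 0x50 this gives 0x356,
the value of `req` mentioned above.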

Also in addition to my suggestion earlier, I think perhaps we need to
use `folio_attach_private` and `folio_detach_private` instead of
directly using `folio_set_private`, for which the NFS client seems to be
the only direct user.
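
To make the difference concrete, here is a rough userspace model of what the
attach/detach pair does, as far as I understand the helpers in
include/linux/pagemap.h: the private pointer, the "has private" flag and a
folio reference are updated together. `struct model_folio` and the `model_*`
functions are hypothetical stand-ins, not the kernel types or functions, and
the model ignores locking entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct folio; not the kernel definition. */
struct model_folio {
	int refcount;      /* models the folio reference count */
	bool has_private;  /* models the PG_private flag */
	void *private;     /* models folio->private */
};

/* Take a reference, store the pointer, then set the flag. */
static void model_attach_private(struct model_folio *f, void *data)
{
	f->refcount++;
	f->private = data;
	f->has_private = true;
}

/* Undo the attach; return the old pointer, or NULL if nothing was attached. */
static void *model_detach_private(struct model_folio *f)
{
	void *data = f->private;

	if (!f->has_private)
		return NULL;
	f->has_private = false;
	f->private = NULL;
	f->refcount--;
	return data;
}
```

Setting only the flag, as a bare `folio_set_private` does, leaves the pointer
and the reference count to be handled separately by the caller.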
On my side, yes, fscache is used.



