Re: RCU stalls and GPFs in ceph/netfs

On Sun, Jul 28, 2024 at 1:45 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> That is really weird. AFAICT, 2e9d7e4b984a61 is just removing some
> wrapper functions and changing the names of some others. There should
> be no functional changes there.

Exactly what I thought; I could not imagine how this commit could
cause such a bug. The only candidate I found is that
netfs_rreq_assess() now always calls netfs_rreq_completed() directly,
and not netfs_rreq_write_to_cache(). I don't know what that implies,
but this different code path could be doing something differently.
Maybe it's an old bug that only got revealed by this change.

Anyway, I spent hours testing this commit and the preceding one, and
the picture was consistent: this commit reproduces the RCU stall
within minutes (though only on about 50% of all boots), and the
previous commit never did. There is still a tiny chance that I just
wasn't trying hard enough. I'm out of ideas, and all I can do now is
start digging really deeply into this code, but I thought it would be
more productive to reach out to the people who wrote it.

Max




