Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 2023-01-10 at 11:58 -0800, dai.ngo@xxxxxxxxxx wrote:
>
> On 1/10/23 11:30 AM, Jeff Layton wrote:
>
> > >
> > >
> > Looking over the traces that Mike posted, I suspect this is the real
> > bug, particularly if the server is being restarted during this test.
>
> Yes, I noticed the WARN_ON_ONCE(timer->function != delayed_work_timer_fn)
> too and this seems to indicate some kind of corruption. However, I'm not
> sure if Mike's test restarts the nfs-server service. This could be a bug
> in work queue module when it's under stress.

My reproducer was to merely mount and traverse/md5sum, while that was
going on, fire up LTP's min_free_kbytes testcase (memory hog from hell)
on the server.  Systemthing may well be restarting the server service
in response to oomkill.  In fact, the struct delayed_work in question
at WARN_ON_ONCE() time didn't look the least bit ready for business.

FWIW, I had noticed the missing cancel while eyeballing, and stuck one
next to the existing one as a hail-mary, but that helped not at all.

crash> delayed_work ffff8881601fab48
struct delayed_work {
  work = {
    data = {
      counter = 1
    },
    entry = {
      next = 0x0,
      prev = 0x0
    },
    func = 0x0
  },
  timer = {
    entry = {
      next = 0x0,
      pprev = 0x0
    },
    expires = 0,
    function = 0x0,
    flags = 0
  },
  wq = 0x0,
  cpu = 0
}




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux