Re: [PATCH] bdi: Fix another oops in wb_workfn()

Tejun Heo <tj@xxxxxxxxxx> · Tue, 29 May 2018 06:46:14 -0700

On Sun, May 27, 2018 at 01:43:45PM +0900, Tetsuo Handa wrote:
> Tejun Heo wrote:
> > On Sun, May 27, 2018 at 11:21:25AM +0900, Tetsuo Handa wrote:
> > > syzbot is still hitting NULL pointer dereference at wb_workfn() [1].
> > > This might be because we overlooked that delayed_work_timer_fn() does not
> > > check WB_registered before calling __queue_work() while mod_delayed_work()
> > > does not wait for already started delayed_work_timer_fn() because it uses
> > > del_timer() rather than del_timer_sync().
> > 
> > It shouldn't be that as dwork timer is an irq safe timer.  Even if
> > that's the case, the right thing to do would be fixing workqueue
> > rather than reaching into workqueue internals from backing-dev code.
> > 
> 
> Do you think there is a possibility that __queue_work() is executed almost
> concurrently from two CPUs, one from mod_delayed_work(bdi_wq, &wb->dwork, 0) in
> the wb_shutdown() path (which is called without spin_lock_bh(&wb->work_lock)) and
> the other from the delayed_work_timer_fn() path (which is called without checking
> the WB_registered bit under spin_lock_bh(&wb->work_lock))?

__queue_work() is gated by WORK_STRUCT_PENDING_BIT, so I don't see how
multiple instances would execute concurrently for the same work item.
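
To illustrate the gating, here is a minimal userspace sketch using C11
atomics. It is not the workqueue code: every model_* name is hypothetical,
and bit 0 of the data word only plays the role of WORK_STRUCT_PENDING_BIT.
The assumption it models is that a caller reaches the queueing step only
if it is the one that atomically set the pending bit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace model only: these names are hypothetical, not kernel API. */
#define MODEL_PENDING (1UL << 0)

struct model_work {
    atomic_ulong data;       /* bit 0 stands in for WORK_STRUCT_PENDING_BIT */
    atomic_int queue_calls;  /* how many times the queueing step ran */
};

/* Stand-in for __queue_work(): only the owner of PENDING may get here. */
static void model_queue_work(struct model_work *w)
{
    assert(atomic_load(&w->data) & MODEL_PENDING);
    atomic_fetch_add(&w->queue_calls, 1);
}

/* Atomically claim PENDING; queue only if this caller set the bit. */
static bool model_try_queue(struct model_work *w)
{
    unsigned long old = atomic_fetch_or(&w->data, MODEL_PENDING);

    if (old & MODEL_PENDING)
        return false;        /* another caller already owns the pending state */
    model_queue_work(w);
    return true;
}

/* Stand-in for the worker clearing PENDING before running the item. */
static void model_start_execution(struct model_work *w)
{
    atomic_fetch_and(&w->data, ~MODEL_PENDING);
}
```

In this model, a second model_try_queue() on the same item returns false
until model_start_execution() has cleared the bit, so the queueing step
cannot run twice for one pending period.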

Thanks.

-- 
tejun