Ping again. Hi Josef, could you take a look? On Fri, Jan 21, 2022 at 4:34 PM Yongji Xie <xieyongji@xxxxxxxxxxxxx> wrote: > > Ping. > > On Wed, Jan 5, 2022 at 1:36 PM Yongji Xie <xieyongji@xxxxxxxxxxxxx> wrote: > > > > On Wed, Jan 5, 2022 at 2:06 AM Josef Bacik <josef@xxxxxxxxxxxxxx> wrote: > > > > > > On Tue, Jan 04, 2022 at 01:31:47PM +0800, Yongji Xie wrote: > > > > On Tue, Jan 4, 2022 at 12:10 AM Josef Bacik <josef@xxxxxxxxxxxxxx> wrote: > > > > > > > > > > On Thu, Dec 30, 2021 at 12:01:23PM +0800, Yongji Xie wrote: > > > > > > On Thu, Dec 30, 2021 at 1:35 AM Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote: > > > > > > > > > > > > > > On Mon, Dec 27, 2021 at 05:12:41PM +0800, Xie Yongji wrote: > > > > > > > > The rescuer thread might take over the works queued on > > > > > > > > the workqueue when the worker thread creation timed out. > > > > > > > > If this happens, we have no chance to create multiple > > > > > > > > recv threads which causes I/O hung on this nbd device. > > > > > > > > > > > > > > If a workqueue is used there aren't really 'receive threads'. > > > > > > > What is the deadlock here? > > > > > > > > > > > > We might have multiple recv works, and those recv works won't quit > > > > > > unless the socket is closed. If the rescuer thread takes over those > > > > > > works, only the first recv work can run. The I/O needed to be handled > > > > > > in other recv works would be hung since no thread can handle them. > > > > > > > > > > > > > > > > I'm not following this explanation. What is the rescuer thread you're talking > > > > > > > > https://www.kernel.org/doc/html/latest/core-api/workqueue.html#c.rescuer_thread > > > > > > > > > > Ahhh ok now I see, thanks, I didn't know this is how this worked. > > > > > > So what happens is we do the queue_work(), this needs to do a GFP_KERNEL > > > allocation internally, we are unable to satisfy this, and thus the work gets > > > pushed onto the rescuer thread. > > > > > > Then the rescuer thread can't be used in the future because it's doing this long > > > running thing. > > > > > > > Yes. > > > > > I think the correct thing to do here is simply drop the WQ_MEM_RECLAIM bit. It > > > makes sense for workqueue's that are handling the work of short lived works that > > > are in the memory reclaim path. That's not what these workers are doing, yes > > > they are in the reclaim path, but they run the entire time the device is up. > > > The actual work happens as they process incoming requests. AFAICT > > > WQ_MEM_RECLAIM doesn't affect the actual allocations that the worker thread > > > needs to do, which is what I think the intention was in using WQ_MEM_RECLAIM, > > > which isn't really what it's used for. > > > > > > tl;dr, just remove thee WQ_MEM_RECLAIM flag completely and I think that's good > > > enough? Thanks, > > > > > > > In the reconnect case, we still need to call queue_work() while the > > device is running. So it looks like we can't simply remove the > > WQ_MEM_RECLAIM flag. > > > > Thanks, > > Yongji