RE: Crash in nvmet_req_init() - null req->rsp pointer

"Steve Wise" <swise@xxxxxxxxxxxxxxxxxxxxx> · Fri, 31 Aug 2018 08:01:08 -0500

> 
> 
> >> Hey Sagi, it hits the empty rsp list path often with your debug patch.
> >> I added code to BUG_ON() after 10 times and I have a crash dump I'm
> >> looking at.
> >>
> >> Isn't the rsp list supposed to be sized such that it will never be empty
> >> when a new rsp is needed?  I wonder if there is a leak.
> 
> Doesn't look like it from my scan..
> 
> > I do see that during this heavy load, the rdma send queue "full"
> > condition gets hit often:
> >
> > static bool nvmet_rdma_execute_command(struct nvmet_rdma_rsp *rsp)
> > {
> >          struct nvmet_rdma_queue *queue = rsp->queue;
> >
> >          if (unlikely(atomic_sub_return(1 + rsp->n_rdma,
> >                          &queue->sq_wr_avail) < 0)) {
> >                  pr_debug("IB send queue full (needed %d): queue %u cntlid %u\n",
> >                                  1 + rsp->n_rdma, queue->idx,
> >                                  queue->nvme_sq.ctrl->cntlid);
> >                  atomic_add(1 + rsp->n_rdma, &queue->sq_wr_avail);
> >                  return false;
> >          }
> >
> > ...
> >
> > So commands are getting added to the wr_wait list:
> >
> > static void nvmet_rdma_handle_command(struct nvmet_rdma_queue *queue,
> >                  struct nvmet_rdma_rsp *cmd)
> > {
> > ...
> >          if (unlikely(!nvmet_rdma_execute_command(cmd))) {
> >                  spin_lock(&queue->rsp_wr_wait_lock);
> >                  list_add_tail(&cmd->wait_list, &queue->rsp_wr_wait_list);
> >                  spin_unlock(&queue->rsp_wr_wait_lock);
> >          }
> > ...
> >
> >
> > Perhaps there's some bug in the wr_wait_list processing of deferred
> > commands?  I don't see anything though.
> 
> I assume this could happen if, under heavy load, the device send
> completions arrive more slowly than the incoming commands
> (perhaps device and/or sw).
> 
> Because we post the recv before sending the response back, there is
> a window where the host can send us a new command before the send
> completion arrives; this is why we allocate more.
> 
> However, I think that nothing prevents the gap from growing under
> heavy load until we exhaust the 2x rsps.
>
> So perhaps this is something we actually need to account for...

Thanks for the explanation.  Yes, I believe we do.  Will you post the formal patch?  If it is the same as the one I already confirmed, you can add my Tested-by tag.
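
To check that I follow the reasoning, here is a rough user-space sketch of that gap. It is not the nvmet-rdma code. It assumes each command takes one rsp from a free list sized at 2x the queue depth, that the recv is reposted right away, and that the rsp only returns to the list when its send completion arrives. All names (struct model, handle_command, send_done, simulate) are made up for illustration, and the per-round arrival and completion rates are arbitrary.

```c
#include <assert.h>

/* Hypothetical model; none of these names exist in nvmet-rdma. */
struct model {
	int free_rsps;      /* rsps currently on the free list */
	int inflight_sends; /* rsps held until their send completion arrives */
	int empty_hits;     /* times a command found the free list empty */
};

/* One incoming command: take an rsp; the recv is assumed reposted at once. */
static void handle_command(struct model *m)
{
	if (m->free_rsps == 0) {
		m->empty_hits++;        /* the "empty rsp list" path */
		return;
	}
	m->free_rsps--;
	m->inflight_sends++;
}

/* One send completion: the rsp goes back on the free list. */
static void send_done(struct model *m)
{
	if (m->inflight_sends > 0) {
		m->inflight_sends--;
		m->free_rsps++;
	}
}

/*
 * Run `rounds` rounds; each round has `cmds` command arrivals followed by
 * `comps` send completions. Returns how often the free list was empty.
 */
static int simulate(int queue_depth, int rounds, int cmds, int comps)
{
	struct model m = { 0 };

	assert(queue_depth > 0);
	m.free_rsps = 2 * queue_depth;  /* the 2x allocation discussed above */

	for (int r = 0; r < rounds; r++) {
		for (int i = 0; i < cmds; i++)
			handle_command(&m);
		for (int i = 0; i < comps; i++)
			send_done(&m);
	}
	return m.empty_hits;
}
```

In this model, simulate(4, 100, 2, 2) never hits the empty list, while simulate(4, 100, 2, 1) drains the 2x pool after a few rounds and then hits it on every round. That is only the arithmetic of the window, of course; a real host is also bounded by the recvs we have posted.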

Thanks,

Steve.