Re: SQ overflow seen running isert traffic

"Nicholas A. Bellinger" <nab@xxxxxxxxxxxxxxx> · Sun, 30 Oct 2016 20:40:08 -0700

Hi Steve, Potnuri, & Co,

On Tue, 2016-10-18 at 09:34 -0500, Steve Wise wrote:
> > 
> > > I tried out this change and it works fine with iwarp. I don't see SQ
> > > overflow. Apparently we have made the SQ large enough that it no longer
> > > overflows. I am going to let it run with higher workloads for a longer
> > > time, to see if it holds good.
> > 
> > Actually on second thought, this patch is an overkill. Effectively we
> > now set:
> > 
> > MAX_CMD=266
> > and max_rdma_ctx=128 so together we take 394 which seems to be too much.
> > 
> > If we go by the scheme of 1 rdma + 1 send for each IO we need:
> > - 128 sends
> > - 128 rdmas
> > - 10 miscs
> > 
> > so this gives 266.
> > 
> > Perhaps this is due to the fact that iWARP needs to register memory for
> > rdma reads as well? (and also rdma writes > 128k for chelsio HW right?)
> >
> 
> iWARP definitely needs to register memory for the target of reads, due to
> REMOTE_WRITE requirement for the protocol.  The source of a write doesn't need
> to register memory, but the SGE depth can cause multiple WRITE WRs to be
> required to service the IO.  And in theory there should be some threshold where
> it might be better performance-wise to do a memory register + 1 WRITE vs X
> WRITEs.    
> 
> As you mentioned, the RW API should account for this, but perhaps it is still
> off some.  Bharat, have a look into the RDMA-RW API and let us see if we can
> figure out if the additional SQ depth it adds is sufficient.
>  
> > What is the workload you are running? with immediatedata enabled you
> > should issue reg+rdma_read+send only for writes > 8k.
> > 
> > Does this happen when you run only reads for example?
> > 
> > I guess it's time to get the sq accounting into shape...
> 
> So to sum up - 2 issues:
> 
> 1) we believe the iSER + RW API correctly sizes the SQ, yet we're seeing SQ
> overflows.  So the SQ sizing needs more investigation.
> 
> 2) if the SQ is full, then the iSER/target code is supposed to resubmit.  And
> apparently that isn't working.
> 
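
To make the numbers quoted above concrete, here is the arithmetic written
out as a small C sketch.  The constant and function names are made up for
illustration and are not the isert driver's actual definitions; only the
values (128 sends, 128 rdmas, 10 miscs, max_rdma_ctx=128) come from the
quoted mail.

```c
/* Illustrative values taken from the quoted discussion; the names are
 * hypothetical and do not match any kernel header. */
enum {
    SKETCH_SENDS     = 128, /* one send per IO */
    SKETCH_RDMAS     = 128, /* one rdma per IO */
    SKETCH_MISC      = 10,  /* miscellaneous work requests */
    SKETCH_RDMA_CTXS = 128  /* max_rdma_ctx added on top by the patch */
};

/* SQ entries needed under the "1 rdma + 1 send per IO" scheme. */
static int sketch_max_cmd(void)
{
    return SKETCH_SENDS + SKETCH_RDMAS + SKETCH_MISC;
}

/* Total SQ entries once max_rdma_ctx is added on top of MAX_CMD. */
static int sketch_sq_total(void)
{
    return sketch_max_cmd() + SKETCH_RDMA_CTXS;
}
```

sketch_max_cmd() gives the 266 from the quoted scheme, and
sketch_sq_total() gives the 394 that was called overkill.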

For #2, target-core expects -ENOMEM or -EAGAIN return from fabric driver
callbacks to signal internal queue-full retry logic.  Otherwise, the
extra se_cmd->cmd_kref response SCF_ACK_KREF is leaked until session
shutdown and/or reinstatement occurs.
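
As a rough illustration of that convention, here is a minimal userspace
sketch, not kernel code.  The struct and function names (sketch_sq,
sketch_cmd, fabric_queue_response, core_complete_cmd) are hypothetical;
the only part taken from the above is that the callback reports a full
queue with -ENOMEM or -EAGAIN so the caller can retry rather than drop
the command.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a send queue with a fixed number of slots. */
struct sketch_sq {
    int depth;   /* total work-request slots */
    int in_use;  /* slots currently posted */
};

/* Hypothetical per-command state; not the real struct se_cmd. */
struct sketch_cmd {
    bool queued_for_retry;
    bool response_sent;
};

/* Hypothetical fabric callback: returns 0 on success, or -EAGAIN when
 * the SQ has no room for wrs_needed additional work requests. */
static int fabric_queue_response(struct sketch_sq *sq, int wrs_needed)
{
    if (sq->in_use + wrs_needed > sq->depth)
        return -EAGAIN;
    sq->in_use += wrs_needed;
    return 0;
}

/* Hypothetical core-side caller: treats -EAGAIN/-ENOMEM as queue-full
 * and marks the command for a later retry; any other error is passed
 * back to the caller unchanged. */
static int core_complete_cmd(struct sketch_sq *sq, struct sketch_cmd *cmd,
                             int wrs_needed)
{
    int ret = fabric_queue_response(sq, wrs_needed);

    if (ret == -EAGAIN || ret == -ENOMEM) {
        cmd->queued_for_retry = true;
        return 0;
    }
    if (ret == 0)
        cmd->response_sent = true;
    return ret;
}
```

The point of the sketch is only the return-code contract: if the callback
swallowed the full-queue condition or returned some other value, the
caller would have no way to know a retry is needed.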

AFAICT, Potnuri's earlier hung task with v4.8.y + ABORT_TASK is likely
the earlier v4.1+ regression:

https://github.com/torvalds/linux/commit/527268df31e57cf2b6d417198717c6d6afdb1e3e

That said, there is room for improvement in target-core queue-full error
signaling, and iscsi-target/iser-target callback error propagation.  

Sending out a series shortly to address these particular items.
Please have a look.
