Re: heavy load and timeouts


On Wed, 2017-05-03 at 07:29 -0400, Michael Di Domenico wrote:
> On Tue, May 2, 2017 at 1:14 AM, Nicholas A. Bellinger
> <nab@xxxxxxxxxxxxxxx> wrote:
> >> but when i start up multiple initiators as few as 50, i start to get
> >> errors in syslog
> >>
> >> Exiting Time2Retain handler because session_reinstatement=1
> >> tcp_sendpage() failure: ....
> >>
> >> the tcp_sendpage() error has various numbers after it, the top two
> >> highest count of numbers are "-512" and "3720"
> >>
> >
> > Btw, this means that TCP connections are being reset.
> >
> > Without more information it's hard to know which side (eg: initiator or
> > target) is forcing the TCP reset.
> >
> > Can we see some logs for both target + initiator side when this
> > occurs..?
> >
> 
> Because the machines are in an enclave, copy-paste logs are going to
> be hard to come by.  Is there something specific in the logs i should
> look for?
> 

We really need some form of target + host logs for diagnosis, along with
the target configuration to give context to what you've observed.
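
If copying full logs out of the enclave is not practical, even counts of
the two messages you quoted would help.  A minimal sketch, assuming the
kernel messages land in a plain-text file; the function name and the log
path in the usage note are placeholders, not anything the target code
provides:

```shell
# Count lines containing the two messages quoted earlier in this thread.
# The log file is passed as an argument because its location varies by
# distro (that location is an assumption, not something from the thread).
count_iscsi_errors() {
    logfile=$1
    for pattern in 'tcp_sendpage() failure' 'Exiting Time2Retain handler'; do
        # -F: fixed-string match, -c: print the number of matching lines.
        # grep exits non-zero on zero matches, so ignore its status.
        count=$(grep -F -c -- "$pattern" "$logfile" || true)
        printf '%s: %s\n' "$pattern" "$count"
    done
}
```

Run it as, for example, `count_iscsi_errors /var/log/messages` on both
the target and one of the initiators, and compare the timestamps of the
matching lines by hand.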

> > One thing that comes to mind is the per target endpoint (eg: TPG
> > attribute in targetcli) is 'default_cmdsn_depth', which by default is
> > set to 64.  This controls how many I/Os can be in flight for a single
> > iscsi session.
> >
> > I'm not sure what the default is for tgtd in rhel6, but with enough
> > initiators connected you might want to consider setting this to
> > something like 16, 8, or lower.
> 
> I'm pretty sure i tried tweaking that parameter (as it was suggested
> elsewhere on the internet), but it didn't make a difference.  But i'll
> recheck my notes and re-run the tests to be sure.

Yeah, I'd start out really conservative at default_cmdsn_depth=1 and
work up from there to understand whether the number of outstanding I/Os
is what is causing TCP connections to be constantly dropped.

Btw, note that changing this TPG attribute value doesn't affect any of
the running sessions.  It only affects demo-mode sessions (eg: those
without explicit NodeACLs) that are newly created or restarted after the
change.
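
For reference, the TPG attribute can also be written through configfs
instead of targetcli.  A rough sketch, assuming the attribute is exposed
as attrib/default_cmdsn_depth under the TPG directory (please check that
the file exists on your kernel); the function name is just a placeholder:

```shell
# Write default_cmdsn_depth for one TPG and print the value read back.
# $1 is the TPG directory, e.g.
#   /sys/kernel/config/target/iscsi/$TARGETNAME/$TPGT
# The attrib/default_cmdsn_depth location is an assumption about the
# configfs layout; the function refuses to write if it is not there.
set_default_cmdsn_depth() {
    tpg_dir=$1
    depth=$2
    attr="$tpg_dir/attrib/default_cmdsn_depth"
    if [ ! -e "$attr" ]; then
        echo "no such attribute: $attr" >&2
        return 1
    fi
    echo "$depth" > "$attr"
    cat "$attr"
}
```

As noted, sessions that are already logged in keep their old depth until
they are restarted.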

If you are using explicit NodeACLs, this can be changed on the fly for
each initiator via:

  echo 1 > /sys/kernel/config/target/iscsi/$TARGETNAME/$TPGT/acls/$INITIATORNAME/cmdsn_depth
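
With 50 or so initiators, doing that by hand gets tedious.  A minimal
sketch that applies the same write to every explicit NodeACL under one
TPG; the function name is made up for illustration, and the only path it
relies on is the acls/$INITIATORNAME/cmdsn_depth one shown above:

```shell
# Set cmdsn_depth for every explicit NodeACL under a single TPG.
# $1 is the TPG directory, e.g.
#   /sys/kernel/config/target/iscsi/$TARGETNAME/$TPGT
# $2 is the depth to apply to each initiator ACL.
set_cmdsn_depth_all() {
    tpg_dir=$1
    depth=$2
    for f in "$tpg_dir"/acls/*/cmdsn_depth; do
        # If there are no ACLs the glob stays unexpanded; skip it.
        [ -e "$f" ] || continue
        echo "$depth" > "$f"
        # Print the value read back so a rejected write is visible.
        printf '%s -> %s\n' "$f" "$(cat "$f")"
    done
}
```

For example, `set_cmdsn_depth_all /sys/kernel/config/target/iscsi/$TARGETNAME/$TPGT 1`
would start every initiator at a depth of 1, and the same call with a
larger value can be repeated while working up.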
