Re: [Open-FCoE] System crashes with increased drive count


On Tue, 2014-06-10 at 19:40 -0700, Jun Wu wrote:
> On Tue, Jun 10, 2014 at 3:38 PM, Vasu Dev <vasu.dev@xxxxxxxxxxxxxxx> wrote:
> > On Tue, 2014-06-10 at 09:46 -0700, Jun Wu wrote:
> >> This is a Supermicro chassis with redundant power supplies. We see the
> >> same failures with either SSDs or HDDs.
> >> The same tests pass with non-fcoe protocol, i.e. iSCSI or AoE.
> >>
> >
> > Are the iSCSI and AoE tests run with the same TCM core kernel and the
> > same target and host NICs/switch?
> 
> We tested AoE with the same hardware/switch and test setup. AoE works,
> except that it is not an enterprise protocol and it doesn't provide
> performance. It doesn't use TCM.
> 

You had FCoE working with a lower queue depth, and that should yield
lower performance, as with AoE. Besides, AoE is not using TCM, so it is
not a fair comparison. What about iSCSI, is that using TCM?

> >
> > What NICs are in your chassis? As I mentioned before, "DCB and PFC PAUSE
> > typically used and required by fcoe", but you are using PAUSE, and the
> > switch cannot be eliminated, as you mentioned before. These could affect
> > FCoE more than other protocols, so can you ensure the IO errors are not
> > due to frame losses without DCB/PFC in your setup?
> 
> The NIC is:
> [root@poc1 log]# lspci | grep 82599
> 08:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit
> SFI/SFP+ Network Connection (rev 01)
> 
> The issue should not be caused by frame losses. The systems work fine
> with other protocols.

FCoE is less tolerant of packet loss and latency variation than the
others, because it aims for FC-like deterministic fabric performance.
No-drop Ethernet is therefore a must for FCoE, unlike the other
protocols in this comparison. For instance, iSCSI would adapt its tx
window to frame losses, but FCoE has no such mechanism. So you cannot
conclude that there are no frame losses just because the other
protocols work in this setup; iSCSI and AoE should work fine without
no-drop Ethernet, PAUSE or PFC. Can you confirm there are no frame
losses using "ethtool -S ethX"?
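
One way to run that check is to scan the "ethtool -S ethX" output for
non-zero counters whose names suggest drops or errors. The following is
only a minimal sketch: the keyword list is an assumption, not a
driver-specific set, and counter names vary by driver, so adjust it to
what your NIC actually reports.

```python
import subprocess

# Substrings that usually indicate loss or errors. This list is an
# assumption for illustration, not an exhaustive or ixgbe-specific set.
SUSPECT_KEYWORDS = ("drop", "miss", "error", "discard", "no_buffer")


def parse_ethtool_stats(text):
    """Parse `ethtool -S` style output into a {counter_name: int} dict."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.rpartition(":")
        if not sep:
            continue
        try:
            stats[name.strip()] = int(value.strip())
        except ValueError:
            continue  # e.g. the "NIC statistics:" header line
    return stats


def suspect_counters(stats):
    """Return the non-zero counters whose names hint at loss or errors."""
    return {
        name: value
        for name, value in stats.items()
        if value != 0 and any(kw in name.lower() for kw in SUSPECT_KEYWORDS)
    }


def read_ethtool_stats(iface):
    """Run `ethtool -S <iface>` and return its raw output."""
    result = subprocess.run(
        ["ethtool", "-S", iface], capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    import sys

    counters = suspect_counters(parse_ethtool_stats(read_ethtool_stats(sys.argv[1])))
    for name, value in sorted(counters.items()):
        print(f"{name}: {value}")
```

Running it before and after a test pass and comparing the two outputs
shows which counters moved while the fcoe traffic was on.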

> >
> > There are possibly abort issues at the target with zero timeout values,
> > but you could avoid them completely by increasing the scsi timeout and
> > disabling REC, as discussed before.
> >

Now that I know ixgbe (82599) is in use, try a few more things in
addition to the suggestions above:

1) Disable the irq balancer.
2) Find your ethX interrupts with "cat /proc/interrupts | grep ethX".
Identifying the fcoe ones among them is tricky: they may be labeled with
fcoe, but if not, identify them by their interrupt activity while fcoe
traffic is on. A total of 8 fcoe interrupts are used; pin them across
the first eight CPUs used in your workloads.
3) Increase the ring size from the default 512 to 2K or 4K. This is just
a hunch, in case frames are dropped due to longer PAUSE or congestion in
your setup.
4) Also monitor the ethX stats, besides the fcoe hostX stats at
"/sys/class/fc_host/hostX/statistics/", for anything that stands out as
odd.
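
Step 2 above can be sketched roughly as follows. This is an
illustration under assumptions, not a tested procedure for your
systems: the function names are made up for this example, the interrupt
names in /proc/interrupts depend on the driver, and the simple hex mask
only covers CPUs 0-31. Writing smp_affinity needs root, and a running
irq balancer may overwrite the values, hence step 1.

```python
def irqs_for_interface(interrupts_text, iface):
    """Return [(irq_number, name)] for /proc/interrupts rows naming iface."""
    matches = []
    for line in interrupts_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].endswith(":"):
            continue  # header row or blank line
        irq = fields[0][:-1]
        # Skip non-numeric rows such as "NMI:"; match on the last column.
        if irq.isdigit() and iface in fields[-1]:
            matches.append((int(irq), fields[-1]))
    return matches


def cpu_mask(cpu):
    """Hex bitmask selecting one CPU, as written to smp_affinity."""
    return format(1 << cpu, "x")


def plan_affinity(irq_numbers, cpus=range(8)):
    """Spread the IRQs round-robin over cpus (default: the first eight)."""
    cpus = list(cpus)
    return {
        irq: cpu_mask(cpus[i % len(cpus)])
        for i, irq in enumerate(irq_numbers)
    }


def apply_affinity(plan, proc_root="/proc"):
    """Write each mask to <proc_root>/irq/<N>/smp_affinity (needs root)."""
    for irq, mask in plan.items():
        with open(f"{proc_root}/irq/{irq}/smp_affinity", "w") as f:
            f.write(mask)


if __name__ == "__main__":
    import sys

    iface = sys.argv[1]
    with open("/proc/interrupts") as f:
        found = irqs_for_interface(f.read(), iface)
    # Print the plan only; call apply_affinity() once it looks right.
    for irq, mask in plan_affinity([irq for irq, _ in found]).items():
        print(f"irq {irq} -> smp_affinity {mask}")
```

If the fcoe interrupts are not labeled, filter the list returned by
irqs_for_interface() down to the ones whose counts grow under fcoe
traffic before building the plan.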


<snip>
> 
> Is the following cmd_per_lun fcoe related? Its default value is 3. And
> it doesn't allow me to change.
> /sys/devices/pci0000:00/0000:00:05.0/0000:08:00.0/net/p4p1/ctlr_2/host9/scsi_host/host9/cmd_per_lun

I think this doesn't matter, since the device queue depth is later set
to 32 and that value can be adjusted. That is, cmd_per_lun is used at
scsi host alloc as the initial queue depth; later the scsi device queue
depth is adjusted to 32 through the slave_alloc callback, and that can
be changed at /sys/block/sdX/device/queue_depth as you did before,
whereas cmd_per_lun cannot.
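
For the adjustable part, a minimal sketch of reading and setting the
per-device queue depth through sysfs is below. The function names and
the sys_block parameter are additions for illustration (the parameter
lets you try it against a scratch directory); writing the real file
needs root, and the kernel may clamp the value you write.

```python
from pathlib import Path


def queue_depth_path(device, sys_block="/sys/block"):
    """Path of the queue_depth attribute for a block device such as 'sdb'."""
    return Path(sys_block) / device / "device" / "queue_depth"


def get_queue_depth(device, sys_block="/sys/block"):
    """Read the current queue depth of the device."""
    return int(queue_depth_path(device, sys_block).read_text().strip())


def set_queue_depth(device, depth, sys_block="/sys/block"):
    """Write a new queue depth and return the value read back afterwards."""
    queue_depth_path(device, sys_block).write_text(f"{depth}\n")
    return get_queue_depth(device, sys_block)


if __name__ == "__main__":
    import sys

    dev, depth = sys.argv[1], int(sys.argv[2])
    print(f"{dev}: queue_depth now {set_queue_depth(dev, depth)}")
```

Reading the value back after the write shows what the device actually
accepted.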

//Vasu
 




