Re: lio target iscsi multiple core performance

On Thu, 2013-10-03 at 16:14 -0500, Xianghua Xiao wrote:
> On Thu, Oct 3, 2013 at 2:18 PM, Nicholas A. Bellinger
> <nab@xxxxxxxxxxxxxxx> wrote:
> > On Thu, 2013-10-03 at 09:16 -0500, Xianghua Xiao wrote:
> >> The IRQs are balanced across all cores (cat /proc/interrupts), and the
> >> option is turned on via menuconfig.  Still, the performance is the same.
> >>
> >
> > Please don't top-post.  It makes it annoying to respond to what's
> > already been said in the thread.
> >
> > FYI, there is no kernel option to balance IRQs automatically across
> > CPUs, it's done via userspace using irqbalanced, or via explicit
> > settings in /proc/irq/$IRQ/smp_affinity_list.
> I checked /proc/interrupts and verified all cores are getting interrupts.
> also CONFIG_IRQ_ALL_CPUS=y
> >

OK, so CONFIG_IRQ_ALL_CPUS=y means you're running on PPC then.

FYI, I see the following bugfix for this logic that does not appear to
be included in the v3.8.x code:

powerpc/mpic: Fix irq distribution problem when MPIC_SINGLE_DEST_CPU
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e242114afff0a41550e174cd787cdbafd34625de

Not sure if this applies to your setup or not..

Looking at the code in arch/powerpc/sysdev/mpic.c:mpic_setup_this_cpu():

        /* let the mpic know we want intrs. default affinity is 0xffffffff
         * until changed via /proc. That's how it's done on x86. If we want
         * it differently, then we should make sure we also change the default
         * values of irq_desc[].affinity in irq.c.
         */
        if (distribute_irqs && !(mpic->flags & MPIC_SINGLE_DEST_CPU)) {
                for (i = 0; i < mpic->num_sources ; i++)
                        mpic_irq_write(i, MPIC_INFO(IRQ_DESTINATION),
                                mpic_irq_read(i, MPIC_INFO(IRQ_DESTINATION)) | msk);
        }

seems to indicate that the default affinity gets set to 0xffffffff (all
CPUs), which is also not what you want for best results.

You should consider explicitly setting the IRQ affinity of your NIC +
storage HBAs to individual CPUs, instead of letting hardware interrupts
bounce randomly across all CPUs on the system.

If you give me the /proc/interrupts output, I'll happily give an example
of how this should look.
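
In the meantime, here's a minimal userspace sketch of what the explicit
pinning looks like.  The IRQ numbers (44 for a NIC queue, 45 for a
storage HBA) and the CPU assignments are made up for illustration; take
the real numbers from your /proc/interrupts.  It's just the programmatic
equivalent of echo'ing a CPU list into /proc/irq/$IRQ/smp_affinity_list:

        /*
         * Sketch only: pin two hypothetical IRQs to dedicated CPUs by
         * writing CPU lists into /proc/irq/<irq>/smp_affinity_list.
         * Shell equivalent:  echo 1 > /proc/irq/44/smp_affinity_list
         */
        #include <stdio.h>

        static int pin_irq(int irq, const char *cpu_list)
        {
                char path[64];
                FILE *f;

                snprintf(path, sizeof(path),
                         "/proc/irq/%d/smp_affinity_list", irq);
                f = fopen(path, "w");
                if (!f) {
                        perror(path);
                        return -1;
                }
                fprintf(f, "%s\n", cpu_list);
                return fclose(f);  /* procfs may reject the write here */
        }

        int main(void)
        {
                int ret = 0;

                ret |= pin_irq(44, "1");  /* hypothetical NIC queue -> CPU1 */
                ret |= pin_irq(45, "2");  /* hypothetical HBA       -> CPU2 */
                return ret ? 1 : 0;
        }

Run it as root, and verify the result afterwards with
cat /proc/irq/44/smp_affinity_list.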

> > So, I'd still like to see your /proc/interrupts output in order to
> > determine the distribution.
> >
> > Some top and perf top output would be useful as well to see what
> > processes and functions are running.
> >

Again, please provide the /proc/interrupts and top output, and
preferably perf top output as well.

This output is very helpful for getting an idea of what's actually
going on.

> >> The emulate_write_cache=1 did not help performance either.
> >>
> >> How does LIO/iscsi handle multi-thread on multi-core system?
> >>
> >
> > As explained below:
> >
> > So target_core_mod uses a bounded workqueue for its I/O completion,
> > which means that process context is provided on the same CPU on which
> > the hardware interrupt was generated, in order to benefit from cache
> > locality effects.  If all of the hardware interrupts for the entire
> > system are firing only on CPU0, then only kworker/0 is used to provide
> > process context for queuing the response to the fabric drivers.
> >
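To make the mechanics concrete, here's a simplified sketch of the
bounded-workqueue completion pattern.  It's illustrative only (names
like comp_wq and my_cmd are made up), not the actual target_core_mod
code:

        #include <linux/init.h>
        #include <linux/interrupt.h>
        #include <linux/workqueue.h>

        static struct workqueue_struct *comp_wq;

        struct my_cmd {
                struct work_struct work;
                /* ... command state ... */
        };

        static void comp_work_fn(struct work_struct *work)
        {
                /* container_of(work, struct my_cmd, work) recovers the
                 * command.  This runs in process context on the same
                 * CPU that queued it, sharing cache with the IRQ
                 * handler below. */
        }

        static irqreturn_t hba_irq_handler(int irq, void *dev_id)
        {
                struct my_cmd *cmd = dev_id;

                /* On a bound (non-WQ_UNBOUND) workqueue, queue_work()
                 * picks the worker pool of the current CPU.  If every
                 * HBA interrupt fires on CPU0, all completions run on
                 * kworker/0 while the other cores sit idle. */
                INIT_WORK(&cmd->work, comp_work_fn);
                queue_work(comp_wq, &cmd->work);
                return IRQ_HANDLED;
        }

        static int __init comp_init(void)
        {
                /* bound per-CPU workqueue: note no WQ_UNBOUND flag */
                comp_wq = alloc_workqueue("comp_wq", WQ_MEM_RECLAIM, 0);
                return comp_wq ? 0 : -ENOMEM;
        }

This bound behavior is also why explicitly pinning the NIC + HBA
interrupts to separate CPUs, as above, spreads the completion work
across cores.
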
> > Also, I can confirm with v3.11 code that iscsi-target runs at dual
> > port ixgbe line rate (~20 Gb/sec) with large block reads/writes to
> > PCIe flash and to ramdisk_mcp backends.
> >
> > So that said, I'll need more information about your setup to determine
> > what's going on.
> 
> For iSCSI, all CPUs are equally busy (verified by 'top'), and all cores
> are getting the same number of interrupts.

I'm confused now.  You said earlier that on large block READs, CPU0
was at 100% CPU usage, right..?  What has changed..?

Please send along the requested information so we can see what's going
on, instead of making me guess over and over again.

> 
> Sigh, I have to stick with the 3.8.x kernel for now; this is a non-x86
> box, so it's hard to upgrade the kernel due to various dependencies.
> 

There is nothing I'm aware of between the v3.8.x and v3.11.x code that
would affect large block performance.

--nab
