Re: [PATCH] xHCI: Remove duplicate functions

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Oct 28, 2011 at 11:11:10AM -0400, Alan Stern wrote:
> On Fri, 28 Oct 2011, Sarah Sharp wrote:
> 
> > On Fri, Oct 28, 2011 at 05:05:58PM +0800, Andiry Xu wrote:
> > > There're two static xhci_urb_to_transfer_ring() implementations in xhci.c
> > > and xhci-ring.c, and they do exactly the same things.
> > > 
> > > Remove one of them to get rid of duplicate codes.
> > 
> > NAK.  There are the same functions in two separate files for performance
> > reasons.  This function is used very often, and allowing it to be static
> > lets the compiler optimize it.
> 
> Optimize it how?  The only optimization for static functions that I'm 
> aware of is to put them inline.  But this function is sufficiently long 
> and complicated that putting in inline would yield worse object code, 
> not better.
> 
> Have you measured the performance difference?

>From the original commit that created the copies:

commit 021bff9179c2d19c26599dc3e9134d04cf1c8a3a
Author: Sarah Sharp <sarah.a.sharp@xxxxxxxxxxxxxxx>
Date:   Thu Jul 29 22:12:20 2010 -0700

    USB: xhci: Performance - move functions that find ep ring.
    
    I've been using perf to measure the top symbols while transferring 1GB of data
    on a USB 3.0 drive with dd.  This is using the raw disk with /dev/sdb, with a
    block size of 1K.
    
    During performance testing, the top symbol was xhci_triad_to_transfer_ring(), a
    function that should return immediately if streams are not enabled for an
    endpoint.  It turned out that the functions to find the endpoint ring was
    defined in xhci-mem.c and used in xhci-ring.c and xhci-hcd.c.  I moved a copy of
    xhci_triad_to_transfer_ring() and xhci_urb_to_transfer_ring() into xhci-ring.c
    and declared them static.  I also made a static version of
    xhci_urb_to_transfer_ring() in xhci.c.
    
    This improved throughput on a 1GB read of the raw disk with dd from
    186MB/s to 195MB/s, and perf reported sampling the xhci_triad_to_transfer_ring()
    0.06% of the time, rather than 9.26% of the time.
    
    Signed-off-by: Sarah Sharp <sarah.a.sharp@xxxxxxxxxxxxxxx>
    Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxx>


It's possible that the compiler is inlining the function automatically
or doing some other sort of optimization that wasn't possible when the
function was defined in a different file.

Sarah Sharp
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux