Re: [PATCH 4/5] tree-walk: unroll get_mode since loop boundaries are well-known

On Sun, Apr 3, 2011 at 12:28 AM, Dan McGee <dpmcgee@xxxxxxxxx> wrote:
> On Sat, Apr 2, 2011 at 4:28 AM, Nguyen Thai Ngoc Duy <pclouds@xxxxxxxxx> wrote:
>> On Thu, Mar 31, 2011 at 8:38 AM, Dan McGee <dpmcgee@xxxxxxxxx> wrote:
>>> We know our mode entry in our tree objects should be 5 or 6 characters
>>> long. This change both enforces this fact and also unrolls the parsing
>>> of the information giving the compiler more room for optimization of the
>>> operations.
>>
>> I'm skeptical. Did you measure significant gain after this patch? I
>> looked at asm output with -O3 and failed to see the compiler doing
>> anything fancy. Perhaps it's because I'm on x86 with a quite small
>> register set.
>
> I'm on x86_64 and was just using -O2; -O3 produces the same output
> actually. You can see it below. I had taken a look at this before I
> submitted, and noticed a few things:
> 1. We do use multiple registers now since we aren't constrained to a loop.
> 2. movzbl (for the string parts) and cmpb instructions tend to get
> clustered first.
> 3. movzbl (for the mode shifting) and leal instructions tend to get
> clustered later.
> 4. The normal case now involves no conditional jumps until the ' '
> (space) comparison.
>
> Call these "trivial", but for my worst-case operation (shown below),
> times went from 27.41 secs to 26.49 secs. Considering this function is
> called 530,588,868 times (that is not a typo) during this operation,
> every saved instruction or avoided branch misprediction does seem to
> make a difference.

If it makes it better for you, I'm good.
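
For readers following the thread, the idea under discussion can be sketched
roughly as below. This is not the patch itself: get_mode_loop() only mirrors
the general shape of a loop-based octal parser, and get_mode_unrolled() is a
hypothetical illustration that assumes a mode of exactly 5 or 6 octal digits
followed by a space. Both function names are made up for this sketch.

```c
#include <stddef.h>

/* Loop-based parse: consume octal digits up to the first space. */
static const char *get_mode_loop(const char *str, unsigned int *modep)
{
	unsigned char c;
	unsigned int mode = 0;

	if (*str == ' ')
		return NULL;
	while ((c = *str++) != ' ') {
		if (c < '0' || c > '7')
			return NULL;
		mode = (mode << 3) + (c - '0');
	}
	*modep = mode;
	return str;
}

/*
 * Hypothetical unrolled variant (an assumption, not the posted patch):
 * accept exactly 5 or 6 octal digits followed by ' ', with no loop.
 */
static const char *get_mode_unrolled(const char *str, unsigned int *modep)
{
	const unsigned char *s = (const unsigned char *)str;
	unsigned int d0, d1, d2, d3, d4, d5, mode;

	/*
	 * Unsigned wraparound makes any non-octal byte (including NUL and
	 * ' ') compare greater than 7. The || chain stops at the first bad
	 * byte, so a NUL-terminated input is never read past its end.
	 */
	if ((d0 = s[0] - '0') > 7 || (d1 = s[1] - '0') > 7 ||
	    (d2 = s[2] - '0') > 7 || (d3 = s[3] - '0') > 7 ||
	    (d4 = s[4] - '0') > 7)
		return NULL;
	mode = (d0 << 12) | (d1 << 9) | (d2 << 6) | (d3 << 3) | d4;

	if (s[5] == ' ') {		/* 5-digit mode, e.g. "40000 " */
		*modep = mode;
		return str + 6;
	}
	if ((d5 = s[5] - '0') > 7 || s[6] != ' ')
		return NULL;		/* 6-digit mode, e.g. "100644 " */
	*modep = (mode << 3) | d5;
	return str + 7;
}
```

Note that the unrolled sketch rejects a short mode such as "644 ", which the
loop version accepts; that corresponds to the "enforces this fact" part of
the patch description.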
-- 
Duy