Re: [PATCH v3] MIPS: R12000: Enable branch prediction global history

On 06/03/2015 04:21, Ralf Baechle wrote:
> On Tue, Jun 02, 2015 at 06:21:33PM -0400, Joshua Kinard wrote:
> 
>> From: Joshua Kinard <kumba@xxxxxxxxxx>
>>
>> The R12000 added a new feature to enhance branch prediction called
>> "global history".  Per the Vr10000 Series User Manual (U10278EJ4V0UM),
>> Coprocessor 0, Diagnostic Register (22):
>>
>> """
>> If bit 26 is set, branch prediction uses all eight bits of the global
>> history register.  If bit 26 is not set, then bits 25:23 specify a count
>> of the number of bits of global history to be used. Thus if bits 26:23
>> are all zero, global history is disabled.
>>
>> The global history contains a record of the taken/not-taken status of
>> recently executed branches, and when used is XOR'ed with the PC of a
>> branch being predicted to produce a hashed value for indexing the BPT.
>> Some programs with small "working set of conditional branches" benefit
>> significantly from the use of such hashing, some see slight performance
>> degradation.
>> """
>>
>> This patch enables global history on R12000 CPUs and up by setting bit
>> 26 in the branch prediction diagnostic register (CP0 $22) to '1'.  Bits
>> 25:23 are left alone so that all eight bits of the global history
>> register are available for branch prediction.
> 
> Will apply but could you also submit a patch to set cpu_has_bp_ghist to
> 0/1 as applicable in all cpu-feature-overrides.h?

I can, though at that point the R10000 Kconfig item needs to be split to
differentiate between R10000 and R12000/R14000/R16000.  I sent in a patch to do
that a few weeks ago, but it was rejected.  Can you outline your specific
issues with it so I can re-submit it?  Then the 'cpu_has_bp_ghist' define can
be '0' for R10000s and '1' for R12K-R16K.

That'll also set things up for the potential discovery of bits specific to
R14K/R16K that may be useful, but aren't known/understood yet.


> Also the manual suggests this CPU feature may not always be beneficial
> for performance so I'm wondering if we should add a way to modify it
> at runtime.

I thought about this, too.  It'd also allow for R12K+ options to control the
Disable Branch Target Address Cache (BTAC, Bit 27) and the Disable Branch
Return Cache (Bit 22).  For global history, I just set Bit 26 so all of the
ghistory bits are available, but even this could become a Kconfig item to
control Bits 25:23.  Would probably require some benchmarking to see what the
effects of this are, but the entry in the manual suggests that the benefits
outweigh the penalties in the end.
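
As a rough illustration of what such a runtime or Kconfig knob would have to
compute, here is a minimal sketch in plain C.  It only manipulates a 64-bit
value; it does not touch CP0.  The bit positions are the ones quoted in this
thread (27, 26, 25:23, 22), but the macro and function names are made up for
the example and are not existing kernel symbols.

```c
#include <stdint.h>

/* Bit positions are the ones quoted in this thread; the macro and
 * function names are hypothetical, not existing kernel symbols. */
#define R12K_DIAG_BTAC_DIS    (UINT64_C(1) << 27)  /* disable BTAC */
#define R12K_DIAG_GHIST_ALL   (UINT64_C(1) << 26)  /* use all 8 history bits */
#define R12K_DIAG_GHIST_SHIFT 23
#define R12K_DIAG_GHIST_MASK  (UINT64_C(7) << R12K_DIAG_GHIST_SHIFT)
#define R12K_DIAG_BRC_DIS     (UINT64_C(1) << 22)  /* disable BRC */

/* Number of global history bits selected by a diag value (0 = disabled). */
unsigned r12k_diag_ghist_bits(uint64_t diag)
{
	if (diag & R12K_DIAG_GHIST_ALL)
		return 8;
	return (unsigned)((diag & R12K_DIAG_GHIST_MASK) >> R12K_DIAG_GHIST_SHIFT);
}

/* Return diag with nbits (0..8) of global history selected. */
uint64_t r12k_diag_set_ghist(uint64_t diag, unsigned nbits)
{
	if (nbits >= 8)
		return diag | R12K_DIAG_GHIST_ALL; /* 25:23 left alone, as in the patch */

	/* Fewer than 8 bits: clear bit 26 and store the count in 25:23. */
	diag &= ~(R12K_DIAG_GHIST_ALL | R12K_DIAG_GHIST_MASK);
	return diag | ((uint64_t)nbits << R12K_DIAG_GHIST_SHIFT);
}
```

With nbits = 8 this does the same thing as the patch (set Bit 26, leave 25:23
alone); a smaller count clears Bit 26 and writes the count into 25:23, which is
how I read the manual text quoted above.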


> I'm curious, have you checked the default setting of the global history
> on kernel entry?

Yup, it's disabled by default:

[    0.000000] DEBUG: CPU0: c0_diag #1: 0x000400001030c000
[    0.000000] DEBUG: CPU0: c0_diag #2: 0x0004000014148000
[    7.798066] DEBUG: CPU1: c0_diag #1: 0x00000000103c8000
[    7.798092] DEBUG: CPU1: c0_diag #2: 0x0000000014144000


              I                     B     G       -BRC-   -----------BP----------
              T                     S   B H  H  D | | |   M  S                   
              L                     I   T I  I  B | | |   o  t         I         
              B                     d   A S  S  R | | | M d  a         d       O 
      0       M           0         x   C T  T  C V W H P e  t  **     x     0 p 
xxxxxxxxxxxx xxxx xxxxxxxxxxxxxxxx xxxx x x xxx x x x x x xx xx xx xxxxxxxxx x xx
---------------------------------------------------------------------------------
000000000000 0100 0000000000000000 0001 0 0 000 0 1 1 0 0 00 11 00 000000000 0 00  CPU0 Before
000000000000 0100 0000000000000000 0001 0 1 000 0 0 1 0 1 00 10 00 000000000 0 00  CPU0 After
000000000000 0000 0000000000000000 0001 0 0 000 0 1 1 1 1 00 10 00 000000000 0 00  CPU1 Before
000000000000 0000 0000000000000000 0001 0 1 000 0 0 1 0 1 00 01 00 000000000 0 00  CPU1 After
---------------------------------------------------------------------------------
     12       4          16         4   1 1  3  1 1 1 1 1 2  2  2      9     1 2

** R12000 and up: Upper-two bits of BP-Idx.
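
For anyone who wants to redo the decoding above without counting columns by
hand, a small sketch in plain C follows.  It pulls out only the four fields
discussed in this thread and XORs two register dumps to show which bits
differ.  The struct, macro, and function names are hypothetical, chosen for
the example.

```c
#include <stdint.h>

/* Field positions as labelled in the diagram; names are hypothetical. */
#define DIAG_BTAC_DIS     (UINT64_C(1) << 27)
#define DIAG_GHIST_ALL    (UINT64_C(1) << 26)
#define DIAG_GHIST_SHIFT  23
#define DIAG_GHIST_MASK   (UINT64_C(7) << DIAG_GHIST_SHIFT)
#define DIAG_BRC_DIS      (UINT64_C(1) << 22)

struct diag_bp {
	unsigned btac_dis;   /* bit 27 */
	unsigned ghist_all;  /* bit 26 */
	unsigned ghist_cnt;  /* bits 25:23 */
	unsigned brc_dis;    /* bit 22 */
};

/* Extract the branch-prediction control fields from a c0_diag dump. */
struct diag_bp diag_decode(uint64_t diag)
{
	struct diag_bp f;

	f.btac_dis  = (diag & DIAG_BTAC_DIS) != 0;
	f.ghist_all = (diag & DIAG_GHIST_ALL) != 0;
	f.ghist_cnt = (unsigned)((diag & DIAG_GHIST_MASK) >> DIAG_GHIST_SHIFT);
	f.brc_dis   = (diag & DIAG_BRC_DIS) != 0;
	return f;
}

/* Bits that differ between two dumps of the register. */
uint64_t diag_changed(uint64_t before, uint64_t after)
{
	return before ^ after;
}
```

Feeding in the CPU0 values from the log, diag_changed(0x000400001030c000,
0x0004000014148000) gives 0x04244000.  That has Bit 26 set, and the remaining
differing bits (21, 18, 14) land in the BRC and BP fields of the diagram.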


--J




