David VomLehn wrote:
On Tue, May 12, 2009 at 03:45:16PM -0700, David Daney wrote:
If the TLB refill handler is too bit and needs to be split, there is
no need to branch around the split if the branch target would be an
eret. Since the eret returns from the handler, control flow never
passes it. A branch to an eret is equivalent to the eret itself.
Signed-off-by: David Daney <ddaney@xxxxxxxxxxxxxxxxxx>
---
arch/mips/mm/tlbex.c | 50 +++++++++++++++++++++++++++++++++-----------------
1 files changed, 33 insertions(+), 17 deletions(-)
diff --git a/arch/mips/mm/tlbex.c b/arch/mips/mm/tlbex.c
index 4dc4f3e..ffa7104 100644
--- a/arch/mips/mm/tlbex.c
+++ b/arch/mips/mm/tlbex.c
@@ -723,28 +731,36 @@ static void __cpuinit build_r4000_tlb_refill_handler(void)
uasm_copy_handler(relocs, labels, tlb_handler, p, f);
final_len = p - tlb_handler;
} else {
- u32 *split = tlb_handler + 30;
-
- /*
- * Find the split point.
- */
- if (uasm_insn_has_bdelay(relocs, split - 1))
- split--;
+ u32 *split;
+ if (split_on_eret) {
+ split = tlb_handler + 32;
+ } else {
+ split = tlb_handler + 30;
It would be a shame to pass up an opportunity to eliminate some of the
pile of magic constants we have in the MIPS tree. Given that the MIPS
documentation describes the size of the TLB Refill handler as 0x80 bytes,
we could add something like:
That would be a different patch according to the one patch per problem rule.
/* Maximum # of instructions in exception vector for TLB Refill handler */
#define MIPS64_TLB_REFILL_OPS (0x80 / sizeof(union mips_instruction))
and could change the last few lines of the code above to, for example:
if (split_on_eret) {
split = tlb_handler + MIPS64_TLB_REFILL_OPS;
} else {
split = tlb_handler + MIPS64_TLB_REFILL_OPS - 2;
(you'd need to include asm/inst.h to get union mips_instruction defined)
It is certainly possible to do something like that, but it isn't clear
to me that it makes it any easier to understand.
+
+ /*
+ * Find the split point.
+ */
+ if (uasm_insn_has_bdelay(relocs, split - 1))
+ split--;
+ }
The code itself makes sense. Does this case actually happen much, or was
this just an itch?
For my CPU it was happening 100% of the time when I add the soon to be
submitted hugeTLBfs support patch. Although I have not measured it,
this code is so hot that keeping the normal case fitting on a single
cache line should be a big win.
David Daney