On Wednesday 19 November 2008 02:58, Linus Torvalds wrote:
> On Tue, 18 Nov 2008, Nick Piggin wrote:
> > On Tuesday 18 November 2008 07:58, David Miller wrote:
> > > From: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> > >
> > > > Ok. It could easily be something like a cache footprint issue. And
> > > > while I don't know my sparc CPUs very well, I think the
> > > > Ultrasparc-IIIi is super-scalar but does no out-of-order and
> > > > speculation, no?
> > >
> > > It does only very simple speculation, but your description is
> > > accurate.
> >
> > Surely it would do branch prediction, but maybe not indirect branch?
>
> That would be "branch target prediction" (and a BTB - "Branch Target
> Buffer" to hold it), and no, I don't think Sparc does that. You can
> certainly do it for in-order machines too, but I think it's fairly rare.
>
> It's sufficiently different from the regular "pick up the address from the
> static instruction stream, and also yank the kill-chain on mispredicted
> direction" to be real work to do. Unlike a compare or test instruction,
> it's not at all likely that you can resolve the final address in just a
> single pipeline stage, and without that, it's usually too late to yank the
> kill-chain.
>
> (And perhaps equally importantly, indirect branches are relatively rare on
> old-style Unix benchmarks - ie SpecInt/FP - or in databases. So it's not
> something that Sparc would necessarily have spent the effort on.)
>
> There is obviously one very special indirect jump: "ret". That's the one
> that is common, and that tends to have a special branch target buffer that
> is a pure stack. And for that, there is usually a special branch target
> register that needs to be set up 'x' cycles before the ret in order to
> avoid the stall (then the prediction is checking that register against the
> branch target stack, which is somewhat akin to a regular conditional
> branch comparison).
>
> So I strongly suspect that an indirect (non-ret) branch flushes the
> pipeline on sparc. It is possible that there is a "prepare to jump"
> instruction that prepares the indirect branch stack (kind of a "push
> prediction information"). I suspect Java sees a lot more indirect
> branches than traditional Unix loads, so maybe Sun did do that.

Probably true. OTOH, I've seen indirect branches get compiled to direct
branches, or the common case special-cased into a direct branch:

    if (object->fn == default_object_fn)
        default_object_fn();

That might be an easy way to test suspicions about CPU scheduler
slowdowns... (adding a likely() there, and using likely profiling, would
help ensure you got the default case right).
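
For illustration, a minimal userspace sketch of that special-casing. The
struct layout, the function signatures, other_object_fn(), call_object()
and the two hit counters are all hypothetical names made up for this
example; likely() is defined here via GCC/Clang's __builtin_expect, which
is how it is commonly spelled, and the counters are only a crude stand-in
for real likely profiling:

```c
#include <assert.h>
#include <stddef.h>

/* Branch hint; falls back to a plain expression on other compilers. */
#if defined(__GNUC__)
#define likely(x) __builtin_expect(!!(x), 1)
#else
#define likely(x) (x)
#endif

/* Hypothetical object with a single method pointer. */
struct object {
    int (*fn)(int);
};

/* Hypothetical common-case and rare-case implementations. */
int default_object_fn(int x) { return x + 1; }
int other_object_fn(int x)   { return x * 2; }

/* Crude stand-in for likely profiling: count which path was taken. */
unsigned long direct_hits;
unsigned long indirect_hits;

int call_object(const struct object *obj, int x)
{
    assert(obj != NULL && obj->fn != NULL);

    if (likely(obj->fn == default_object_fn)) {
        /* Common case: direct call, target known at compile time. */
        direct_hits++;
        return default_object_fn(x);
    }

    /* Anything else still goes through the indirect branch. */
    indirect_hits++;
    return obj->fn(x);
}
```

If direct_hits does not dominate indirect_hits on the workload of
interest, the guess about the default case is wrong and the extra compare
is probably just overhead.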