On Tue, Apr 25, 2017 at 10:49 PM, Luc Van Oostenryck <luc.vanoostenryck@xxxxxxxxx> wrote: > Roughly, once you begin to play with code generation, something like > OP_SYMADDR is an operation that you will really do. > Depending on the relocation, it can even be something relatively costly: > I'm thinking static code on a 64bit machine where you can only generate > 16bit constants, others cases may be not at all cheaper. > So it's something that soon or later will need to be exposed and > doing CSE on the address is good. I see your point. Let me rephrase your claim. Sparse assume getting the symbol address is trivial. But on some architecture that is non trivial to generate symbol address. Typical case would be, the machine instruction size is smaller than the address size. It will take a few machine instruction to generate a symbol address. I agree with this part. Now the second part of the claim is that, it would be better to left the OP_SYMADDR untouched and let CSE to remove it. That part I disagree. The reason is that, currently CSE operate on the same basic block. It only eliminate instruction but it does not relocate instructions. A very common case is that, the symbol address was referenced in different basic blocks. extern int a, d; if (...) a = d; else if (...) a = d + 2; CSE would not be able to simply remove the OP_SYMADDR for "a", because they are not in the same basic block. The best result should be, for all the usage of that symbol in a function, find the closest common parent basic block and put the OP_SYMADDR there. Because getting symbol address is used very often. Let CSE go through a hash to re-discover that all symbol address can be combined is costly and not necessary. And CSE does not cover all the case commit 962279e8 covers. > If we would have kept the OP_SYMADRR and doing CSE on it. > But for now we have: > foo: > .L0: > <entry-point> > load.32 %r2 <- 0[a] > add.32 %r3 <- %r2, $1 > store.32 %r3 -> 0[a] > ret > > whose translation would be: > movw r3, #:lower16:a > movt r3, #:upper16:a > ldr r2, [r3] > add r2, r2, #1 > ! movw r3, #:lower16:a > ! movt r3, #:upper16:a That is because you assume every time "a" was referenced you need to redo the generate of 32 bit address "a". That is not necessary true. In sparse IR, "a" already translate into pseudo, which has the user chain. It is relative simple for the back end to find the best place to insert OP_SYMADDR. The end of the day, it is the back end knows the machine register allocation and reorder of the instruction if necessary. So the proper solution shouldn't be let CSE to discover OP_SYMADDR yet another time. It should be find the best place to insert OP_SYMADDR. Chris -- To unsubscribe from this list: send the line "unsubscribe linux-sparse" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html