On Mon, Mar 06, 2017 at 10:17:35AM -0800, Linus Torvalds wrote:
> On Mon, Mar 6, 2017 at 8:43 AM, Luc Van Oostenryck
> <luc.vanoostenryck@xxxxxxxxx> wrote:
> >
> > With an example:
> > == C code ==
> >     void *foo(int *p) { return p + 5; }
> >
> > == linearized code ==
> >     foo:
> >     .L0:
> >         <entry-point>
> >         add.64      %r2 <- %arg1, $20
> >         cast.64     %r3 <- (64) %r2
> >         ret.64      %r3
>
> This is correct.

Interestingly, the following code is linearized a bit differently:

== C code ==
    void *foo(int *p) { return p += 5; }

== linearized code ==
    foo:
    .L0:
        <entry-point>
        cast.64     %r98 <- (64) %arg1
        add.64      %r99 <- %r98, $20
        ptrcast.64  %r100 <- (64) %r99
        cast.64     %r101 <- (64) %r100
        ret.64      %r101

Where the type of %r98 & %r99 is 'unsigned long' (or more probably size_t).
This is then correctly LLVMized as:

    define i8* @foo(i32* %ARG1) {
    L18:
        %R98 = ptrtoint i32* %ARG1 to i64
        %R99 = add i64 %R98, 20
        %R100 = inttoptr i64 %R99 to i32*
        %R101 = bitcast i32* %R100 to i8*
        ret i8* %R101
    }

> > == LLVM code from sparse-llvm ==
> >
> > define i8* @foo(i32* %ARG1) {
> > L0:
> >   %0 = getelementptr i32, i32* %ARG1, inttoptr (i64 20 to i32*)
> >   %R3 = bitcast i32* %0 to i8*
>
> This is garbage, I'm afraid.
>
> When sparse does the "add 20 to pointer", it adds the *byte offset* 20
> to the pointer. The LLVM module should not use "getelementptr" for
> this, because it's not element #20, it's the element at offset 20.

It's clear that there is a semantic gap between sparse's & LLVM's IR.

> I think you're supposed to either use "uglygep" with the base pointer
> cast to a simple address-unit pointer (ie unsigned char).
>
> Or not use GEP at all.
>
>                Linus

I bet we'll have other problems with GEP.

Luc
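
For what it's worth, the byte-offset vs. element-index distinction discussed above can be sketched in plain C. This is only an illustration: the helper names (add_elements, add_bytes, offsets_check) are made up for this example and are not part of sparse or sparse-llvm. The last comparison mimics the mix-up of reusing a byte offset as an element index.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: element-indexed, as C defines "p + n". */
static void *add_elements(int *p, size_t n)
{
	return p + n;
}

/* Hypothetical helper: byte-indexed, as in sparse's
 * "add.64 %r2 <- %arg1, $20". Going through a char pointer
 * makes the offset count bytes instead of ints. */
static void *add_bytes(int *p, size_t nbytes)
{
	return (char *)p + nbytes;
}

/* Returns 1 if the two views agree and the mixed-up one does not. */
static int offsets_check(void)
{
	int buf[64] = { 0 };
	size_t nbytes = 5 * sizeof(int);	/* 20 when sizeof(int) == 4 */

	/* Same address: 5 elements == 5 * sizeof(int) bytes. */
	int same = add_elements(buf, 5) == add_bytes(buf, nbytes);

	/* The mix-up: the byte offset used as an element index.
	 * This lands on &buf[nbytes], not on &buf[5]. */
	int mixed = add_elements(buf, nbytes) == (void *)&buf[nbytes];
	int differs = add_elements(buf, nbytes) != add_elements(buf, 5);

	assert(nbytes < sizeof(buf) / sizeof(buf[0]));
	return same && mixed && differs;
}
```

Calling offsets_check() from a small main() is enough to try it. The add_bytes() form corresponds to the "cast to an address-unit pointer" approach, where the offset is applied to a char pointer rather than to the original int pointer.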