Hi Dale,

Thanks for the reply. The information you provided really helped me understand gcc. I did not reply to your email right away because I wanted to read some of the material you had mentioned first.

However, I am still a little confused after looking into the i386.md file in the gcc-4.1.0/gcc/config/i386 directory (I am using gcc 4.1.0). I assume the define_insn block you showed in your email was only a sample. The i386.md file contains many define_insn blocks that correspond to call instructions. I looked through the file, read the online documents, and googled a lot, but could not find the define_insn pattern that corresponds to indirect call instructions.

Is there some way I could find out which "define_insn" pattern needs to be modified for my project?

So far your information has been very useful.

Thanks,
Abhinav

--- On Fri, 20/11/09, Dale Reese <dreese@xxxxxxxxx> wrote:

> From: Dale Reese <dreese@xxxxxxxxx>
> Subject: RE: Binary rewriting of indirect function calls
> To: abhinavs_iitkgp@xxxxxxxxxxx
> Cc: gcc-help@xxxxxxxxxxx
> Date: Friday, 20 November, 2009, 5:04 AM
>
> > -----Original Message-----
> > From: gcc-help-owner@xxxxxxxxxxx [mailto:gcc-help-owner@xxxxxxxxxxx] On Behalf Of Abhinav Srivastava
> > Sent: Wednesday, November 18, 2009 7:05 PM
> > To: gcc-help@xxxxxxxxxxx
> > Subject: Binary rewriting of indirect function calls
> >
> > >>Hi all,
> >
> > >>In one of my projects, I am trying to do binary rewriting of the Linux kernel on an x86-32 machine. To be more precise, I am targeting call instructions, and the goal is to rewrite in-memory call instructions with the address of a different call site (trampoline).
> >
> > >>The main problem that I am facing is related to indirect function calls. Most of the indirect call instructions in the kernel code are 2 or 3 bytes long, and replacing these call instructions with direct call instructions (5 bytes) seems impossible to me.
> >
> > >>I was thinking of adding some "NOP" instructions after each indirect call in the kernel code so that I could replace an indirect call instruction with a direct one. To achieve that, instead of modifying the source code of the kernel and adding asm("nop"), I would like to do this at the compiler level.
> >
> > >>Related to this, I have two questions:
> >
> > >>1) What are the ways in which an indirect call instruction can be overwritten by a direct call instruction in memory?
> >
> > >>2) Is it possible to modify gcc in such a way that it generates some "NOP" instructions after each indirect function call?
> >
> > >>I am completely new to this, and any help, ideas, and code pointers would be highly appreciated.
> >
> > >>Thanks,
> > >>Abhinav
> >
> > Is there a specific reason you are working at modifying binaries?
> >
> > If you have access to the source you can use gcc to insert the trampolines for you.
> >
> > >>2) Is it possible to modify gcc in such a way that it generates some "NOP" instructions after each indirect function call?
> >
> > You can modify the machine description for your target architecture in gcc to add no-ops to the indirect call instructions.
> >
> > If you want to use gcc to add trampolines...
> >
> > There are a few ways to go about this. A quick way would be to use the -finstrument-functions option of gcc. This actually inserts a trampoline into all functions. It goes about it a little differently than what you have described as your desired approach. Instead of inserting the trampoline at the call instruction, the call to the trampoline is inserted after the entry of a function.
> >
> > Here is a section from .text of an objdump:
> >
> > 000000000040e6e0 <makeargv>:
> >   40e6e0:  55               push   %rbp
> >   40e6e1:  bf e0 e6 40 00   mov    $0x40e6e0,%edi
> >   40e6e6:  bd a0 f9 61 00   mov    $0x61f9a0,%ebp
> >   40e6eb:  53               push   %rbx
> >   40e6ec:  48 83 ec 08      sub    $0x8,%rsp
> >   40e6f0:  48 8b 74 24 18   mov    0x18(%rsp),%rsi
> >   40e6f5:  e8 76 21 00 00   callq  410870 <__cyg_profile_func_enter>
> >
> > Notice what happens here. Once in the function makeargv, the code saves the stack registers and then calls the function "__cyg_profile_func_enter". When "__cyg_profile_func_enter" is done executing, it will return to the instruction just after the call to it.
> >
> > By using this method you would eliminate the need to trap the different types of calls (direct, indirect, etc.). The reason is that instead of modifying the call you are modifying the functions themselves.
> >
> > If this method looks like it will work, then read up on -finstrument-functions.
> >
> > Another approach would be to use the trampoline mechanism provided by gcc. To use this you have to make some changes to the machine description for the back end of gcc and recompile. To read more on this go here: http://gcc.gnu.org/onlinedocs/gccint/Trampolines.html
> >
> > Hope this helps,
> > Dale Reese
>
> >Hi Dale,
>
> >Thanks for your detailed reply. I will look into gcc trampolines. However, at this point in time, I think a NOP-based solution would work best for me.
>
> >As you mentioned, this solution needs gcc modifications to generate no-ops after the indirect call instructions. Could you please let me know where these modifications should be done and how? Since I will be modifying gcc's code for the first time, any code pointer would be very helpful.
>
> >Does this modification need to be done in a certain way?
> >Any code examples for such modifications would be useful too.
>
> >I look forward to hearing from you soon.
>
> >Thanks,
> >Abhinav
>
> Abhinav,
>
> You will have to modify the machine description for the target architecture that you are working with. In your case I think it was x86_32.
>
> I will describe a relatively easy way to get what you want.
>
> The file you will be working with is "i386.md". If you don't have any experience building gcc you might want to practice. You can find the file in "/gcc-4.x.x/gcc/config/i386/".
>
> This file holds the rules the back end of gcc uses to convert RTL to machine code. What you will want to look for are the rules for building call instructions. It would be beneficial to read the GNU Compiler Collection Internals manual. In particular, the chapter "Machine Descriptions" will be interesting. This chapter explains "define_insn".
>
> In the i386 machine description there are several types of define_insn blocks defined for calls. You will have to find which one is used for indirect calls. I will show you an example of how to modify a define_insn block to affect the assembly output.
>
> Below is a section from the i386.md file.
>
> (define_insn "*call_value_0"
>   [(set (match_operand 0 "" "")
>         (call (mem:QI (match_operand:SI 1 "constant_call_address_operand" ""))
>               (match_operand:SI 2 "" "")))]
>   "!TARGET_64BIT"
> {
>   if (SIBLING_CALL_P (insn))
>     return "jmp\t%P1";
>   else
>     return "call\t%P1";
> }
>   [(set_attr "type" "callv")])
>
> (define_insn "*call_value_0_rex64"
>   [(set (match_operand 0 "" "")
>         (call (mem:QI (match_operand:DI 1 "constant_call_address_operand" ""))
>               (match_operand:DI 2 "const_int_operand" "")))]
>   "TARGET_64BIT"
> {
>   if (SIBLING_CALL_P (insn))
>     return "jmp\t%P1";
>   else
>     return "call\t%P1";
> }
>   [(set_attr "type" "callv")])
>
> Notice the return keyword. This is where you will want to modify the code.
>
> Old code --> return "call\t%P1";
> New code --> return "call\t%P1\n\tnop";
>
> With this change, every time this rule is used to generate an assembly call instruction, it will generate a call with a nop on the next line.
>
> After you have made these changes and compiled gcc, you will be able to use it as normal. It should not affect program operation aside from an added nop instruction after each call instruction. Note that the nop instruction will not be executed until after the call returns, but it should give you room in the binary to add your own instruction.
>
> Hope this helps.
> Dale Reese
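
For the first question in the thread (overwriting an indirect call plus its padding with a direct call), the byte arithmetic can be sketched in C. This is only an illustration under stated assumptions, not the way the kernel or gcc does it. The helper name patch_direct_call and its parameters are hypothetical. It relies on the x86 encoding of a direct near call as opcode 0xE8 followed by a 32-bit little-endian displacement measured from the end of the 5-byte instruction, and on 0x90 being a one-byte nop. Making the code page writable and synchronizing with other CPUs are not shown.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { CALL_REL32_LEN = 5 };  /* opcode 0xE8 + 32-bit displacement */

/*
 * Hypothetical helper (not part of gcc or the kernel).
 *
 * slot        : bytes currently holding the indirect call and its nop padding
 * slot_len    : size of that region (indirect call + nops)
 * site_addr   : run-time address of slot[0] (32-bit, as on x86-32)
 * target_addr : run-time address of the trampoline to call
 *
 * Returns 0 on success, -1 if the region is too small for a direct call.
 */
static int patch_direct_call(uint8_t *slot, size_t slot_len,
                             uint32_t site_addr, uint32_t target_addr)
{
    assert(slot != NULL);

    if (slot_len < CALL_REL32_LEN)
        return -1;  /* e.g. a bare 2- or 3-byte indirect call, no padding */

    /* Displacement is relative to the address of the next instruction;
       unsigned arithmetic wraps modulo 2^32, which covers backward calls. */
    uint32_t rel = target_addr - (site_addr + CALL_REL32_LEN);

    slot[0] = 0xE8;                           /* call rel32 */
    slot[1] = (uint8_t)(rel & 0xFF);          /* little-endian displacement */
    slot[2] = (uint8_t)((rel >> 8) & 0xFF);
    slot[3] = (uint8_t)((rel >> 16) & 0xFF);
    slot[4] = (uint8_t)((rel >> 24) & 0xFF);

    /* Any padding left over stays as (or becomes) one-byte nops. */
    memset(slot + CALL_REL32_LEN, 0x90, slot_len - CALL_REL32_LEN);
    return 0;
}
```

Under these assumptions, a 2-byte indirect call such as "ff d0" (call *%eax) needs at least three nop bytes after it before the 5-byte direct call fits, so a single added nop would not leave enough room. As a sanity check, patching a slot at 0x40e6f5 to call 0x410870 yields the bytes e8 76 21 00 00, the same encoding as the callq in the objdump quoted above.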