RE: Binary rewriting of indirect function calls


 



Hi Dale,

Thanks for the reply. The information that you provided really helped me understand gcc. I did not reply to your email right away because I wanted to read some of the material that you had mentioned first.

However, I am still a little confused after looking into the i386.md file inside the gcc-4.1.0/gcc/config/i386 directory (I am using gcc version 4.1.0). I assume the define_insn block that you mentioned in your email is a sample. In the i386.md file, there are many define_insn blocks corresponding to call instructions. I looked through the file, read the online documents, and googled a lot, but could not find the define_insn pattern corresponding to indirect call instructions.

Is there some way I could know which "define_insn" pattern needs to be modified for my project? So far your information has been very useful.
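
One way to narrow this down, offered as a sketch rather than something tested against gcc 4.1.0: compile a small file containing a single indirect call with `-S -dp`. The `-dp` option annotates the assembler output with the name of the insn pattern that produced each instruction, so the comment next to the `call *...` line should name the define_insn to edit. The file name and function names below are made up for illustration.

```c
/* indirect.c: hypothetical test file containing exactly one indirect call.
 * Suggested invocation (assumption: a gcc that can target 32-bit x86):
 *     gcc -m32 -O0 -S -dp indirect.c
 * Then look in indirect.s for the comment next to the "call *..." line. */

/* A plain function whose address is taken below. */
int add_one(int x)
{
    return x + 1;
}

/* Calls through a function pointer. At -O0 the compiler does not turn
 * this into a direct call, so it is emitted as an indirect call. */
int call_through_pointer(int (*fn)(int), int x)
{
    return fn(x);
}
```

The pattern name printed for the indirect call is the one whose output template would get the extra nop.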

Thanks,
Abhinav


 

--- On Fri, 20/11/09, Dale Reese <dreese@xxxxxxxxx> wrote:

> From: Dale Reese <dreese@xxxxxxxxx>
> Subject: RE: Binary rewriting of indirect function calls
> To: abhinavs_iitkgp@xxxxxxxxxxx
> Cc: gcc-help@xxxxxxxxxxx
> Date: Friday, 20 November, 2009, 5:04 AM
> 
> > -----Original Message-----
> > From: gcc-help-owner@xxxxxxxxxxx
> > [mailto:gcc-help-owner@xxxxxxxxxxx]
> > On Behalf Of Abhinav Srivastava
> > Sent: Wednesday, November 18, 2009 7:05 PM
> > To: gcc-help@xxxxxxxxxxx
> > Subject: Binary rewriting of indirect function calls
> > 
> > >>Hi all,
> > 
> > >>In one of my projects, I am trying to do binary rewriting of Linux
> > kernel on an x86-32 machine. To be more precise, I am actually
> > targeting call instructions, and the goal is to re-write in-memory
> > call instructions with the address of a different call site
> > (trampoline).
> > 
> > >>The main problem that I am facing is related to indirect function
> > calls. Most of the indirect call instructions in the kernel code are
> > of 2 or 3 bytes, and modifying these call instructions with direct
> > call instructions (5 bytes) seems impossible to me.
> > 
> > >>I was thinking of adding some "NOP" instructions after each
> > indirect call in the kernel code so that I could replace an indirect
> > call instruction with a direct one. To achieve that, instead of
> > modifying the source code of the kernel and adding asm("nop"), I would
> > like to do this at the compiler level.
> > 
> > >>Related to this, I have two questions: 
> > 
> > >>1) what are the ways in which an indirect call instruction can be
> > overwritten by a direct call instruction inside the memory?
> > 
> > >>2) Is it possible to modify gcc in such a way that it generates
> > some "NOP" instructions after each indirect function call?
> > 
> > >>I am completely new to this thing, and any help, ideas, and code
> > pointers would be highly appreciated.
> > 
> > >>Thanks,
> > >>Abhinav
> > 
> > 
> > 
> > Is there a specific reason you are working at modifying binaries?
> > 
> > If you have access to the source you can use gcc to insert the
> > trampolines for you.
> > 
> > >>2) Is it possible to modify gcc in such a way that it generates
> > some "NOP" instructions after each indirect function call?
> >   You can modify the machine description for your target architecture
> > in gcc to add nops to the indirect call instructions.
> > 
> > If you want to use gcc to add trampolines...
> > 
> > There are a few ways to go about this. A quick way would be to use
> > the -finstrument-functions option of gcc. This inserts a trampoline
> > into every function. It goes about it a little differently than what
> > you have described as your desired approach. Instead of inserting the
> > trampoline at the call instruction, the call to the trampoline is
> > inserted after the entry of each function.
> 
> > 
> > Here is a section from .text of an objdump
> > 
> > 000000000040e6e0 <makeargv>:
> >   40e6e0:    55                 push   %rbp
> >   40e6e1:    bf e0 e6 40 00     mov    $0x40e6e0,%edi
> >   40e6e6:    bd a0 f9 61 00     mov    $0x61f9a0,%ebp
> >   40e6eb:    53                 push   %rbx
> >   40e6ec:    48 83 ec 08        sub    $0x8,%rsp
> >   40e6f0:    48 8b 74 24 18     mov    0x18(%rsp),%rsi
> >   40e6f5:    e8 76 21 00 00     callq  410870 <__cyg_profile_func_enter>
> > 
> > 
> > Notice what happens here. Once in the function makeargv, the code
> > saves registers on the stack, loads the arguments for the hook, and
> > then calls the function "__cyg_profile_func_enter". When
> > "__cyg_profile_func_enter" is done executing, it returns to the
> > instruction just after the call to it.
> > 
> > By using this method you would eliminate the need to handle the
> > different types of calls (direct, indirect, etc.) separately. The
> > reason is that instead of modifying the call you are modifying the
> > functions themselves.
> > 
> > If this method looks like it will work, then read up on
> > -finstrument-functions.
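
As an illustration of the option described above, here is a minimal user-space sketch of the two hooks that code built with -finstrument-functions calls on function entry and exit. The hook names and signatures are the ones gcc expects; the counter, the recorded pointers, and hook_enter_count() are hypothetical additions for the example, and a kernel build would need its own versions of these hooks.

```c
#include <stddef.h>

/* Hypothetical bookkeeping: how many entries were seen, and the last one. */
static unsigned long enter_count;
static void *last_fn;
static void *last_call_site;

/* The hooks must not be instrumented themselves, or they would recurse. */
void __cyg_profile_func_enter(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));
unsigned long hook_enter_count(void) __attribute__((no_instrument_function));

/* Called at the start of every instrumented function.
 * this_fn is the function being entered, call_site is where it was called. */
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    enter_count++;
    last_fn = this_fn;
    last_call_site = call_site;
}

/* Called just before every instrumented function returns. */
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
}

/* Hypothetical accessor so the rest of the program can read the counter. */
unsigned long hook_enter_count(void)
{
    return enter_count;
}
```

Linking this file into a program compiled with `gcc -finstrument-functions` should make every instrumented function pass through __cyg_profile_func_enter, which is where redirection logic could live instead of patching call sites.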
> > 
> > 
> > 
> > 
> > Another approach would be to use the trampoline mechanism provided by
> > gcc. To use this you have to make some changes to the machine
> > description for the back end of gcc and recompile. To read more on
> > this, go here: http://gcc.gnu.org/onlinedocs/gccint/Trampolines.html
> > 
> > Hope this helps,
> > Dale Reese 
> > 
> > 
> 
> 
> >Hi Dale,
> 
> >Thanks for your detailed reply. I will look into gcc trampolines.
> >However, at this point in time, I think a NOP-based solution would
> >work best for me.
> 
> >As you mentioned, this solution needs gcc modifications to generate
> >nops after the indirect call instructions. Could you please let me
> >know where these modifications should be made, and how? Since I will
> >be modifying gcc's code for the first time, any code pointer would be
> >very helpful.
> 
> >Does this modification need to be done in a certain way?
> 
> >Any code examples for such modifications would be useful too.
> 
> >I look forward to hearing from you soon.
> 
> >Thanks,
> >Abhinav
> 
> 
> Abhinav,
> You will have to modify the machine description for the target
> architecture that you are working with. In your case I think it was
> x86_32.
> 
> I will describe a relatively easy way to get what you want.
> 
> 
> The file you will be working with is "i386.md". If you don't have any
> experience building gcc you might want to practice first. You can find
> the file in "/gcc-4.x.x/gcc/config/i386/".
> 
> This file holds the rules the gcc back end uses to convert RTL to
> machine code. What you will want to look for are the rules for building
> call instructions. It would be beneficial to read the GNU Compiler
> Collection Internals manual. In particular, the chapter on machine
> descriptions will be interesting. That chapter explains "define_insn".
> 
> In the i386 machine description there are several types of define_insn
> blocks for calls defined. You will have to find which one is used for
> indirect calls. I will show you an example of how to modify a
> define_insn block to affect the assembly output.
> 
> Below is a section from the i386.md file. 
> 
> (define_insn "*call_value_0"
>   [(set (match_operand 0 "" "")
>     (call (mem:QI (match_operand:SI 1 "constant_call_address_operand" ""))
>           (match_operand:SI 2 "" "")))]
>   "!TARGET_64BIT"
> {
>   if (SIBLING_CALL_P (insn))
>     return "jmp\t%P1";
>   else
>     return "call\t%P1";
> }
>   [(set_attr "type" "callv")])
> 
> (define_insn "*call_value_0_rex64"
>   [(set (match_operand 0 "" "")
>     (call (mem:QI (match_operand:DI 1 "constant_call_address_operand" ""))
>           (match_operand:DI 2 "const_int_operand" "")))]
>   "TARGET_64BIT"
> {
>   if (SIBLING_CALL_P (insn))
>     return "jmp\t%P1";
>   else
>     return "call\t%P1";
> }
>   [(set_attr "type" "callv")])
> 
> Notice the return keyword. This is where you will want to modify the
> code.
> 
> Old code --> return "call\t%P1";
> New code --> return "call\t%P1\n\tnop";
> 
> With this change, every time this rule is used to generate an assembly
> call instruction, it will emit the call with a nop on the next line.
> 
> 
> After you have made these changes and compiled gcc, you will be able
> to use it as normal. It should not affect program operation aside from
> an added nop instruction after each call instruction. Note that the nop
> instruction will not be executed until after the call returns, but it
> should give you room in the binary to add your own instruction. Each
> nop adds one byte, so repeat it in the template if you need more room.
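
To make the byte counting concrete, here is a small sketch, assuming 32-bit x86 encodings: a direct near call is opcode 0xE8 followed by a 32-bit displacement relative to the next instruction (5 bytes in total), and a nop is the single byte 0x90. The function names are hypothetical, and the sketch only builds the bytes in a buffer. Safely patching live kernel text is a separate problem that it does not address.

```c
#include <stddef.h>
#include <stdint.h>

#define DIRECT_CALL_LEN 5  /* opcode 0xE8 plus a 32-bit relative displacement */

/* Number of one-byte nops needed after an indirect call of indirect_len
 * bytes so that a 5-byte direct call fits in the same space. */
size_t nops_needed(size_t indirect_len)
{
    return indirect_len >= DIRECT_CALL_LEN ? 0 : DIRECT_CALL_LEN - indirect_len;
}

/* Build "call target" in buf, assuming the bytes will execute at insn_addr.
 * The displacement is measured from the end of the call instruction. */
void encode_direct_call(uint8_t buf[DIRECT_CALL_LEN],
                        uint32_t insn_addr, uint32_t target)
{
    uint32_t rel = target - (insn_addr + DIRECT_CALL_LEN);

    buf[0] = 0xE8;
    buf[1] = (uint8_t)(rel & 0xFF);          /* little-endian displacement */
    buf[2] = (uint8_t)((rel >> 8) & 0xFF);
    buf[3] = (uint8_t)((rel >> 16) & 0xFF);
    buf[4] = (uint8_t)((rel >> 24) & 0xFF);
}
```

Under these assumptions a 2-byte indirect call needs three nops after it and a 3-byte one needs two, so a single nop in the output template would not be enough for the replacement described in the original question.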
> 
> 
> Hope this helps.
> Dale Reese
> 
> 
>




