Re: Guidance on Decompiling GCC 11.4.0 Optimized Code

On 11/03/2025 15:31, Hans Åberg wrote:

On 11 Mar 2025, at 12:04, Jonathan Wakely via Gcc-help <gcc-help@xxxxxxxxxxx> wrote:

On Tue, 11 Mar 2025 at 09:01, David Brown via Gcc-help
<gcc-help@xxxxxxxxxxx> wrote:

On 11/03/2025 07:26, manoj alladawar via Gcc-help wrote:
Dear GCC Community,

I hope this message finds you well.

I am currently working on a Debian-based system running Ubuntu 22.04 and
using GCC version 11.4.0. I have compiled a C source file using GCC with
the -O2 optimization flag, resulting in an optimized binary.

I am now interested in generating high-level C code from this optimized
binary for analysis and understanding. Could you kindly advise me on which
decompiler would be most suitable for this purpose? Specifically, I would
appreciate recommendations for decompilers that perform effectively on
GCC-based optimized code.

Your guidance and insights would be greatly appreciated.

Thank you for your time and support. Kind regards,
Manoj Alladawar


I am not sure what you are trying to get at here.

Suppose we have the function :

        int foo(int x) {
            int y = 0;
            for (int i = 0; i < 5; i++) {
                y += x;
            }
            return y;
        }

When optimising, gcc will just multiply "x" by 5.

Are you hoping to be able to run gcc on this, then decompile that output
and get a result :

        int foo(int x) {
                return x * 5;
        }

or even :

        int foo(int x) {
                return (x << 2) + x;
        }

?

If so, I think no such tool is possible in the general case.

There /are/ decompilers that turn assembly into compilable C.  But the
structure and details are often lost - an original "for" loop might end
up as a "goto" loop, for example.  Once normal C code has been through
the optimiser, inter-procedural optimisations, constant propagation,
inlining, and other re-arrangements mean that the regurgitated C code
is incomprehensible.  It should be possible to compile it again and get
the same semantics - that's the point of a decompiler.  And it might be
possible for security analysis tools to gather information about
vulnerabilities.  But it's not going to help you understand how the
optimiser is working.

I think the best tool for looking at optimisation of small sections of
code is the <https://godbolt.org> online compiler explorer.  You can put
in C code (or many other languages) and look at the generated assembly -
for x86, or for any other target processor you prefer.

<https://godbolt.org/z/dG8dd57ej>

I recently learnt of dogbolt.org for decompiling.

Trying the example above on this site, BinaryNinja gives, as suggested:

uint64_t foo(int32_t arg1) __pure
{
     return arg1 * 5;
}


That is nice to see. I will play with dogbolt.org when I get the chance - it's an interesting idea. I am particularly interested in how it deals with code for embedded devices (like Cortex-M Thumb2 code), since that is my most used target.

I think that once you have more advanced optimisations like inter-procedural optimisations, it's going to be very difficult for a decompiler to deduce a good structure. It is not at all uncommon for a single C file to have only two or three externally linked functions and dozens of static functions - the optimiser will then inline most of these. Mix in partial inlining, cold code outlining, cloning, and - for maximum fun - LTO, and I think it will be very difficult to get useful results from decompilation.

However, it could still be a useful tool for small sections of code that are structured appropriately.


David


