From: Jann Horn
> Sent: 06 February 2020 13:16
...
> > I cannot find evidence for
> > what function start alignment should be.
>
> There is no architecturally required alignment for functions, but
> Intel's Optimization Manual
> (<https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf>)
> recommends in section 3.4.1.5, "Code Alignment":
>
> | Assembly/Compiler Coding Rule 12. (M impact, H generality)
> | All branch targets should be 16-byte aligned.
>
> AFAIK this is recommended because, as documented in section 2.3.2.1,
> "Legacy Decode Pipeline" (describing the frontend of Sandy Bridge, and
> used as the base for newer microarchitectures):
>
> | An instruction fetch is a 16-byte aligned lookup through the ITLB
> | and into the instruction cache.
> | The instruction cache can deliver every cycle 16 bytes to the
> | instruction pre-decoder.
>
> AFAIK this means that if a branch ends close to the end of a 16-byte
> block, the frontend is less efficient because it may have to run two
> instruction fetches before the first instruction can even be decoded.

See also "The microarchitecture of Intel, AMD and VIA CPUs" from
www.agner.org/optimize

My suspicion is that reducing the code size (so more code fits in the
cache) will almost always be a win over aligning branch targets and
entry points.
If the alignment of a function matters then there are probably other
changes to that bit of code that will give a larger benefit.

	David
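As a rough illustration of the trade-off discussed above, here is a minimal sketch in Python. It assumes a deliberately simplified model in which the only cost is how many 16-byte aligned fetch blocks the first instruction at a branch target touches; real frontends (uop caches, prefetch, branch prediction) are more complicated. The function names and the example addresses are hypothetical, not taken from the manual or the kernel.

```python
FETCH_BLOCK = 16  # bytes per aligned instruction fetch, per the quoted manual text


def fetches_for_first_instruction(target_addr: int, insn_len: int) -> int:
    """Count the 16-byte aligned blocks spanned by the first instruction
    at a branch target (simplified model, hypothetical helper)."""
    if insn_len <= 0:
        raise ValueError("instruction length must be positive")
    first_block = target_addr // FETCH_BLOCK
    last_block = (target_addr + insn_len - 1) // FETCH_BLOCK
    return last_block - first_block + 1


def padding_to_align(addr: int, alignment: int = FETCH_BLOCK) -> int:
    """Bytes of padding needed to move addr up to the next aligned address."""
    return -addr % alignment


if __name__ == "__main__":
    # A 5-byte instruction at an aligned target fits in one fetch block.
    print(fetches_for_first_instruction(0x1000, 5))
    # The same instruction 3 bytes before a block boundary straddles two.
    print(fetches_for_first_instruction(0x100D, 5))
    # Aligning that target costs 3 bytes of padding, i.e. larger code.
    print(padding_to_align(0x100D))
```

The padding returned by `padding_to_align` is the code-size cost that has to be weighed against the saved fetch: every aligned entry point can add up to 15 bytes that occupy instruction cache without doing any work.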