On Mon, Feb 6, 2023, at 14:13, Jianmin Lv wrote:
> On 2023/2/6 7:18 PM, Xi Ruoyao wrote:
>> On Mon, 2023-02-06 at 18:24 +0800, Jianmin Lv wrote:
>>> Hi, Xuerui
>>>
>>> I think the kernels produced with and without -mstrict-align differ
>>> mainly in the following ways:
>>> - Different size. I built two kernels (vmlinux): the kernel built with
>>> -mstrict-align is 26533376 bytes and the kernel built without
>>> -mstrict-align is 26123280 bytes.
>>> - Different performance. For example, in the kernel function jhash(), the
>>> assembly code snippets with and without -mstrict-align are as follows:
>>
>> But there are still questions remaining:
>>
>> (1) Is the difference caused by bad code generation in GCC? If
>> true, it's better to improve GCC before someone starts to build a distro
>> for LA264, as it would benefit user space as well.
>>
> AFAIK, GCC produces unaligned-access-enabled target binaries by
> default (without -mstrict-align) to improve user space performance
> (smaller size and higher runtime performance), which is also based on the
> fact that the vast majority of LoongArch CPUs support unaligned access.
>
>> (2) Is there some "big bad unaligned access loop" in a hot spot in the
>> kernel code? If true, it may be better to just refactor the C code,
>> because doing so will benefit all ports, not only LoongArch. Otherwise,
>> it may not be worth optimizing some cold paths.
>>
> Frankly, I'm not sure whether there is this kind of hot code in the kernel; I
> only see the difference in kernel size and in the
> assembly code snippets. And I'm afraid it may be difficult to judge
> whether such code, if it exists, is reasonably hot or not.

Just look for CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS: this will show
you the code locations that use different implementations depending on
whether the kernel has to run on CPUs without unaligned access support.

Arnd
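
To illustrate the kind of code that CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS typically guards, here is a minimal user-space sketch, not taken from the kernel. The macro SKETCH_EFFICIENT_UNALIGNED_ACCESS and the function sketch_bytes_equal() are hypothetical names made up for this example; the macro only stands in for the real Kconfig symbol. The idea is that one build compares a word at a time at arbitrary addresses, while the other falls back to byte accesses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS.
 * Build with -DSKETCH_EFFICIENT_UNALIGNED_ACCESS to select the
 * word-wise path; leave it undefined for the byte-wise path only.
 */

/* Return 1 if the first len bytes of a and b match, 0 otherwise. */
static int sketch_bytes_equal(const unsigned char *a,
                              const unsigned char *b, size_t len)
{
#ifdef SKETCH_EFFICIENT_UNALIGNED_ACCESS
	/*
	 * Compare 64 bits at a time regardless of pointer alignment.
	 * memcpy() keeps this well-defined in portable C; compilers
	 * usually lower it to a plain load when unaligned access is
	 * allowed on the target.
	 */
	while (len >= sizeof(uint64_t)) {
		uint64_t wa, wb;

		memcpy(&wa, a, sizeof(wa));
		memcpy(&wb, b, sizeof(wb));
		if (wa != wb)
			return 0;
		a += sizeof(wa);
		b += sizeof(wb);
		len -= sizeof(wa);
	}
#endif
	/*
	 * Byte-wise path: handles the whole buffer on strict-alignment
	 * builds, or just the remaining tail otherwise.
	 */
	while (len--) {
		if (*a++ != *b++)
			return 0;
	}
	return 1;
}
```

Both variants return the same result; only the generated code differs, which is why grepping for the config symbol is a reasonable way to find candidate hot paths affected by -mstrict-align.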