On 09/29/2015 09:40 PM, Matt Bennett wrote:
During development it was found that a number of builds would panic during the kernel init process, more specifically in 'delayed_fput()'. The panic showed the kernel trying to access a memory address of '0xb7fdc00' while traversing the 'delayed_fput_list' structure. Comparing this memory address to the value of the pointer used on builds that did not panic confirmed that the pointer on crashing builds must have been corrupted at some stage earlier in the init process. By traversing the list earlier and earlier in the code it was found that 'plat_mem_setup()' was responsible for corrupting the list. Specifically the line: memory = cvmx_bootmem_phy_alloc(mem_alloc_size, __pa_symbol(&__init_end), -1, 0x100000, CVMX_BOOTMEM_FLAG_NO_LOCKING); Which would eventually call: cvmx_bootmem_phy_set_size(new_ent_addr, cvmx_bootmem_phy_get_size (ent_addr) - (desired_min_addr - ent_addr)); Where 'new_ent_addr'=0x4800000 (the address of 'delayed_fput_list') and the second argument (size)=0xb7fdc00 (the address causing the kernel panic). The job of this part of 'plat_mem_setup()' is to allocate chunks of memory for the kernel to use. At the start of each chunk of memory the size of the chunk is written, hence the value 0xb7fdc00 is written onto memory at 0x4800000, therefore the kernel panics when it goes back to access 'delayed_fput_list' later on in the initialisation process. On builds that were not crashing it was found that the compiler had placed 'delayed_fput_list' at 0x4800008, meaning it wasn't corrupted (but something else in memory was overwritten). As can be seen in the first function call above the code begins to allocate chunks of memory beginning from the symbol '__init_end'. The MIPS linker script (vmlinux.lds.S) however defines the .bss section to begin after '__init_end'. Therefore memory within the .bss section is allocated to the kernel to use (System.map shows 'delayed_fput_list' and other kernel structures to be in .bss). To stop the kernel panic (and the .bss section being corrupted) memory should begin being allocated from the symbol '_end'. Signed-off-by: Matt Bennett <matt.bennett@xxxxxxxxxxxxxxxxxxx>
This is obviously correct. I wonder how it was able to work for so long without failing for someone.
Acked-by: David Daney <david.daney@xxxxxxxxxx> It is probably suitable for stable as well. David Daney
--- arch/mips/cavium-octeon/setup.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/mips/cavium-octeon/setup.c b/arch/mips/cavium-octeon/setup.c index 89a6284..bd63425 100644 --- a/arch/mips/cavium-octeon/setup.c +++ b/arch/mips/cavium-octeon/setup.c @@ -933,7 +933,7 @@ void __init plat_mem_setup(void) while ((boot_mem_map.nr_map < BOOT_MEM_MAP_MAX) && (total < MAX_MEMORY)) { memory = cvmx_bootmem_phy_alloc(mem_alloc_size, - __pa_symbol(&__init_end), -1, + __pa_symbol(&_end), -1, 0x100000, CVMX_BOOTMEM_FLAG_NO_LOCKING); if (memory >= 0) {