> Subject: Re: [PATCH 1/2] remoteproc: drop memset when loading elf > segments > > On 4/13/20 4:05 AM, Peng Fan wrote: > > > >> Subject: Re: [PATCH 1/2] remoteproc: drop memset when loading elf > >> segments > >> > >> On Thu 09 Apr 18:29 PDT 2020, Peng Fan wrote: > >> > >>> Hi Bjorn, > >>> > >>>> Subject: Re: [PATCH 1/2] remoteproc: drop memset when loading elf > >>>> segments > >>>> > >>>> On Thu 09 Apr 01:22 PDT 2020, Peng Fan wrote: > >>>> > >>>>> To arm64, "dc zva, dst" is used in memset. > >>>>> Per ARM DDI 0487A.j, chapter C5.3.8 DC ZVA, Data Cache Zero by VA, > >>>>> > >>>>> "If the memory region being zeroed is any type of Device memory, > >>>>> this instruction can give an alignment fault which is prioritized > >>>>> in the same way as other alignment faults that are determined by > >>>>> the memory type." > >>>>> > >>>>> On i.MX platforms, when elf is loaded to onchip TCM area, the > >>>>> region is ioremapped, so "dc zva, dst" will trigger abort. > >>>>> > >>>>> Since memset is not strictly required, let's drop it. > >>>>> > >>>> > >>>> This would imply that we trust that the firmware doesn't expect > >>>> remoteproc to zero out the memory, which we've always done. So I > >>>> don't think we can say that it's not required. > >>> > >>> Saying an image runs on a remote core needs Linux to help zero out > >>> BSS section, this not make sense to me. > >> > >> Maybe not, but it has always done it, so if there's firmware out > >> there that depends on it such change would break them.. > > > > Got it. > > > >> > >>> My case is as following, I need to load section 7 data. > >>> I no need to let remoteproc to memset section 8/9/10/11/12, the > >>> firmware itself could handle that. Just because the memsz is larger > >>> than filesz, remoreproc must memset? > >> > >> By having a PT_LOAD segment covering these I think it's reasonable to > >> assume that the ELF loader should be able to touch the associated > memory. > >> > >>> Section Headers: > >>> [Nr] Name Type Addr Off > Size > >> ES Flg Lk Inf Al > >>> [ 0] NULL 00000000 000000 > >> 000000 00 0 0 0 > >>> [ 1] .interrupts PROGBITS 1ffe0000 010000 000240 > 00 > >> A 0 0 4 > >>> [ 2] .resource_table PROGBITS 1ffe0240 010240 000058 > 00 > >> A 0 0 1 > >>> [ 3] .text PROGBITS 1ffe02a0 0102a0 > 009ccc 00 > >> AX 0 0 16 > >>> [ 4] .ARM ARM_EXIDX 1ffe9f6c 019f6c > 000008 > >> 00 AL 3 0 4 > >>> [ 5] .init_array INIT_ARRAY 1ffe9f74 019f74 000004 > 04 > >> WA 0 0 4 > >>> [ 6] .fini_array FINI_ARRAY 1ffe9f78 019f78 000004 > 04 > >> WA 0 0 4 > >>> [ 7] .data PROGBITS 1fff9240 029240 > 000084 > >> 00 WA 0 0 4 > >>> [ 8] .ncache.init PROGBITS 1fff92c4 0292c4 000000 > 00 > >> W 0 0 1 > >>> [ 9] .ncache NOBITS 1fff92c4 0292c4 > 000a80 > >> 00 WA 0 0 4 > >>> [10] .bss NOBITS 1fff9d44 0292c4 > 01f5c0 > >> 00 WA 0 0 4 > >>> [11] .heap NOBITS 20019304 0292c4 > 000404 > >> 00 WA 0 0 1 > >>> [12] .stack NOBITS 20019708 0292c4 > 000400 > >> 00 WA 0 0 1 > >>> > >>>> > >>>>> Signed-off-by: Peng Fan <peng.fan@xxxxxxx> > >>>>> --- > >>>>> drivers/remoteproc/remoteproc_elf_loader.c | 7 ++----- > >>>>> 1 file changed, 2 insertions(+), 5 deletions(-) > >>>>> > >>>>> diff --git a/drivers/remoteproc/remoteproc_elf_loader.c > >>>>> b/drivers/remoteproc/remoteproc_elf_loader.c > >>>>> index 16e2c496fd45..cc50fe70d50c 100644 > >>>>> --- a/drivers/remoteproc/remoteproc_elf_loader.c > >>>>> +++ b/drivers/remoteproc/remoteproc_elf_loader.c > >>>>> @@ -238,14 +238,11 @@ int rproc_elf_load_segments(struct rproc > >>>>> *rproc, > >>>> const struct firmware *fw) > >>>>> memcpy(ptr, elf_data + offset, filesz); > >>>>> > >>>>> /* > >>>>> - * Zero out remaining memory for this segment. > >>>>> + * No need zero out remaining memory for this segment. > >>>>> * > >>>>> * This isn't strictly required since dma_alloc_coherent > already > >>>>> - * did this for us. albeit harmless, we may consider removing > >>>>> - * this. > >>>>> + * did this for us. > >>>> > >>>> In the case of recovery this comment is wrong, we do not > >>>> dma_alloc_coherent() the carveout during a recovery. > >>> > >>> Isn't the it the firmware's job to memset the region? > >>> > >> > >> I'm not aware of this being a documented requirement, we've always > >> done it here for them, so removing this call would be a change in behavior. > >> > >>>> > >>>> And in your case you ioremapped existing TCM, so it's never true. > >>>> > >>>>> */ > >>>>> - if (memsz > filesz) > >>>>> - memset(ptr + filesz, 0, memsz - filesz); > >>>> > >>>> So I think you do want to zero out this region. Question is how we do it... > >>> > >>> I have contacted our M4 owners, we no need clear it from Linux side. > >> > >> And I think _most_ firmware out there, like yours, does clear BSS etc > >> during initialization. > >> > >>> We also support booting m4 before booting Linux, at that case, Linux > >>> has noting to do with memset. It is just I try loading m4 image with > >>> Linux, and met the issue that memset trigger abort. > >>> > >> > >> Please see the proposal for attaching to already running remoteproc's > >> from Mathieu. I don't expect that you want to load your PROGBITS > >> either in this case? > > > > No need to load for early boot case. It is just userspace load trigger > > kernel panic, because memset arm64 could not work for ioremapped > memory. > > > > How about adding ops hooks for memset and memcpy to let driver have > > their own implementation? > > Hi Peng, > > The trick is to use the ioremap_wc() variant instead of ioremap() in your > platform driver while mapping the TCMs. I know multiple folks have run into > this issue. This is what most of the remoteproc drivers use, and mmio-sram > driver also uses the same. TCM is different from OCRAM in i.MX chips. ioremap_wc not work for TCM memory, I just tried that. I am thinking we change memset/memcpy to use memset_io/memcpy_fromio/memcpy_toio for remoteproc common code, Thanks, Peng. > > regards > Suman