On 2021/2/1 16:31, Arnd Bergmann wrote: > On Mon, Feb 1, 2021 at 4:36 AM Zhen Lei <thunder.leizhen@xxxxxxxxxx> wrote: >> >> Add support for the Hisilicon Kunpeng L3 cache controller as used with >> Kunpeng506 and Kunpeng509 SoCs. >> >> These Hisilicon SoCs support LPAE, so the physical addresses is wider than >> 32-bits, but the actual bit width does not exceed 36 bits. When the cache >> operation is performed based on the address range, the upper 30 bits of >> the physical address are recorded in registers L3_MAINT_START and >> L3_MAINT_END, and ignore the lower 6 bits cacheline offset. >> >> Signed-off-by: Zhen Lei <thunder.leizhen@xxxxxxxxxx> > > Reviewed-by: Arnd Bergmann <arnd@xxxxxxxx> > > If you add one more thing: > >> +static void l3cache_maint_common(u32 range, u32 op_type) >> +{ >> + u32 reg; >> + >> + reg = readl_relaxed(l3_ctrl_base + L3_MAINT_CTRL); >> + reg &= ~(L3_MAINT_RANGE_MASK | L3_MAINT_TYPE_MASK); >> + reg |= range | op_type; >> + reg |= L3_MAINT_STATUS_START; >> + writel(reg, l3_ctrl_base + L3_MAINT_CTRL); >> + >> + /* Wait until the hardware maintenance operation is complete. */ >> + do { >> + cpu_relax(); >> + reg = readl(l3_ctrl_base + L3_MAINT_CTRL); >> + } while ((reg & L3_MAINT_STATUS_MASK) != L3_MAINT_STATUS_END); >> +} >> + >> +static void l3cache_maint_range(phys_addr_t start, phys_addr_t end, u32 op_type) >> +{ >> + start = start >> L3_CACHE_LINE_SHITF; >> + end = ((end - 1) >> L3_CACHE_LINE_SHITF) + 1; >> + >> + writel_relaxed(start, l3_ctrl_base + L3_MAINT_START); >> + writel_relaxed(end, l3_ctrl_base + L3_MAINT_END); >> + >> + l3cache_maint_common(L3_MAINT_RANGE_ADDR, op_type); >> +} > > As mentioned, I'd like to see a code comment that explains the use > the of relaxed() vs non-relaxed MMIO accessors, as it will be impossible > for a reader to later understand why you picked a mix of the two, > and it also ensures that you have considered which one is the best > option to use here and that your explanation matches what you do. OK, I'll test the performance and add the comment. > > Based on Russell's comments, I had expected that you would use > only relaxed accessors, plus explicit barriers if you change it, matching > what l2x0 does (l2x0 has to do it because of __l2c210_cache_sync(), > while you don't have a sync callback and don't need to). I might have been a little conservative, I'll change all of them to _relaxed and then test it. Thanks. > > Arnd > > . >