Hi Catalin,

On 2020/7/11 2:31, Catalin Marinas wrote:
> On Fri, Jul 10, 2020 at 05:44:20PM +0800, Zhenyu Ye wrote:
>> -       if ((end - start) >= (MAX_TLBI_OPS * stride)) {
>> +       if ((!cpus_have_const_cap(ARM64_HAS_TLBI_RANGE) &&
>> +           (end - start) >= (MAX_TLBI_OPS * stride)) ||
>> +           pages >= MAX_TLBI_RANGE_PAGES) {
>>                 flush_tlb_mm(vma->vm_mm);
>>                 return;
>>         }
>
> I think we can use strictly greater here rather than greater or equal.
> MAX_TLBI_RANGE_PAGES can be encoded as num 31, scale 3.

Sorry, we can't.
For a boundary value (such as 2^6), we have two ways to express it
in TLBI RANGE operations:
        1. scale = 0, num = 31.
        2. scale = 1, num = 0.

I used the second way in the following implementation.  However, for
MAX_TLBI_RANGE_PAGES, we can only use scale = 3, num = 31.
So if we use strictly greater here, an error will happen when the
number of range pages equals MAX_TLBI_RANGE_PAGES.

There are two ways to avoid this bug:
1. Just keep 'greater or equal' here.  The ARM64 specification does
not specify how we flush tlb entries in this case, and flush_tlb_mm()
is also a good choice for such a wide range of pages.

2. Add a check in the loop, like this (it may make the code a bit ugly):

        num = __TLBI_RANGE_NUM(pages, scale) - 1;

        /* scale = 4, num = 0 is equal to scale = 3, num = 31. */
        if (scale == 4 && num == 0) {
                scale = 3;
                num = 31;
        }

        if (num >= 0) {
        ...

Which one do you prefer, and how do you want to fix this error?  Just
another fix patch?

>
>>
>> -       /* Convert the stride into units of 4k */
>> -       stride >>= 12;
>> +       dsb(ishst);
>>
>> -       start = __TLBI_VADDR(start, asid);
>> -       end = __TLBI_VADDR(end, asid);
>> +       /*
>> +        * When cpu does not support TLBI RANGE feature, we flush the tlb
>> +        * entries one by one at the granularity of 'stride'.
>> +        * When cpu supports the TLBI RANGE feature, then:
>> +        * 1. If pages is odd, flush the first page through non-RANGE
>> +        *    instruction;
>> +        * 2. For remaining pages: The minimum range granularity is decided
>> +        *    by 'scale', so we can not flush all pages by one instruction
>> +        *    in some cases.
>> +        *    Here, we start from scale = 0, flush corresponding pages
>> +        *    (from 2^(5*scale + 1) to 2^(5*(scale + 1) + 1)), and increase
>> +        *    it until no pages left.
>> +        */
>> +       while (pages > 0) {
>
> I did some simple checks on ((end - start) % stride) and never
> triggered. I had a slight worry that pages could become negative (and
> we'd loop forever since it's unsigned long) for some mismatched stride
> and flush size. It doesn't seem like.
>

The start and end are round_down/up in the function:

        start = round_down(start, stride);
        end = round_up(end, stride);

So the flush size and stride will never mismatch.

Thanks,
Zhenyu