Daniel Axtens <dja@xxxxxxxxxx> writes: > Hi Mark, > > Thanks for your email - I'm very new to mm stuff and the feedback is > very helpful. > >>> +#ifndef CONFIG_KASAN_VMALLOC >>> int kasan_module_alloc(void *addr, size_t size) >>> { >>> void *ret; >>> @@ -603,6 +604,7 @@ void kasan_free_shadow(const struct vm_struct *vm) >>> if (vm->flags & VM_KASAN) >>> vfree(kasan_mem_to_shadow(vm->addr)); >>> } >>> +#endif >> >> IIUC we can drop MODULE_ALIGN back to PAGE_SIZE in this case, too. > > Yes, done. > >>> core_initcall(kasan_memhotplug_init); >>> #endif >>> + >>> +#ifdef CONFIG_KASAN_VMALLOC >>> +void kasan_cover_vmalloc(unsigned long requested_size, struct vm_struct *area) >> >> Nit: I think it would be more consistent to call this >> kasan_populate_vmalloc(). >> > > Absolutely. I didn't love the name but just didn't 'click' that populate > would be a better verb. > >>> +{ >>> + unsigned long shadow_alloc_start, shadow_alloc_end; >>> + unsigned long addr; >>> + unsigned long backing; >>> + pgd_t *pgdp; >>> + p4d_t *p4dp; >>> + pud_t *pudp; >>> + pmd_t *pmdp; >>> + pte_t *ptep; >>> + pte_t backing_pte; >> >> Nit: I think it would be preferable to use 'page' rather than 'backing', >> and 'pte' rather than 'backing_pte', since there's no otehr namespace to >> collide with here. Otherwise, using 'shadow' rather than 'backing' would >> be consistent with the existing kasan code. > > Not a problem, done. > >>> + addr = shadow_alloc_start; >>> + do { >>> + pgdp = pgd_offset_k(addr); >>> + p4dp = p4d_alloc(&init_mm, pgdp, addr); >>> + pudp = pud_alloc(&init_mm, p4dp, addr); >>> + pmdp = pmd_alloc(&init_mm, pudp, addr); >>> + ptep = pte_alloc_kernel(pmdp, addr); >>> + >>> + /* >>> + * we can validly get here if pte is not none: it means we >>> + * allocated this page earlier to use part of it for another >>> + * allocation >>> + */ >>> + if (pte_none(*ptep)) { >>> + backing = __get_free_page(GFP_KERNEL); >>> + backing_pte = pfn_pte(PFN_DOWN(__pa(backing)), >>> + PAGE_KERNEL); >>> + set_pte_at(&init_mm, addr, ptep, backing_pte); >>> + } >> >> Does anything prevent two threads from racing to allocate the same >> shadow page? >> >> AFAICT it's possible for two threads to get down to the ptep, then both >> see pte_none(*ptep)), then both try to allocate the same page. >> >> I suspect we have to take init_mm::page_table_lock when plumbing this >> in, similarly to __pte_alloc(). > > Good catch. I think you're right, I'll add the lock. > >>> + } while (addr += PAGE_SIZE, addr != shadow_alloc_end); >>> + >>> + kasan_unpoison_shadow(area->addr, requested_size); >>> + requested_size = round_up(requested_size, KASAN_SHADOW_SCALE_SIZE); >>> + kasan_poison_shadow(area->addr + requested_size, >>> + area->size - requested_size, >>> + KASAN_VMALLOC_INVALID); >> >> IIUC, this could leave the final portion of an allocated page >> unpoisoned. >> >> I think it might make more sense to poison each page when it's >> allocated, then plumb it into the page tables, then unpoison the object. >> >> That way, we can rely on any shadow allocated by another thread having >> been initialized to KASAN_VMALLOC_INVALID, and only need mutual >> exclusion when allocating the shadow, rather than when poisoning >> objects. I've come a bit unstuck on this one. If a vmalloc address range is reused, we can end up with the following sequence: - p = vmalloc(PAGE_SIZE) allocates ffffc90000000000 - kasan_populate_shadow allocates a shadow page, fills it with KASAN_VMALLOC_INVALID, and then unpoisions PAGE_SIZE >> KASAN_SHADOW_SHIFT_SIZE bytes - vfree(p) - p = vmalloc(3000) also allocates ffffc90000000000 because of address reuse in vmalloc. - Now kasan_populate_shadow doesn't allocate a page, so does no poisioning. - kasan_populate_shadow unpoisions 3000 >> KASAN_SHADOW_SHIFT_SIZE bytes, but the PAGE_SIZE-3000 extra bytes are still unpoisioned, so accesses that are out-of-bounds for the 3000 byte allocation are missed. So I think we do need to poision the shadow of the [requested_size, area->size) region each time. However, I don't think we need mutual exclusion to be able to do this safely. I think the safety is guaranteed by vmalloc not giving the same page to multiple allocations. Because no two threads are going to get overlapping vmalloc/vmap allocations, their shadow ranges are not going to overlap, and so they're not going to trample over each other. I think it's probably still worth poisioning the pages on allocation: for one thing, you are right that part of the shadow page will not be poisioned otherwise, and secondly it means you migh get a kasan splat before you get a page-not-present fault if you access beyond an allocation, at least if the shadow happens to fall helpfully within an already-allocated page. v3 to come soon. Regards, Daniel > > Yes, that makes sense, will do. > > Thanks again, > Daniel