On Fri, Apr 17, 2020 at 05:42:44PM +0300, Kirill A. Shutemov wrote: > On Fri, Apr 17, 2020 at 02:18:28PM +0200, Peter Zijlstra wrote: > Do you mean that it is not aligned to '(' on the previous line? > > Okay, I'll fix it up (and change my vim setup). But I find indenting with > spaces disgusting. set cino=:0(0 > > > static void cpa_flush(struct cpa_data *data, int cache) > > > { > > > + LIST_HEAD(pgtables); > > > + struct page *page, *tmp; > > > > xmas fail > > Emm. What are rules here? > > > > struct cpa_data *cpa = data; > > > unsigned int i; Basically we (tip) strive for the inverse xmas tree thing, so here that would be: struct cpa_data *cpa = data; + struct page *page, *tmp; + LIST_HEAD(pgtables); unsigned int i; > > > + pr_debug("2M restored at %#lx\n", addr); > > > > While I appreciate it's usefulness while writing this, I do think we can > > do without that print once we know it works. > > Even with pr_debug()? I shouldn't be noisy for most users. I always have debug on; also there is no counterpart on split either. > > > +/* > > > + * Restore PMD and PUD pages in the kernel mapping around the address where > > > + * possible. > > > + * > > > + * Caller must flush TLB and free page tables queued on the list before > > > + * touching the new entries. CPU must not see TLB entries of different size > > > + * with different attributes. > > > + */ > > > +static void restore_large_pages(unsigned long addr, struct list_head *pgtables) > > > +{ > > > + pgd_t *pgd; > > > + p4d_t *p4d; > > > + pud_t *pud; > > > + > > > + addr &= PUD_MASK; > > > + > > > + spin_lock(&pgd_lock); > > > + pgd = pgd_offset_k(addr); > > > + if (pgd_none(*pgd)) > > > + goto out; > > > + p4d = p4d_offset(pgd, addr); > > > + if (p4d_none(*p4d)) > > > + goto out; > > > + pud = pud_offset(p4d, addr); > > > + if (!pud_present(*pud) || pud_large(*pud)) > > > + goto out; > > > + > > > + restore_pud_page(pud, addr, pgtables); > > > +out: > > > + spin_unlock(&pgd_lock); > > > +} > > > > I find this odd, at best. AFAICT this does not attempt to reconstruct a > > PMD around @addr when possible. When the first PMD of the PUD can't be > > reconstructed, we give up entirely. > > No, you misread. If the first PMD is not suitable, we give up > reconstructing PUD page, but we still look at all PMD of the PUD range. Aah, indeed. I got my brain in a twist reading that pud function. > But this might be excessive. What you are proposing below should be fine > and more efficient. If we do everything right the rest of PMDs in the PUD > have to already large or not suitable for reconstructing. Just so. > We might still look at the rest of PMDs for CONFIG_CPA_DEBUG, just to make > sure we don't miss some corner case where we didn't restore a PMD. > > (Also I think about s/restore/reconstruct/g) Right, and WARN if they do succeed collapsing ;-)