On Fri, Jan 18, 2008 at 08:41:22AM -0800, Linus Torvalds wrote: > > > On Fri, 18 Jan 2008, npiggin@xxxxxxx wrote: > > */ > > +#ifdef __HAVE_ARCH_PTE_SPECIAL > > +# define HAVE_PTE_SPECIAL 1 > > +#else > > +# define HAVE_PTE_SPECIAL 0 > > +#endif > > struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr, pte_t pte) > > { > > - unsigned long pfn = pte_pfn(pte); > > + unsigned long pfn; > > + > > + if (HAVE_PTE_SPECIAL) { > > I really don't think this is *any* different from "#ifdefs in code". > > #ifdef's in code is not about syntax, it's about abstraction. This is > still the exact same thing as having an #ifdef around it, and in many ways > it is *worse*, because now it's just made to look somewhat different with > a particularly ugly #ifdef. > > IOW, this didn't abstract the issue away, it just massaged it to look > different. Yes, the if () is just to please Andrew, not you ;) I thought in your last mail on the subject, that you had conceded the vma-based scheme should stay, so I might have misunderstood that to mean you would, reluctantly, go with the scheme. I guess I need to try a bit harder ;) > I suspect that the nicest abstraction would be to simply make the whole > function be a per-architecture thing. Not exposing a "pte_special()" bit > at all, but instead having the interface simply be: > > - create special entries: > pte_t pte_mkspecial(pte_t pte) > > - check if an entry is special: > struct page *vm_normal_page(vma, addr, pte) > > and now it's not while the naming is a bit odd (for historical reasons), > at least it is properly *abstracted* and you don't have any #ifdef's in > code (and we'd probably need to extend that abstraction then for the > "locklessly look up page" case eventually). Now I would have done this in a flash, except the existing vm_normal_page code is quite a lot, and complex, to duplicate in every architecture. > [ To make it slightly more regular, we could make "pte_mkspecial()" take > the vma/addr thing too, even though it would never really use it except > to perhaps have a VM_BUG_ON() that it only happens within XIP/PFNMAP > vma's. > > The "pte_mkspecial()" definitely has more to to with "vm_normal_page()" > than with the other "pte_mkxyzzy()" functions, so it really might make > sense to instead make the thing > > void set_special_page(vma, addr, pte_t *, pfn, pgprot) > > because it is never acceptable to do "pte_mkspecial()" on any existent > PTE *anyway*, so we might as well make the interface reflect that: it's > not that you make a pte "special", it's that you insert a special page > into the VM. > > So the operation really conceptually has more to do with "set_pte()" > than with "pte_mkxxx()", no? ] Possibly, although I think going that far is hiding things from mm/ a bit much. If you have a look at the places that call pte_mkspecial, it isn't too much I think... > Then, just have a library version of the long form, and make architectures > that don't support it just use that (just so that you don't have to > duplicate that silly thing). So an architecture that support special page > flags would do somethiing like > > #define set_special_page(vma,addr,ptep,pfn,prot) \ > set_pte_at(vma, addr, ptep, mk_special_pte(pfn,prot)) > #define vm_normal_page(vma,addr,pte) > (pte_special(pte) ? NULL : pte_page(pte)) > > and other architectures would just do > > #define set_special_page(vma,addr,ptep,pfn,prot) \ > set_pte_at(vma, addr, ptep, mk_pte(pfn,prot)) > #define vm_normal_page(vma,addr,pte) \ > generic_vm_normal_page(vma,addr,pte) > > or something. > > THAT is what I mean by "no #ifdef's in code" - that the selection is done > at a higher level, the same way we have good interfaces with clear > *conceptual* meaning for all the other PTE accessing stuff, rather than > have conditionals in the architecture-independent code. OK, that gets around the "duplicate vm_normal_page everywhere" issue I had. I'm still not quite happy with it ;) How about taking a different approach. How about also having a pte_normal() function. Each architecture that has a pte special bit would make this !pte_special, and those that don't would return 0. They return 0 from both pte_special and pte_normal because they don't know whether the pte is special or normal. Then vm_normal_page would become: if (pte_special(pte)) return NULL; else if (pte_normal(pte)) return pte_page(pte); ... /* vma based scheme */ - To unsubscribe from this list: send the line "unsubscribe linux-arch" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html