On Sat, 2010-03-06 at 16:17 +0530, James Bottomley wrote:

> On a fault in of exec data, we first try to get the page out of the page
> cache. If it's not present, we put the faulting process to sleep and
> fetch it in from storage. When we do the read, on the PIO path, the
> kernel alias for the page becomes dirty. Some time later, we place the
> page into the user space (updating the pte entry that caused a fault).
> At this point, we'll call both flush_icache_page() and
> update_mmu_cache() ... this is where the I/D resolution should be done.
> Since it's after any I/O has occurred, it doesn't matter whether the CPU
> speculatively moved anything in or not. As long as you flush the kernel
> alias and invalidate the user I and D aliases, we're good to go. Using
> the page arch flags is really only to optimise this process (defer
> kernel D alias flushing).

Ok, so while flush_icache_page() looks like something we could use
instead of set_pte_at() for the icache flushing, it doesn't answer all
the questions. Off the top of my head:

 - I see the calls to flush_icache_page() in mm/memory.c but I don't see
   them next to every set_pte_at() that inserts a valid PTE. For example,
   we don't flush the icache for anonymous pages. While that might seem
   like a good idea, we have been under pressure to "fix" that on powerpc
   to make sure there is no stale icache content from another process
   leaking into userspace.

 - It needs to be done -before- set_pte_at() but I think the code does it
   right, only your explanation above makes it unclear :-)

 - It doesn't take the PTE pointer as an argument, so there goes our
   trick on powerpc of filtering out exec permission rather than flushing
   when a page is accessed by a read fault.

 - We -still- have the problem of tracking whether the icache has been
   flushed or not yet for a given physical page on archs with PIPT (or
   non-aliasing VIPT) like powerpc.
   Without that tracking, we flush a lot more than necessary, since
   we'll end up flushing things like glibc text pages for every process
   they are mapped into, which is totally wasteful.

Thus the idea of using a new PG bit to separate D$ from I$ tracking
still makes sense.

Cheers,
Ben.
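
To make the PG bit idea a bit more concrete, here is a rough user-space
sketch in C of per-page tracking with separate D$ and I$ state. It is
only an illustration under assumptions: the sk_* names, the two flag
bits and the counters are all made up for this example and are not
kernel APIs, and the "flushes" are just counters standing in for the
real cache maintenance operations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-page state bits; the names are invented for this sketch. */
enum {
    SK_DCACHE_DIRTY = 1u << 0, /* kernel D alias may hold unflushed data */
    SK_ICACHE_CLEAN = 1u << 1  /* I$ is known coherent for this page */
};

/* Stand-in for struct page: only the flag word matters here. */
struct sk_page {
    unsigned int flags;
};

/* Counters standing in for the real (expensive) cache operations. */
static unsigned int sk_dcache_flushes;
static unsigned int sk_icache_flushes;

/* The kernel wrote the page through its own alias, e.g. a PIO read. */
static void sk_page_written_by_kernel(struct sk_page *pg)
{
    assert(pg != NULL);
    pg->flags |= SK_DCACHE_DIRTY;   /* D side now needs a writeback */
    pg->flags &= ~SK_ICACHE_CLEAN;  /* any I$ content is now stale */
}

/*
 * Meant to run -before- the (imaginary) set_pte_at() that makes the
 * page visible to user space. Returns true if any flush was done.
 */
static bool sk_prepare_mapping(struct sk_page *pg, bool exec)
{
    bool flushed = false;

    assert(pg != NULL);

    /* Deferred D$ flush: done at most once per kernel write. */
    if (pg->flags & SK_DCACHE_DIRTY) {
        sk_dcache_flushes++;
        pg->flags &= ~SK_DCACHE_DIRTY;
        flushed = true;
    }

    /* I$ flush only for exec mappings, and only if not already clean. */
    if (exec && !(pg->flags & SK_ICACHE_CLEAN)) {
        sk_icache_flushes++;
        pg->flags |= SK_ICACHE_CLEAN;
        flushed = true;
    }

    return flushed;
}
```

With this, mapping the same text page executable into several processes
costs one I$ flush in total rather than one per mapping, and a page that
starts with no flags set (think of a fresh anonymous page) still gets its
I$ flushed the first time it is mapped executable.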