* Ingo Molnar <mingo@xxxxxxxxxx> wrote:

> [
>   task_numa_work() performance side note:
>
>   We are also *very* close to being able to use down_read() instead
>   of down_write() in the sampling-unmap code in
>   task_numa_work(), as it should be safe in theory to call
>   change_protection(PROT_NONE) in parallel - but there's one
>   regression that disagrees with this theory so we use
>   down_write() at the moment.
>
>   Maybe you could help us there: can you see a reason why the
>   change_prot_none()->change_protection() call in
>   task_numa_work() can not occur in parallel to a page fault in
>   another thread on another CPU? It should be safe - yet if we
>   change it I can see occasional corruption of user-space state:
>   segfaults and register corruption.
> ]

Oh, just found the reason: the
ptep_modify_prot_start()/modify()/commit() sequence is SMP-unsafe -
it has to be done with the mmap_sem write-locked. It is safe against
*hardware* updates to the PTE, but not safe against itself.

This is apparently a hidden cost of paravirt: it forces that weird
sequence and thus the down_write() ...

Thanks,

	Ingo
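
To make the "not safe against itself" point concrete, here is a small user-space sketch. It is not kernel code. It assumes the start step fetches the PTE and leaves the slot cleared until commit, which is what would keep hardware updates out of the window. The names toy_modify_prot_start(), toy_make_protnone(), toy_modify_prot_commit() and the two replay helpers are hypothetical stand-ins, and the PTE layout is invented for illustration. The interleaving is replayed deterministically on one thread rather than raced for real.

```c
#include <stdint.h>

/* Toy PTE: page-frame bits plus two made-up flag bits (not a real layout). */
typedef uint64_t pte_t;
#define PTE_PRESENT  0x1ULL
#define PTE_PROTNONE 0x2ULL

/* Assumed model of the start step: fetch the PTE and leave the slot empty. */
static inline pte_t toy_modify_prot_start(pte_t *ptep)
{
	pte_t old = *ptep;

	*ptep = 0;
	return old;
}

/* The "modify" step: drop the present bit, mark the entry PROT_NONE. */
static inline pte_t toy_make_protnone(pte_t pte)
{
	return (pte & ~PTE_PRESENT) | PTE_PROTNONE;
}

/* The commit step: write the modified value back into the slot. */
static inline void toy_modify_prot_commit(pte_t *ptep, pte_t pte)
{
	*ptep = pte;
}

/*
 * Two updaters, A and B, overlap on the same slot:
 * A starts, B starts, A commits, B commits.
 * B's start observes the slot A has already cleared.
 */
pte_t replay_interleaved(pte_t initial)
{
	pte_t slot = initial;
	pte_t a = toy_modify_prot_start(&slot);	/* A sees the real PTE  */
	pte_t b = toy_modify_prot_start(&slot);	/* B sees an empty slot */

	toy_modify_prot_commit(&slot, toy_make_protnone(a));
	toy_modify_prot_commit(&slot, toy_make_protnone(b));	/* overwrites A */
	return slot;
}

/* The same two updates, run strictly one after the other. */
pte_t replay_serialized(pte_t initial)
{
	pte_t slot = initial;
	pte_t a = toy_modify_prot_start(&slot);

	toy_modify_prot_commit(&slot, toy_make_protnone(a));

	pte_t b = toy_modify_prot_start(&slot);

	toy_modify_prot_commit(&slot, toy_make_protnone(b));
	return slot;
}
```

In this model, replay_interleaved(0x1234000 | PTE_PRESENT) ends with only the PROT_NONE flag set and the page-frame bits gone, while replay_serialized() on the same input keeps them. Whether the real corruption takes exactly this path is not established here; the sketch only shows why a clear-then-commit sequence needs its users serialized against each other.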