I recently discovered that commit 662bbcb2747c2422cf98d3d97619509379eee466 removed/skipped all might_sleep checks for might_fault() calls when in atomic. Reason was to allow calling copy_(to|from)_user while in pagefault_disabled(), because otherwise, CONFIG_DEBUG_ATOMIC_SLEEP would find false positives. However we have the inatomic variants of these function for this purpose. The result of this change was that all guest access (that correctly uses might_fault()) doesn't perform atomic checks when CONFIG_DEBUG_ATOMIC_SLEEP is enabled. We lost a mighty debugging feature for user access. As I wasn't able to come up with any other reason why this should be necessary, I suggest turning the might_sleep() checks on again in the might_fault() code. pagefault_disable/pagefault_enable seems to be used mainly for futex, perf event and kmap. Going over all changes since that commit, it seems like most code already uses the inatomic variants of copy_(to|from)_user. Code relying on (get|put)_user during pagefault_disable() don't make use of any might_fault() in their (get|put)_user implementation. Examples: - arch/m68k/include/asm/futex.h - arch/parisc/include/asm/futex.h - arch/sh/include/asm/futex-irq.h - arch/tile/include/asm/futex.h So changing might_fault() back to trigger might_sleep() won't change a thing for them. Hope I haven't missed anything. I only identified one might_fault() that would get triggered by get_user() on powerpc and fixed it by using the inatomic variant (not tested). I am not sure if we need some non-sleeping access_ok() prior to __get_user_inatomic(). By looking at the code I was wondering where the correct place for might_fault() calls is? Doesn't seem to be very consistent. Examples: | asm-generic | arm | arm64 | frv | m32r | x86 and s390 --------------------------------------------------------------------------- get_user() | Yes | Yes | Yes | No | Yes | Yes __get_user() | No | Yes | No | No | Yes | No put_user() | Yes | Yes | Yes | No | Yes | Yes __put_user() | No | Yes | No | No | Yes | No copy_to_user() | Yes | No | No | Yes | Yes | Yes __copy_to_user() | No | No | No | Yes | No | No copy_from_user() | Yes | No | No | Yes | Yes | Yes __copy_from_user() | No | No | No | Yes | No | No So I would have assume that the way asm-generic, x86 and s390 (+ propably others) do this is the right way? So we can speed up multiple calls to e.g. copy_to_user() by doing the access check manually (and also the might_fault() if relevant), then calling __copy_to_user(). So in general, I conclude that the concept is: 1. __.* variants perform no checking and don't call might_fault() 2. [a-z].* variants perform access checking (access_ok() if implemented) 3. [a-z].* variants call might_fault() 4. .*_inatomic variants usually don't perform access checks 5. .*_inatomic variants don't call might_fault() 6. If common code uses the __.* variants, it has to trigger access_ok() and call might_fault() 7. For pagefault_disable(), the inatomic variants are to be used Comments? Opinions? David Hildenbrand (2): powerpc/fsl-pci: atomic get_user when pagefault_disabled mm, sched: trigger might_sleep() in might_fault() when atomic arch/powerpc/sysdev/fsl_pci.c | 2 +- include/linux/kernel.h | 8 ++++++-- mm/memory.c | 11 ++++------- 3 files changed, 11 insertions(+), 10 deletions(-) -- 1.8.5.5 -- To unsubscribe from this list: send the line "unsubscribe linux-arch" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html