> On Mon, Jun 30, 2014 at 08:16:23PM +0200, Joerg Roedel wrote: > > On Mon, Jun 30, 2014 at 12:06:05PM -0400, Jerome Glisse wrote: > > > No this patch does not duplicate it. Current user of mmu_notifier > > > rely on file close code path to call mmu_notifier_unregister. New > > > code like AMD IOMMUv2 or HMM can not rely on that. Thus they need a > > > way to call the mmu_notifer_unregister (which can not be done from > > > inside the the release call back). > > > > No, when the mm is destroyed the .release function is called from > > exit_mmap() which calls mmu_notifier_release() right at the beginning. > > In this case you don't need to call mmu_notifer_unregister yourself > > (you can still call it, but it will be a nop). > > > > We do intend to tear down all secondary mapping inside the relase callback but > still we can not cleanup all the resources associated with it. > > > > If you look at current code the release callback is use to kill > > > secondary translation but not to free associated resources. All the > > > associated resources are free later on after the release callback > > > (well it depends if the file is close before the process is kill). > > > > In exit_mmap the .release function is called when all mappings are > > still present. Thats the perfect point in time to unbind all those > > resources from your device so that it can not use it anymore when the > > mappings get destroyed. > > > > > So this patch aims to provide a callback to code outside of the > > > mmu_notifier realms, a place where it is safe to cleanup the > > > mmu_notifier and associated resources. > > > > Still, this is a duplication of mmu_notifier release call-back, so > > still NACK. > > > > It is not, mmu_notifier_register take increase mm_count and only > mmu_notifier_unregister decrease the mm_count which is different from the > mm_users count (the latter being the one that trigger an mmu notifier > release). > > As said from the release call back you can not call mmu_notifier_unregister > and thus you can not fully cleanup things. Only way to achieve so is to do it > ouside mmu_notifier callback. As pointed out current user do not have this > issue because they rely on file close callback to perform the cleanup operation. > New user will not necessarily have such things to rely on. Hence factorizing > various mm_struct destruction callback with an callback chain. > > If you know any other way to call mmu_notifier_unregister before the end of > mmput function than i am all ear. I am not adding this call back just for the fun > of it i spend serious time trying to find a way to do thing without it. I might have > miss a way so if i did please show it to me. > Joerg, please consider that the amd_iommu_v2 driver already breaks the rules for what can be done from the release callback. In particular, it frees the pasid_state structure containing the struct mmu_notifier. (mn_release, unbind_pasid, put_pasid_state_wait, free_pasid_state). Since this contains the next pointer for the mmu_notifier list, __mmu_notifier_release will crash. Modifying the MMU notifier list isn't allowed because the notifier code is holding an RCU read lock. In general the problem is that RCU read locks are very constraining and things that you'd like to do from release can't be done. It could be done from the mmput callback, or perhaps mmu_notifier_release could call release from call_srcu instead. As an aside we found another small issue: amd_iommu_bind_pasid calls get_task_mm. This bumps the mm_struct use count and it will never be released. This would prevent the buggy code path described above from ever running in the first place. Thanks. Andrew -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href