Hi, I'm working on Android Linux Kernel (version 3.4) and seeing a "deadlock" in the ashmem driver, while handling mmap request. I seek your support to fix the same. The locks that involved in the dead lock are 1) mm->mmap_sem 2) ashmem_mutex The following is the sequence of events that leads to the deadlock. There are two threads A and B that belong to the same process (system_server) and hence share the mm struct. A1) In the A's context an mmap system call is made with an fd of ashmem A2) The system call sys_mmap_pgoff acquires the mmap_sem of the "mm" and sleeps before calling the .mmap of ashmem i.e before calling ashmem_mmap and acquiring the ashmem_mutex from this function. Now the thread B runs and proceeds to do the following B1) In the B's context ashmem ioctl with option ASHMEM_GET_NAME is called. B2) Now the code proceeds to acquire the ashmem_mutex and performs a "copy_to_user" B3) copy_to_user raises a valid exception to copy the data from user space and proceeds to handle it gracefully, do_DataAbort --> do_page_fault (This condition can happen if the physical page for the give user address is not mapped yet). B4) In do_page_fault it finds that the mm->mmap_sem is not available (Note A & B share the mm) since A has it and sleeps Now the thread A runs again A3) It proceeds to call ashmem_mmap and tries to acquired ashmem_mutex, which is not available (is with B) and sleeps. Now A has acquired mmap_sem and waits for B to release ashmem_mutex B has acquired ashmem_mutex and waits for the mmap_sem to be available, which is held by A. This creates a dead lock in the system. Proposed Solution: Do not hold the ashmem lock while making copy_to_user/copy_from_user calls. At the same ensure that data integrity is maintained (two threads calling SET_NAME/GET_NAME should not lead to issues). I have attached a patch to fix this problem. How to Reproduce the problem: In normal circumstances it is very hard to see this. The problem was seen during regression tests, but was very rare. My Kernel Version is 3.4.0 and I managed to write a unit test case to create this issue almost 100% of the time. Testing: Used the same unit test case to ensure that the attached patch fixes the deadlock in Kernel Version 3.4 Ported the same patch to 3.8 for submission. Please provide your feedback on the patch. Warm Regards, Shankar
Attachment:
0001-staging-android-ashmem-get_name-set_name-not-to-hold.patch
Description: Binary data
_______________________________________________ devel mailing list devel@xxxxxxxxxxxxxxxxxxxxxx http://driverdev.linuxdriverproject.org/mailman/listinfo/devel