Re: NFS lockdep lock misordering mmap_sem<->i_mutex_key with 2.6.32-git1

(cc to linux-fsdevel)

2009/12/7 Andi Kleen <andi@xxxxxxxxxxxxxx>:
> On Mon, Dec 07, 2009 at 09:19:28PM +0900, KOSAKI Motohiro wrote:
>> >
>> > While booting 2.6.32-git1 on a NFS root box I got the following
>> > lockdep warning early at boot. I haven't looked at details.
>>
>> It seems to be a typical ABBA deadlock.
>>
>>  vfs_readdir                          [grab i_mutex]
>>    nfs_readdir
>>      nfs_do_filldir
>>        filldir
>>          copy_to_user
>>            [page_fault]                       [grab mmap_sem]
>>
>>  sys_mmap                             [grab mmap_sem]
>>    do_mmap_pgoff
>>      mmap_region
>>        nfs_file_mmap
>>          nfs_revalidate_mapping
>>            nfs_invalidate_mapping     [grab i_mutex]
>>
>> I guess the recent lockdep improvements found an old bug.
>
> Thanks for the analysis.
>
> I guess we should never do copy_*_user while holding i_mutex? There might
> be lots of cases like that.
>
> -Andi

I'm not sure of the exact VFS rule, but at least mm/rmap.c documents
the correct order as i_mutex -> mmap_sem.

rmap.c
---------------------------------------------------------------------
 * Lock ordering in mm:
 *
 * inode->i_mutex       (while writing or truncating, not reading or faulting)
 *   inode->i_alloc_sem (vmtruncate_range)
 *   mm->mmap_sem
 *     page->flags PG_locked (lock_page)
 *       mapping->i_mmap_lock
 *         anon_vma->lock
 *           mm->page_table_lock or pte_lock
 *             zone->lru_lock (in mark_page_accessed, isolate_lru_page)
 *             swap_lock (in swap_duplicate, swap_info_get)
 *               mmlist_lock (in mmput, drain_mmlist and others)
 *               mapping->private_lock (in __set_page_dirty_buffers)
 *               inode_lock (in set_page_dirty's __mark_inode_dirty)
 *                 sb_lock (within inode_lock in fs/fs-writeback.c)
 *                 mapping->tree_lock (widely used, in set_page_dirty,
 *                           in arch-dependent flush_dcache_mmap_lock,
 *                           within inode_lock in __sync_single_inode)
-------------------------------------------------------------------------------------------------


Also, ext4 has the following comment, which implies that the NFS mmap
implementation is wrong:

--------------------------------------------------------------------------------------
int ext4_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
{
        struct page *page = vmf->page;
        loff_t size;
        unsigned long len;
        int ret = -EINVAL;
        void *fsdata;
        struct file *file = vma->vm_file;
        struct inode *inode = file->f_path.dentry->d_inode;
        struct address_space *mapping = inode->i_mapping;

        /*
         * Get i_alloc_sem to stop truncates messing with the inode. We cannot
         * get i_mutex because we are already holding mmap_sem.
         */