(cc to linux-fsdevel)

2009/12/7 Andi Kleen <andi@xxxxxxxxxxxxxx>:
> On Mon, Dec 07, 2009 at 09:19:28PM +0900, KOSAKI Motohiro wrote:
>> >
>> > While booting 2.6.32-git1 on a NFS root box I got the following
>> > lockdep warning early at boot. I haven't looked at details.
>>
>> It seems typical ABBA deadlock.
>>
>> vfs_readdir [grab i_mutex]
>> nfs_readdir
>> nfs_do_filldir
>> filldir
>> copy_to_user
>> [page_fault] [grab mmap_sem]
>>
>> sys_mmap [grab mmap_sem]
>> do_mmap_pgoff
>> mmap_region
>> nfs_file_mmap
>> nfs_revalidate_mapping
>> nfs_invalidate_mapping [grab i_mutex]
>>
>> I guess recent lockdep improvement find old bug.
>
> Thanks for the analysis.
>
> I guess should never do copy_*_user while holding i_mutex? There might
> be lots of cases like that.
>
> -Andi

I'm not sure of the exact VFS rule, but at least mm/rmap.c documents
the correct order as i_mutex -> mmap_sem:

rmap.c
---------------------------------------------------------------------
 * Lock ordering in mm:
 *
 * inode->i_mutex (while writing or truncating, not reading or faulting)
 *   inode->i_alloc_sem (vmtruncate_range)
 *   mm->mmap_sem
 *     page->flags PG_locked (lock_page)
 *       mapping->i_mmap_lock
 *         anon_vma->lock
 *           mm->page_table_lock or pte_lock
 *             zone->lru_lock (in mark_page_accessed, isolate_lru_page)
 *             swap_lock (in swap_duplicate, swap_info_get)
 *               mmlist_lock (in mmput, drain_mmlist and others)
 *               mapping->private_lock (in __set_page_dirty_buffers)
 *               inode_lock (in set_page_dirty's __mark_inode_dirty)
 *                 sb_lock (within inode_lock in fs/fs-writeback.c)
 *                 mapping->tree_lock (widely used, in set_page_dirty,
 *                           in arch-dependent flush_dcache_mmap_lock,
 *                           within inode_lock in __sync_single_inode)
---------------------------------------------------------------------

Plus, ext4 has the following comment, which implies that the NFS mmap
implementation is wrong:
---------------------------------------------------------------------
int ext4_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
{
	struct page *page = vmf->page;
	loff_t size;
	unsigned long len;
	int ret = -EINVAL;
	void *fsdata;
	struct file *file = vma->vm_file;
	struct inode *inode = file->f_path.dentry->d_inode;
	struct address_space *mapping = inode->i_mapping;

	/*
	 * Get i_alloc_sem to stop truncates messing with the inode. We cannot
	 * get i_mutex because we are already holding mmap_sem.
	 */
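
The two call chains quoted above take the same pair of locks in opposite
orders. As a rough illustration of why that gets flagged, here is a small
user-space sketch that records which lock was held when another was taken
and reports when a later path reverses a recorded order. This is only a toy
under my own assumptions: the enum labels and the checker_* helpers are
hypothetical names, and it is not how lockdep is actually implemented.

```c
#include <stdbool.h>
#include <string.h>

/* Labels for illustration only; these are not kernel identifiers. */
enum lock_class { I_MUTEX, MMAP_SEM, NR_CLASSES };

struct order_checker {
	/* Locks held by the code path currently being replayed. */
	bool held[NR_CLASSES];
	/* before[a][b] is true once a has been held while b was taken. */
	bool before[NR_CLASSES][NR_CLASSES];
};

static void checker_init(struct order_checker *c)
{
	memset(c, 0, sizeof(*c));
}

/*
 * Record that `lock` is being taken. Returns true if some currently held
 * lock was previously taken *after* `lock`, i.e. the order is inverted.
 */
static bool checker_acquire(struct order_checker *c, enum lock_class lock)
{
	bool inversion = false;

	for (int h = 0; h < NR_CLASSES; h++) {
		if (!c->held[h] || h == (int)lock)
			continue;
		if (c->before[lock][h])
			inversion = true;	/* lock -> h seen earlier, now h -> lock */
		c->before[h][lock] = true;
	}
	c->held[lock] = true;
	return inversion;
}

static void checker_release(struct order_checker *c, enum lock_class lock)
{
	c->held[lock] = false;
}
```

Replaying the readdir chain (I_MUTEX, then MMAP_SEM) records the order
i_mutex -> mmap_sem without complaint. Replaying the mmap chain afterwards
(MMAP_SEM, then I_MUTEX) makes checker_acquire() return true on the second
acquire, which is the ABBA pattern from the trace.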