On Thu, Aug 28, 2014 at 07:40:40PM +0800, Xue jiufei wrote:
> Hi all,
> We found that a deadlock can occur during direct memory reclaim in
> a network filesystem.
> Here is one example in ocfs2; other network filesystems may have
> this problem too.
>
> 1) On receiving a connect message from another node, the node queues
>    o2net_listen_work.
> 2) o2net_wq processes this work and tries to allocate memory for a
>    new socket.
> 3) The system has no more memory, so it does direct memory reclaim
>    and triggers inode cleanup. The inode being cleaned up happens
>    to be an ocfs2 inode, so it calls evict()->ocfs2_evict_inode()
>    ->ocfs2_drop_lock()->dlmunlock()->o2net_send_message_vec()
>    and waits for the response.
> 4) The TCP layer receives the response, calls o2net_data_ready() and
>    queues sc_rx_work, which waits for o2net_wq to process it.
> 5) o2net_wq is a single-threaded workqueue and processes work items
>    one by one. It is still running o2net_listen_work and cannot
>    handle sc_rx_work, so we deadlock.
>
> To avoid a deadlock like this, the caller should perform a GFP_NOFS
> allocation attempt (see the comments on shrink_dcache_memory and
> shrink_icache_memory).
> However, in the situation described above, it is impossible to
> add the GFP_NOFS flag unless we modify the socket create interface.
>
> To fix this deadlock, we would prefer not to shrink the inode and
> dentry slabs during direct memory reclaim. Kswapd would do this job
> for us. So we want to force GFP_NOFS behavior (clearing __GFP_FS)
> when calling __alloc_pages_direct_reclaim() in
> __alloc_pages_slowpath().
> Is that OK, or is there any better advice?

memalloc_noio_save/memalloc_noio_restore

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
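
For readers unfamiliar with the pair of calls named in the reply, the idea
can be sketched in plain user-space C. This is only a model, not kernel
code: every MODEL_* constant and model_* function below is hypothetical,
the flag values are made up, and the sketch assumes that the scoped flag
masks out both I/O and filesystem reentry for allocations made inside the
scope. Whether a particular kernel version masks the filesystem bit as well
should be checked against its source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for this model; not the kernel's real values. */
#define MODEL_GFP_IO      0x1u
#define MODEL_GFP_FS      0x2u
#define MODEL_GFP_KERNEL  (MODEL_GFP_IO | MODEL_GFP_FS)

/* Hypothetical per-task "no I/O during allocation" bit. */
#define MODEL_PF_MEMALLOC_NOIO 0x1u

/* Stand-in for the per-task flags word of the running task. */
static unsigned int task_flags;

/* Enter the scope; return the previous state of the bit for restore. */
static unsigned int model_noio_save(void)
{
    unsigned int old = task_flags & MODEL_PF_MEMALLOC_NOIO;

    task_flags |= MODEL_PF_MEMALLOC_NOIO;
    return old;
}

/* Leave the scope, putting the bit back to what save() observed. */
static void model_noio_restore(unsigned int old)
{
    assert((old & ~MODEL_PF_MEMALLOC_NOIO) == 0);
    task_flags = (task_flags & ~MODEL_PF_MEMALLOC_NOIO) | old;
}

/*
 * What the allocator would see: inside the scope the caller's mask is
 * narrowed, no matter what flags the nested allocation site passed.
 * ASSUMPTION of this model: both the IO and FS bits are cleared.
 */
static unsigned int model_effective_gfp(unsigned int gfp)
{
    if (task_flags & MODEL_PF_MEMALLOC_NOIO)
        gfp &= ~(MODEL_GFP_IO | MODEL_GFP_FS);
    return gfp;
}

/* Inode/dentry shrinking is only allowed when FS reentry is permitted. */
static bool model_may_shrink_inodes(unsigned int gfp)
{
    return (model_effective_gfp(gfp) & MODEL_GFP_FS) != 0;
}

/*
 * Stand-in for a work item whose nested allocation (e.g. socket
 * creation) cannot be given a narrower mask directly: the whole
 * section is wrapped in the scope instead.
 */
static bool model_listen_work_would_reenter_fs(void)
{
    unsigned int old = model_noio_save();
    bool reenter = model_may_shrink_inodes(MODEL_GFP_KERNEL);

    model_noio_restore(old);
    return reenter;
}
```

The point of the scoped approach is that the allocation site deep inside
the socket code does not need a new parameter; the work function marks the
task before calling in and restores the previous state afterwards, so
nested scopes keep working.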