Deadlock with NFS client and glusterfs server on the same host

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Glusterfs deadlocks when a volume is mounted over NFS on the same host 
where glusterfsd is running. I would like to know if this configuration 
is supported, because I know that some network filesystems can deadlock 
in this configuration.

Deadlock happens when writing a file big enough to fill the filesystem 
cache and kernel is trying to flush it to free some memory for 
glusterfsd which needs memory to commit some filesystem blocks to free 
some memory for glusterfsd...

I'm testing glusterfs-3.1.1 on Fedora 14 with kernel 
2.6.35.10-74.fc14.x86_64.


# ps xaww -Owchan:22
   PID WCHAN                  S TTY          TIME COMMAND
  1603 nfs_wait_bit_killable  D ?        00:00:07 /usr/sbin/glusterfs -f 
/etc/glusterd/nfs/nfs-server.vol -p /etc/glusterd/nfs/run/nfs.pid -l 
/var/log/glusterfs/nfs.log
  1655 nfs_wait_bit_killable  D pts/1    00:00:00 dd if=/dev/zero 
of=/home/gluster/bigfile bs=1M count=100

# cat /proc/1603/stack
[<ffffffffa0100a6c>] nfs_wait_bit_killable+0x34/0x38 [nfs]
[<ffffffffa010cea7>] nfs_commit_inode+0x71/0x1d6 [nfs]
[<ffffffffa00ff128>] nfs_release_page+0x66/0x83 [nfs]
[<ffffffff810d2d47>] try_to_release_page+0x32/0x3b
[<ffffffff810de9ec>] shrink_page_list+0x2cf/0x446
[<ffffffff810deeb0>] shrink_inactive_list.clone.35+0x34d/0x5c6
[<ffffffff810df732>] shrink_zone+0x355/0x3e2
[<ffffffff810dfbaa>] do_try_to_free_pages+0x160/0x363
[<ffffffff810dff46>] try_to_free_pages+0x67/0x69
[<ffffffff810da1d3>] __alloc_pages_nodemask+0x525/0x776
[<ffffffff81100015>] alloc_pages_current+0xa9/0xc3
[<ffffffff813fc9cc>] tcp_sendmsg+0x3c5/0x809
[<ffffffff813aefe3>] __sock_sendmsg+0x6b/0x77
[<ffffffff813afd99>] sock_aio_write+0xc2/0xd6
[<ffffffff811176aa>] do_sync_readv_writev+0xc1/0x100
[<ffffffff81117900>] do_readv_writev+0xa7/0x127
[<ffffffff811179c5>] vfs_writev+0x45/0x47
[<ffffffff81117ae8>] sys_writev+0x4a/0x93
[<ffffffff81009cf2>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

# cat /proc/1655/stack
[<ffffffffa0100a6c>] nfs_wait_bit_killable+0x34/0x38 [nfs]
[<ffffffffa010cfd6>] nfs_commit_inode+0x1a0/0x1d6 [nfs]
[<ffffffffa00ff128>] nfs_release_page+0x66/0x83 [nfs]
[<ffffffff810d2d47>] try_to_release_page+0x32/0x3b
[<ffffffff810de9ec>] shrink_page_list+0x2cf/0x446
[<ffffffff810deeb0>] shrink_inactive_list.clone.35+0x34d/0x5c6
[<ffffffff810df732>] shrink_zone+0x355/0x3e2
[<ffffffff810dfbaa>] do_try_to_free_pages+0x160/0x363
[<ffffffff810dff46>] try_to_free_pages+0x67/0x69
[<ffffffff810da1d3>] __alloc_pages_nodemask+0x525/0x776
[<ffffffff81100015>] alloc_pages_current+0xa9/0xc3
[<ffffffff810d3aa3>] __page_cache_alloc+0x77/0x7e
[<ffffffff810d3c6c>] grab_cache_page_write_begin+0x5c/0xa3
[<ffffffffa00ff41e>] nfs_write_begin+0xd4/0x187 [nfs]
[<ffffffff810d2f27>] generic_file_buffered_write+0xfa/0x23d
[<ffffffff810d498c>] __generic_file_aio_write+0x24f/0x27f
[<ffffffff810d4a17>] generic_file_aio_write+0x5b/0xab
[<ffffffffa010001f>] nfs_file_write+0xe0/0x172 [nfs]
[<ffffffff81116bf6>] do_sync_write+0xcb/0x108
[<ffffffff811172d0>] vfs_write+0xac/0x100
[<ffffffff811174d9>] sys_write+0x4a/0x6e
[<ffffffff81009cf2>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff


[Index of Archives]     [Gluster Development]     [Linux Filesytems Development]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux