Re: glusterfsd Call Trace Messages


 



On 2016-02-03 21:24, Raghavendra Bhat wrote:
I think this is what is happening. Someone please correct me if I am
wrong.

I think this is happening because the nfs client, the nfs server and
the bricks are on the same machine. What happens is: when the large
write comes, the nfs client sends the request to the nfs server, and
the nfs server sends it to the brick. The brick process tries to write
it via the write system call, and the call enters the kernel. The
kernel might not find enough memory available to perform the operation
and thus wants to free some. The NFS client does heavy caching and may
be holding a lot of data in memory, so it is asked to free some of it.
But the nfs client is stuck in the write operation: it is still waiting
for a response from the nfs server (which in turn is waiting for a
response from the brick), so it cannot free that memory until the write
completes. And the brick cannot get a response from the kernel until
the kernel can reclaim some memory and perform the write.

Thus it is stuck in this deadlock, and that's why you see your setup
blocked.
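
One way to actually see this memory pressure building up is to watch
the dirty/writeback counters on the server while the copy runs. This is
just an illustrative sketch (the fields picked out and the 2-second
interval are arbitrary, not anything from the logs above):

  # Dirty/Writeback/NFS_Unstable staying very high while glusterfsd sits
  # in state D is consistent with the stuck loopback writeback described above.
  watch -n 2 'grep -E "^(MemFree|Dirty|Writeback|NFS_Unstable):" /proc/meminfo'

  # The blocked tasks themselves show up in state D (uninterruptible sleep):
  ps -eo pid,stat,wchan:32,comm | awk '$2 ~ /D/'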

Can you please mount your volume via nfs on a node other than the
gluster servers, and see if the issue happens again?
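
For example, something along these lines from a separate client box
(the volume name "testvol" and the mount point are just placeholders;
gluster's built-in NFS server only speaks NFSv3):

  # On a machine that is NOT one of the gluster servers:
  mkdir -p /mnt/testvol
  mount -t nfs -o vers=3,proto=tcp gluster01:/testvol /mnt/testvol

  # Repeat the large copy there, and keep an eye on the kernel log of
  # the servers to see whether the glusterfsd call traces come back:
  dmesg -w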

Regards,
Raghavendra

On Wed, Feb 3, 2016 at 2:32 PM, Taste-Of-IT <kontakt@xxxxxxxxxxxxxx>
wrote:

On 2016-02-03 20:09, Raghavendra Bhat wrote:

Hi,

Is your nfs client mounted on one of the gluster servers?

Regards,
Raghavendra

On Wed, Feb 3, 2016 at 10:08 AM, Taste-Of-IT
<kontakt@xxxxxxxxxxxxxx>
wrote:

Hello,

hope some expert can help. I have a 2-brick, 1-volume distributed
GlusterFS, version 3.7.6, on Debian. The volume is shared via nfs.
If I copy large files (>30GB) via Midnight Commander, I get the
messages below. I replaced the SATA cable and checked the memory, but
I didn't find an error. The SMART values of all disks seem OK. After
30-40 minutes I can copy again. Any idea?

Feb  3 12:46:31 gluster01 kernel: [11186.588367] [sched_delayed] sched: RT throttling activated
Feb  3 12:56:09 gluster01 kernel: [11764.932749] glusterfsd      D ffff88040ca6d788     0  1150      1 0x00000000
Feb  3 12:56:09 gluster01 kernel: [11764.932759]  ffff88040ca6d330 0000000000000082 0000000000012f00 ffff88040ad1bfd8
Feb  3 12:56:09 gluster01 kernel: [11764.932767]  0000000000012f00 ffff88040ca6d330 ffff88040ca6d330 ffff88040ad1be88
Feb  3 12:56:09 gluster01 kernel: [11764.932773]  ffff88040e18d4b8 ffff88040e18d4a0 ffffffff00000000 ffff88040e18d4a8
Feb  3 12:56:09 gluster01 kernel: [11764.932780] Call Trace:
Feb  3 12:56:09 gluster01 kernel: [11764.932796]  [<ffffffff81512cd5>] ? rwsem_down_write_failed+0x1d5/0x320
Feb  3 12:56:09 gluster01 kernel: [11764.932807]  [<ffffffff812b7d13>] ? call_rwsem_down_write_failed+0x13/0x20
Feb  3 12:56:09 gluster01 kernel: [11764.932816]  [<ffffffff812325b0>] ? proc_keys_show+0x3f0/0x3f0
Feb  3 12:56:09 gluster01 kernel: [11764.932823]  [<ffffffff81512649>] ? down_write+0x29/0x40
Feb  3 12:56:09 gluster01 kernel: [11764.932830]  [<ffffffff811592bc>] ? vm_mmap_pgoff+0x6c/0xc0
Feb  3 12:56:09 gluster01 kernel: [11764.932838]  [<ffffffff8116ea4e>] ? SyS_mmap_pgoff+0x10e/0x250
Feb  3 12:56:09 gluster01 kernel: [11764.932844]  [<ffffffff811a969a>] ? SyS_readv+0x6a/0xd0
Feb  3 12:56:09 gluster01 kernel: [11764.932853]  [<ffffffff81513ccd>] ? system_call_fast_compare_end+0x10/0x15
Feb  3 12:58:09 gluster01 kernel: [11884.979935] glusterfsd      D ffff88040ca6d788     0  1150      1 0x00000000
Feb  3 12:58:09 gluster01 kernel: [11884.979945]  ffff88040ca6d330 0000000000000082 0000000000012f00 ffff88040ad1bfd8
Feb  3 12:58:09 gluster01 kernel: [11884.979952]  0000000000012f00 ffff88040ca6d330 ffff88040ca6d330 ffff88040ad1be88
Feb  3 12:58:09 gluster01 kernel: [11884.979959]  ffff88040e18d4b8 ffff88040e18d4a0 ffffffff00000000 ffff88040e18d4a8
Feb  3 12:58:09 gluster01 kernel: [11884.979966] Call Trace:
Feb  3 12:58:09 gluster01 kernel: [11884.979982]  [<ffffffff81512cd5>] ? rwsem_down_write_failed+0x1d5/0x320
Feb  3 12:58:09 gluster01 kernel: [11884.979993]  [<ffffffff812b7d13>] ? call_rwsem_down_write_failed+0x13/0x20
Feb  3 12:58:09 gluster01 kernel: [11884.980001]  [<ffffffff812325b0>] ? proc_keys_show+0x3f0/0x3f0
Feb  3 12:58:09 gluster01 kernel: [11884.980008]  [<ffffffff81512649>] ? down_write+0x29/0x40
Feb  3 12:58:09 gluster01 kernel: [11884.980015]  [<ffffffff811592bc>] ? vm_mmap_pgoff+0x6c/0xc0
Feb  3 12:58:09 gluster01 kernel: [11884.980023]  [<ffffffff8116ea4e>] ? SyS_mmap_pgoff+0x10e/0x250
Feb  3 12:58:09 gluster01 kernel: [11884.980030]  [<ffffffff811a969a>] ? SyS_readv+0x6a/0xd0
Feb  3 12:58:09 gluster01 kernel: [11884.980038]  [<ffffffff81513ccd>] ? system_call_fast_compare_end+0x10/0x15
Feb  3 12:58:09 gluster01 kernel: [11884.980351] mc              D ffff88040e6d8fb8     0  5119   1447 0x00000000
Feb  3 12:58:09 gluster01 kernel: [11884.980358]  ffff88040e6d8b60 0000000000000082 0000000000012f00 ffff88040d5dbfd8
Feb  3 12:58:09 gluster01 kernel: [11884.980365]  0000000000012f00 ffff88040e6d8b60 ffff88041ec937b0 ffff88041efcc9e8
Feb  3 12:58:09 gluster01 kernel: [11884.980371]  0000000000000002 ffffffff8113ce00 ffff88040d5dbcb0 ffff88040d5dbd98
Feb  3 12:58:09 gluster01 kernel: [11884.980377] Call Trace:
Feb  3 12:58:09 gluster01 kernel: [11884.980385]  [<ffffffff8113ce00>] ? wait_on_page_read+0x60/0x60
Feb  3 12:58:09 gluster01 kernel: [11884.980392]  [<ffffffff81510759>] ? io_schedule+0x99/0x120
Feb  3 12:58:09 gluster01 kernel: [11884.980399]  [<ffffffff8113ce0a>] ? sleep_on_page+0xa/0x10
Feb  3 12:58:09 gluster01 kernel: [11884.980405]  [<ffffffff81510adc>] ? __wait_on_bit+0x5c/0x90
Feb  3 12:58:09 gluster01 kernel: [11884.980412]  [<ffffffff8113cbff>] ? wait_on_page_bit+0x7f/0x90
Feb  3 12:58:09 gluster01 kernel: [11884.980420]  [<ffffffff810a7bd0>] ? autoremove_wake_function+0x30/0x30
Feb  3 12:58:09 gluster01 kernel: [11884.980426]  [<ffffffff8114a17d>] ? pagevec_lookup_tag+0x1d/0x30
Feb  3 12:58:09 gluster01 kernel: [11884.980433]  [<ffffffff8113cce0>] ? filemap_fdatawait_range+0xd0/0x160
Feb  3 12:58:09 gluster01 kernel: [11884.980442]  [<ffffffff8113e7ca>] ? filemap_write_and_wait_range+0x3a/0x60
Feb  3 12:58:09 gluster01 kernel: [11884.980461]  [<ffffffffa072363f>] ? nfs_file_fsync+0x7f/0x100 [nfs]
Feb  3 12:58:09 gluster01 kernel: [11884.980476]  [<ffffffffa0723a2a>] ? nfs_file_write+0xda/0x1a0 [nfs]
Feb  3 12:58:09 gluster01 kernel: [11884.980484]  [<ffffffff811a7e24>] ? new_sync_write+0x74/0xa0
Feb  3 12:58:09 gluster01 kernel: [11884.980492]  [<ffffffff811a8562>] ? vfs_write+0xb2/0x1f0
Feb  3 12:58:09 gluster01 kernel: [11884.980500]  [<ffffffff811a842d>] ? vfs_read+0xed/0x170
Feb  3 12:58:09 gluster01 kernel: [11884.980505]  [<ffffffff811a90a2>] ? SyS_write+0x42/0xa0
Feb  3 12:58:09 gluster01 kernel: [11884.980513]  [<ffffffff81513ccd>] ? system_call_fast_compare_end+0x10/0x15

Hi Raghavendra,
yes, in this case I have to mount on one of the gluster servers, but it
doesn't matter on which one I mount; it is only a question of time
until the trace comes.
Taste


Hi,
sounds logical. Is that normal behavior? I tested it from a client and it looks fine, without a trace. I tried 4 files of about 30GB each. The only thing I noticed is that the first file was copied with nearly full bandwidth, across both servers, but the second only at 20-30 percent of the possible bandwidth. Are there any performance / stability options which I can use for the nfs or glusterfs mount?
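
Just to make the question concrete, the kind of client-side mount
options meant here would be something like the following (a generic
illustration only, with a placeholder volume name, not tuned values
recommended by anyone in this thread):

  # Explicit common NFSv3 client options when mounting a gluster volume:
  mount -t nfs -o vers=3,proto=tcp,noatime,rsize=65536,wsize=65536 gluster01:/testvol /mnt/testvol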
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users



