Re: Gluster high RPC calls and reply

On 07/07/2014 07:39 PM, Gurdeep Singh (Guru) wrote:
Hello Pranith,

Process 18629 is not sending any traffic across the servers. 

The processes that are constantly sending packets across are 1055 and 18611.
In that case it is the application that is sending the traffic. Niels just looked into the pcap file and he also found the process with PID 14927 to be the one sending the traffic. Could you check which process that is?

Pranith.

What is the interval, if any, between RPC lookups on a single file? Can we somehow control the lookup frequency?

Thanks,
Gurdeep.



On 7 Jul 2014, at 11:59 pm, Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx> wrote:


On 07/07/2014 07:22 PM, Gurdeep Singh (Guru) wrote:
[guru@srv2 ~]$ ps -v 1055
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.8/FAQ
  PID TTY      STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
 1055 ?        Ssl   86:01     31     0 319148 33092  3.2 /usr/sbin/glusterfs --volfile-server=srv2 --volfile-id=/gv0 /var/www/html/image/
[guru@srv2 ~]$ 

Gurdeep,
       I don't see anything odd here :-(. The mount is looking up files and the brick is serving them. Why don't you keep a watch on process 18629 and the similar processes in the cluster? Run 'ps aux | grep glustershd' to get the PIDs. There will be one such process per machine in the cluster. Check how much bandwidth each one consumes. That is the only process that performs operations without any activity on the mounts.

Pranith
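
The PID lookup suggested above can be sketched as a small filter. This is only an illustration: the function name glustershd_pids is hypothetical (it is not a Gluster tool), and it assumes the usual 'ps aux' layout in which the PID is the second column.

```shell
# Sketch: list self-heal daemon PIDs from `ps aux` output read on stdin.
# The function name glustershd_pids is hypothetical, not a Gluster tool.
glustershd_pids() {
    # Column 2 of `ps aux` is the PID. The [g] bracket keeps this awk
    # process from matching its own command line in a live `ps aux` pipe.
    awk '/[g]lustershd/ { print $2 }'
}

# Example use on each server (commented out so the block has no side effects):
#   ps aux | glustershd_pids
#   ps -o pid,pcpu,pmem,etime,args -p "$(ps aux | glustershd_pids | head -n 1)"
```

The second commented command uses the standard 'ps -o ... -p PID' form, which avoids the "bad syntax" warning that 'ps -v PID' prints in the transcripts in this thread. It reports CPU and memory only, so network usage still has to come from a tool such as nethogs.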



On 7 Jul 2014, at 11:49 pm, Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx> wrote:


On 07/07/2014 07:03 PM, Gurdeep Singh (Guru) wrote:
Hello Niels,

I ran nethogs on the interface to see which process might be using the bandwidth:

NetHogs version 0.8.0

  PID USER  PROGRAM               DEV   SENT    RECEIVED
18611 root  /usr/sbin/glusterfsd  tun0  16.307  17.547 KB/sec
 1055 root  /usr/sbin/glusterfs   tun0  17.249  16.259 KB/sec
13439 guru  sshd: guru@pts/0      tun0   0.966   0.051 KB/sec
18625 root  /usr/sbin/glusterfs   tun0   0.000   0.000 KB/sec
18629 root  /usr/sbin/glusterfs   tun0   0.000   0.000 KB/sec
 9636 root  /usr/sbin/glusterd    tun0   0.000   0.000 KB/sec
    ? root  unknown TCP                  0.000   0.000 KB/sec

  TOTAL                                 34.523  33.856 KB/sec




Which process corresponds to '1055'?

Pranith
They are the glusterfs (PID 1055) and glusterfsd (PID 18611) processes.

I looked at the capture file and saw that the lookups are being made on random files.

For PID information, please see this:

[guru@srv2 ~]$ sudo netstat -tpn | grep 49152
tcp        0      0 127.0.0.1:49152             127.0.0.1:1012              ESTABLISHED 18611/glusterfsd    
tcp        0      0 127.0.0.1:49152             127.0.0.1:1016              ESTABLISHED 18611/glusterfsd    
tcp        0      0 127.0.0.1:1016              127.0.0.1:49152             ESTABLISHED 18625/glusterfs     
tcp        0      0 10.8.0.6:1021               10.8.0.1:49152              ESTABLISHED 1055/glusterfs      
tcp        0      0 10.8.0.6:49152              10.8.0.1:1017               ESTABLISHED 18611/glusterfsd    
tcp        0      0 10.8.0.6:1020               10.8.0.1:49152              ESTABLISHED 18629/glusterfs     
tcp        0      0 127.0.0.1:1023              127.0.0.1:49152             ESTABLISHED 18629/glusterfs     
tcp        0      0 10.8.0.6:49152              10.8.0.1:1022               ESTABLISHED 18611/glusterfsd    
tcp        0      0 10.8.0.6:49152              10.8.0.1:1021               ESTABLISHED 18611/glusterfsd    
tcp        0      0 127.0.0.1:49152             127.0.0.1:1023              ESTABLISHED 18611/glusterfsd    
tcp        0      0 127.0.0.1:1012              127.0.0.1:49152             ESTABLISHED 1055/glusterfs      
tcp        0      0 10.8.0.6:1019               10.8.0.1:49152              ESTABLISHED 18625/glusterfs     
[guru@srv2 ~]$ ps -v 18611
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.8/FAQ
  PID TTY      STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
18611 ?        Ssl   14:12      0     0 650068 20404  2.0 /usr/sbin/glusterfsd -s srv2 --volfile-id gv0.srv2.root-gluster-vol0 -p /var/lib/glusterd/vols/gv0
[guru@srv2 ~]$ ps -v 18629
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.8/FAQ
  PID TTY      STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
18629 ?        Ssl    0:04      0     0 333296 17380  1.7 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/823fa3197e2d1841be8881500723b063.socket --xlator-option *replicate*.node-uuid=84af83c9-0a29-
[guru@srv2 ~]$ 
[guru@srv2 ~]$ 
[guru@srv2 ~]$ ps -v 18625
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.8/FAQ
  PID TTY      STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
18625 ?        Ssl    0:03      0     0 239528 41040  4.0 /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/5ad5b036fd636cc5dddffa73593e4089.socket
[guru@srv2 ~]$ sudo nethogs tun0
Waiting for first packet to arrive (see sourceforge.net bug 1019381)
[guru@srv2 ~]$ rpm -qa | grep gluster
glusterfs-3.5.1-1.el6.x86_64
glusterfs-cli-3.5.1-1.el6.x86_64
glusterfs-libs-3.5.1-1.el6.x86_64
glusterfs-fuse-3.5.1-1.el6.x86_64
glusterfs-server-3.5.1-1.el6.x86_64
glusterfs-api-3.5.1-1.el6.x86_64
[guru@srv2 ~]$ 


I don’t see anything odd here. Please suggest.

Thanks,
Gurdeep.





On 7 Jul 2014, at 9:06 pm, Niels de Vos <ndevos@xxxxxxxxxx> wrote:

On Sun, Jul 06, 2014 at 11:28:51PM +1000, Gurdeep Singh (Guru) wrote:
Hello,

I have set up Gluster with a replicated volume and it is working fine.

I am seeing constant chatter between the hosts in the form of LOOKUP calls
and LOOKUP replies, and I am trying to understand why this traffic is
being initiated constantly. Please look at the attached image. This
traffic uses around 200 KB/s of bandwidth continuously and is
exhausting the monthly bandwidth allocation on our 2 VPSs.
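
As a rough check on what a constant rate like that adds up to, here is a minimal arithmetic sketch. It assumes a 30-day month and decimal units (1 GB = 1,000,000 KB); the helper name monthly_gb is hypothetical.

```shell
# Sketch: convert a constant rate in KB/s into GB transferred per month.
# Assumes a 30-day month and decimal units (1 GB = 1,000,000 KB).
monthly_gb() {
    rate_kb_s=$1
    # 86400 seconds per day * 30 days, then KB -> GB (integer division)
    echo $(( rate_kb_s * 86400 * 30 / 1000000 ))
}

monthly_gb 200   # rate reported in the message above; prints 518
```

Under these assumptions, 200 KB/s sustained for a month is roughly 518 GB.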

You can use Wireshark to identify which process does the LOOKUP calls.  
For this, do the following:

1. select a LOOKUP Call
2. enable the 'packet details' pane (found in the main menu, 'view')
3. expand the 'Transmission Control Protocol' tree
4. check the 'Source port' of the LOOKUP Call

With the 'Source' address and the 'Source port' in hand, go to the
server that matches the 'Source' address. A command like this gives
you the PID of the process in the rightmost column:

 # netstat -tpn | grep $SOURCE_PORT

With 'ps -v $PID' you can then check which process is responsible for the
LOOKUP. This process can be a fuse-mount, the self-heal-daemon or any other
glusterfs-client. Depending on the type of client, you may be able to tune
the workload or other options a little.

In Wireshark you can also check which filename is being looked up: expand
the 'GlusterFS' part in the 'packet details' and check the 'Basename'.
Maybe this filename (without directory structure) gives you an
idea of which activity is causing the LOOKUPs.
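
The port-to-PID step described above can be sketched as a filter that matches the local port exactly. A plain grep for the port number also matches remote ports and any other field containing the same digits. The helper name owner_of_local_port is hypothetical, and the column positions assume the 'netstat -tpn' layout shown elsewhere in this thread.

```shell
# Sketch: given `netstat -tpn` output on stdin, print the PID/program that
# owns the local TCP port passed as $1. owner_of_local_port is a
# hypothetical helper name, not part of netstat or Gluster.
owner_of_local_port() {
    awk -v port="$1" '
        $1 ~ /^tcp/ {
            n = split($4, addr, ":")        # $4 = local address, e.g. 10.8.0.6:1021
            if (addr[n] == port) print $7   # $7 = PID/Program name
        }'
}

# Example use on the server matching the Wireshark "Source" address
# (commented out because it needs root and a live system):
#   sudo netstat -tpn | owner_of_local_port 1021
```

The number before the slash in the result is the PID to pass to ps.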

HTH,
Niels


The configuration I have for Gluster is:

[guru@srv1 ~]$ sudo gluster volume info
[sudo] password for guru:

Volume Name: gv0
Type: Replicate
Volume ID: dc8dc3f2-f5bd-4047-9101-acad04695442
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: srv1:/root/gluster-vol0
Brick2: srv2:/root/gluster-vol0
Options Reconfigured:
cluster.lookup-unhashed: on
performance.cache-refresh-timeout: 60
performance.cache-size: 1GB
storage.health-check-interval: 30



Please suggest how to fine-tune the RPC calls and replies.





_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users




