Re: glusterfsd process spinning

On 06/04/2014 08:07 AM, Susant Palai wrote:
Pranith, can you send the client and brick logs?
I have the logs. But for this issue of the directory not listing its entries, it would help more if we had the contents of that directory on all the bricks, plus their hash values in the xattrs.
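For example (a minimal sketch, not something already run here; the brick path is taken from the listings quoted further down and should be adjusted to each server's layout), the directory contents and the layout/gfid xattrs can be collected on every brick with:

# on each brick server, for each brick that holds the directory:
# trusted.glusterfs.dht holds the hash layout, trusted.gfid the directory's identity
ls -la /data21/gvol/franco/dir1226/dir25
getfattr -d -m . -e hex /data21/gvol/franco/dir1226/dir25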

Pranith

Thanks,
Susant~

----- Original Message -----
From: "Pranith Kumar Karampuri" <pkarampu@xxxxxxxxxx>
To: "Franco Broi" <franco.broi@xxxxxxxxxx>
Cc: gluster-users@xxxxxxxxxxx, "Raghavendra Gowdappa" <rgowdapp@xxxxxxxxxx>, spalai@xxxxxxxxxx, kdhananj@xxxxxxxxxx, vsomyaju@xxxxxxxxxx, nbalacha@xxxxxxxxxx
Sent: Wednesday, 4 June, 2014 7:53:41 AM
Subject: Re:  glusterfsd process spinning

Hi Franco,
       CC'ing the devs who work on DHT to comment.

Pranith

On 06/04/2014 07:39 AM, Franco Broi wrote:
On Wed, 2014-06-04 at 07:28 +0530, Pranith Kumar Karampuri wrote:
Franco,
         Thanks for providing the logs. I just copied them over to my
machine. Most of the messages I see are related to "No such file or
directory". I wonder what led to this. Do you have any idea?
No, but I'm just looking at my 3.5 Gluster volume and it has a directory
that looks empty but can't be deleted. When I look at the directories on
the servers, there are definitely files in there.

[franco@charlie1 franco]$ rmdir /data2/franco/dir1226/dir25
rmdir: failed to remove `/data2/franco/dir1226/dir25': Directory not empty
[franco@charlie1 franco]$ ls -la  /data2/franco/dir1226/dir25
total 8
drwxrwxr-x 2 franco support 60 May 21 03:58 .
drwxrwxr-x 3 franco support 24 Jun  4 09:37 ..

[root@nas6 ~]# ls -la /data*/gvol/franco/dir1226/dir25
/data21/gvol/franco/dir1226/dir25:
total 2081
drwxrwxr-x 13 1348 200 13 May 21 03:58 .
drwxrwxr-x  3 1348 200  3 May 21 03:58 ..
drwxrwxr-x  2 1348 200  2 May 16 12:05 dir13017
drwxrwxr-x  2 1348 200  2 May 16 12:05 dir13018
drwxrwxr-x  2 1348 200  3 May 16 12:05 dir13020
drwxrwxr-x  2 1348 200  3 May 16 12:05 dir13021
drwxrwxr-x  2 1348 200  3 May 16 12:05 dir13022
drwxrwxr-x  2 1348 200  2 May 16 12:05 dir13024
drwxrwxr-x  2 1348 200  2 May 16 12:05 dir13027
drwxrwxr-x  2 1348 200  3 May 16 12:05 dir13028
drwxrwxr-x  2 1348 200  2 May 16 12:06 dir13029
drwxrwxr-x  2 1348 200  2 May 16 12:06 dir13031
drwxrwxr-x  2 1348 200  3 May 16 12:06 dir13032

/data22/gvol/franco/dir1226/dir25:
total 2084
drwxrwxr-x 13 1348 200 13 May 21 03:58 .
drwxrwxr-x  3 1348 200  3 May 21 03:58 ..
drwxrwxr-x  2 1348 200  2 May 16 12:05 dir13017
drwxrwxr-x  2 1348 200  2 May 16 12:05 dir13018
drwxrwxr-x  2 1348 200  2 May 16 12:05 dir13020
drwxrwxr-x  2 1348 200  2 May 16 12:05 dir13021
drwxrwxr-x  2 1348 200  2 May 16 12:05 dir13022
.....

Maybe Gluster is losing track of the files??
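One quick consistency check (a sketch added for illustration, not something from the original exchange; it assumes the /data*/gvol brick layout shown above on nas6) is to compare the trusted.gfid xattr of dir25 across all the bricks on a server; a gfid that differs between bricks is one known way for a directory to misbehave in a DHT volume:

# run on each brick server; every brick should report the same gfid for the directory
for b in /data*/gvol; do
    echo "$b:"
    getfattr -n trusted.gfid -e hex "$b/franco/dir1226/dir25" 2>/dev/null
done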

Pranith

On 06/02/2014 02:48 PM, Franco Broi wrote:
Hi Pranith

Here's a listing of the brick logs; it looks very odd, especially the
size of the log for data10.

[root@nas3 bricks]# ls -ltrh
total 2.6G
-rw------- 1 root root 381K May 13 12:15 data12-gvol.log-20140511
-rw------- 1 root root 430M May 13 12:15 data11-gvol.log-20140511
-rw------- 1 root root 328K May 13 12:15 data9-gvol.log-20140511
-rw------- 1 root root 2.0M May 13 12:15 data10-gvol.log-20140511
-rw------- 1 root root    0 May 18 03:43 data10-gvol.log-20140525
-rw------- 1 root root    0 May 18 03:43 data11-gvol.log-20140525
-rw------- 1 root root    0 May 18 03:43 data12-gvol.log-20140525
-rw------- 1 root root    0 May 18 03:43 data9-gvol.log-20140525
-rw------- 1 root root    0 May 25 03:19 data10-gvol.log-20140601
-rw------- 1 root root    0 May 25 03:19 data11-gvol.log-20140601
-rw------- 1 root root    0 May 25 03:19 data9-gvol.log-20140601
-rw------- 1 root root  98M May 26 03:04 data12-gvol.log-20140518
-rw------- 1 root root    0 Jun  1 03:37 data10-gvol.log
-rw------- 1 root root    0 Jun  1 03:37 data11-gvol.log
-rw------- 1 root root    0 Jun  1 03:37 data12-gvol.log
-rw------- 1 root root    0 Jun  1 03:37 data9-gvol.log
-rw------- 1 root root 1.8G Jun  2 16:35 data10-gvol.log-20140518
-rw------- 1 root root 279M Jun  2 16:35 data9-gvol.log-20140518
-rw------- 1 root root 328K Jun  2 16:35 data12-gvol.log-20140601
-rw------- 1 root root 8.3M Jun  2 16:35 data11-gvol.log-20140518

Too big to post everything.
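If attaching the full files is impractical, one way to cut them down (a sketch, assuming the standard glusterfs log format in which the severity letter follows the timestamp) is to keep only the error-level lines, or simply compress the lot:

# extract just the error-level lines from the huge data10 log
grep ' E \[' data10-gvol.log-20140518 | gzip > data10-errors.txt.gz

# or compress all the rotated brick logs into a single attachment
tar czf nas3-brick-logs.tar.gz data*-gvol.log*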

Cheers,

On Sun, 2014-06-01 at 22:00 -0400, Pranith Kumar Karampuri wrote:
----- Original Message -----
From: "Pranith Kumar Karampuri" <pkarampu@xxxxxxxxxx>
To: "Franco Broi" <franco.broi@xxxxxxxxxx>
Cc: gluster-users@xxxxxxxxxxx
Sent: Monday, June 2, 2014 7:01:34 AM
Subject: Re:  glusterfsd process spinning



----- Original Message -----
From: "Franco Broi" <franco.broi@xxxxxxxxxx>
To: "Pranith Kumar Karampuri" <pkarampu@xxxxxxxxxx>
Cc: gluster-users@xxxxxxxxxxx
Sent: Sunday, June 1, 2014 10:53:51 AM
Subject: Re:  glusterfsd process spinning


The volume is almost completely idle now and the CPU usage for the brick
process has returned to normal. I've included the profile, and I think it
shows the latency for the bad brick (data12) is unusually high, probably
indicating the filesystem is at fault after all?
I am not sure we can trust the output now that you say the brick has
returned to normal. Next time it is acting up, do the same procedure and
post the result.
On second thought, maybe it's not a bad idea to inspect the log files of
the bricks on nas3. Could you post them?

Pranith

Pranith
On Sun, 2014-06-01 at 01:01 -0400, Pranith Kumar Karampuri wrote:
Franco,
       Could you do the following to get more information:

"gluster volume profile <volname> start"

Wait for some time; this will start gathering what operations are coming
to all the bricks.
Now execute "gluster volume profile <volname> info" >
/file/you/should/reply/to/this/mail/with

Then execute:
gluster volume profile <volname> stop

Let's see if this throws any light on the problem at hand.
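Put together, the whole sequence might look like this (a sketch only; "gvol" is inferred from the brick log names in the thread, so substitute your real volume name and pick any output path you like):

# start collecting per-brick operation statistics
gluster volume profile gvol start

# let it gather data while the problem is occurring, then dump the counters
sleep 300
gluster volume profile gvol info > /tmp/gvol-profile.txt

# stop profiling once the snapshot has been taken
gluster volume profile gvol stop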

Pranith
----- Original Message -----
From: "Franco Broi" <franco.broi@xxxxxxxxxx>
To: gluster-users@xxxxxxxxxxx
Sent: Sunday, June 1, 2014 9:02:48 AM
Subject:  glusterfsd process spinning

Hi

I've been suffering continual problems with my Gluster filesystem
slowing down, which I had put down to congestion on a single brick
caused by the underlying filesystem running slow. But I've just noticed
that the glusterfsd process for that particular brick is running at
100%+ CPU, even when the filesystem is almost idle.

I've done a couple of straces, one of that brick and another of a brick
on the same server. Does the high number of futex errors give any clues
as to what might be wrong?

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 45.58    0.027554           0    191665     20772 futex
 28.26    0.017084           0    137133           readv
 26.04    0.015743           0     66259           epoll_wait
  0.13    0.000077           3        23           writev
  0.00    0.000000           0         1           epoll_ctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.060458                395081     20772 total

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.25    0.334020         133      2516           epoll_wait
  0.40    0.001347           0      4090        26 futex
  0.35    0.001192           0      5064           readv
  0.00    0.000000           0        20           writev
------ ----------- ----------- --------- --------- ----------------
100.00    0.336559                 11690        26 total
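
To narrow down which thread of the brick process is actually spinning (a sketch added for illustration; the pgrep pattern assumes the data12 brick path appears on the glusterfsd command line, which is how brick processes are normally started), something like this could be run on the server:

# watch per-thread CPU usage of the glusterfsd serving the data12 brick;
# a thread stuck near 100% while the volume is idle is the one to investigate
top -H -p $(pgrep -f 'glusterfsd.*data12' | head -n1)

A backtrace of that thread (for example with "gdb -p <pid>" and "thread apply all bt") would then show where it is looping.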



Cheers,

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users





