Re: Self-heal Problems with gluster and nfs

It seems like entry self-heal is happening. What is the volume configuration? Could you give the output of
ls <brick-path>/.glusterfs/indices/xattrop | wc -l
for all the bricks?
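For example, something along these lines, run on each server (the brick paths below are only placeholders):

# volume configuration (volume name taken from the log messages)
gluster volume info gluster_dateisystem

# pending entries in the self-heal index, one line per local brick
for b in /export/brick1 /export/brick2; do
    echo -n "$b: "
    ls "$b"/.glusterfs/indices/xattrop | wc -l
done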

Pranith
On 07/08/2014 03:36 PM, Norman Mähler wrote:
Hello Pranith,

here are the logs. I am only giving you the last 3000 lines, because today's nfs.log is already 550 MB.

These are the standard files from a user home on the Gluster system, everything you would normally find in a user home: config files, Firefox and Thunderbird files, etc.

Thanks in advance
Norman

On 08.07.2014 11:46, Pranith Kumar Karampuri wrote:
On 07/08/2014 02:46 PM, Norman Mähler wrote:
Hello again,

I could resolve the self-heal problems with the missing gfid files on one of the servers by deleting the corresponding gfid files on the other server.

They had a link count of 1, which means that the file the gfid pointed to had already been deleted.
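For reference, the link count of a gfid file on a brick can be checked roughly like this (the brick path below is a placeholder; the gfid is one from the logs):

# %h prints the number of hard links; 1 means only the .glusterfs entry is left
stat -c '%h %n' /export/brick1/.glusterfs/b0/c4/b0c4f78a-249f-4db7-9d5b-0902c7d8f6cc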


We still have these errors

[2014-07-08 09:09:43.564488] W
[client-rpc-fops.c:2469:client3_3_link_cbk]
0-gluster_dateisystem-client-0: remote operation failed: File exists
(00000000-0000-0000-0000-000000000000 ->
<gfid:b338b09e-2577-45b3-82bd-032f954dd083>/lock)

which appear in the glusterfshd.log and these

[2014-07-08 09:13:31.198462] E
[client-rpc-fops.c:5179:client3_3_inodelk]
(-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/cluster/replicate.so(+0x466b8)

[0x7f5d29d4e6b8]
(-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/cluster/replicate.so(afr_lock_blocking+0x844)

[0x7f5d29d4e2e4]
(-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/protocol/client.so(client_inodelk+0x99)

[0x7f5d29f8b3c9]))) 0-: Assertion failed: 0

from the nfs.log.
Could you attach the mount (nfs.log) and brick logs, please?
Do you have files with lots of hard-links?
Pranith
I think the error messages belong together, but I don't have any idea
how to solve them.

We still have a very bad performance issue. The system load on the
servers is above 20, and hardly anyone here is able to work on a
client...

Hoping for your help,
Norman


On 07.07.2014 15:39, Pranith Kumar Karampuri wrote:
On 07/07/2014 06:58 PM, Norman Mähler wrote:
Dear community,

we have some serious problems with our Gluster installation.

Here is the setting:

We have 2 bricks (version 3.4.4) on Debian 7.5, one of them
with an NFS export. About 120 clients connect to the
exported NFS. These clients are thin clients that read and write
their Linux home directories from the exported NFS.

We want to switch these clients, one by one, to access
via the Gluster client.
I did not understand what you meant by this. Are you moving to
glusterfs-fuse based mounts?
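If so, for reference a native mount on a client would look roughly like this (server name and mount point below are placeholders; the volume name is taken from your logs):

# glusterfs-fuse (native) mount
mount -t glusterfs server1:/gluster_dateisystem /mnt/home
# current style of access via the built-in Gluster NFS server (NFSv3 only)
mount -t nfs -o vers=3,nolock server1:/gluster_dateisystem /mnt/home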

Here are our problems:

At the moment we have two types of error messages which come
in bursts in our glusterfshd.log

[2014-07-07 13:10:21.572487] W
[client-rpc-fops.c:1538:client3_3_inodelk_cbk]
0-gluster_dateisystem-client-1: remote operation failed: No such
file or directory [2014-07-07 13:10:21.573448] W
[client-rpc-fops.c:471:client3_3_open_cbk]
0-gluster_dateisystem-client-1: remote operation failed: No such
file or directory. Path:
<gfid:b0c4f78a-249f-4db7-9d5b-0902c7d8f6cc>
(00000000-0000-0000-0000-000000000000) [2014-07-07 13:10:21.573468]
E [afr-self-heal-data.c:1270:afr_sh_data_open_cbk]
0-gluster_dateisystem-replicate-0: open of
<gfid:b0c4f78a-249f-4db7-9d5b-0902c7d8f6cc> failed on child
gluster_dateisystem-client-1 (No such file or directory)


This looks like a missing gfid file on one of the bricks. I looked
it up, and yes, the file is missing on the second brick.

We got these messages the other way round, too (missing on
client-0 and the first brick).

Is it possible to repair this one by copying the gfid file to the
brick where it was missing? Or is there another way to repair it?
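As far as I understand, the gfid file on a brick lives under .glusterfs/ and is normally a hard link to the real file, so a plain copy would probably not be equivalent. On the brick where it still exists, something like this shows which file it belongs to (the brick path is a placeholder; the gfid is the one from the log above):

find /export/brick1 -samefile \
    /export/brick1/.glusterfs/b0/c4/b0c4f78a-249f-4db7-9d5b-0902c7d8f6cc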


The second message is

[2014-07-07 13:06:35.948738] W
[client-rpc-fops.c:2469:client3_3_link_cbk]
0-gluster_dateisystem-client-1: remote operation failed: File
exists (00000000-0000-0000-0000-000000000000 ->
<gfid:aae47250-8f69-480c-ac75-2da2f4d21d7a>/lock)

and I really do not know what to do with this one...
Did any of the bricks go offline and come back online?
Pranith
I am really looking forward to your help, because this is an active
system and the system load on the NFS brick is about 25 (!!)

Thanks in advance!
Norman Maehler



_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users




