[SPAM?] Re: hangs on accessing files on gluster mount (3.2.0)

Hi Tomasz.

I had a very similar issue: reading certain files would hang the reading process on the client, while other files, even in the same directory, would not hang.

My setup is configured thusly:
GlusterFS 3.1.3 running on Linux CentOS 5.5
Upgraded from 3.0.4 to 3.1.1 initially (data was rsynced from NFS sources to the 3.1.1 storage servers), then later to 3.1.3, just before the activities listed in my solution.

Volume Name: pfs-ro1
Type: Distributed-Replicate
Status: Started
Number of Bricks: 20 x 2 = 40
Transport-type: tcp
Brick1: jc1letgfs17-pfs1:/export/read-only/g01
Brick2: jc1letgfs18-pfs1:/export/read-only/g01
...
Brick39: jc1letgfs14-pfs1:/export/read-only/g10
Brick40: jc1letgfs15-pfs1:/export/read-only/g10

Some things I observed:

When I examined the problem files on the backend, I would sometimes see both the actual file (that is, it had normal permissions and ownership) and the pointer file (which would have permissions that looked like this: ---------T ) on the same backend server. Normally, if a client referenced a file that was stored on Server B but requested it from Server A, it seems that GlusterFS would create one of these 0000 permission files with extended attributes that point to the actual file on Server B. Effectively, it is a symbolic link to the actual location.
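
For anyone who wants to hunt for such pointer files on a brick, here is a rough sketch. It rests on assumptions that are not stated above: that a pointer file is a zero-length regular file whose only mode bit is the sticky bit (what ls shows as ---------T), and that the pointer itself lives in the extended attribute trusted.glusterfs.dht.linkto. The function names are made up for illustration, and reading trusted.* attributes needs root on the storage server.

```python
import os
import stat

# Assumed name of the extended attribute that holds the pointer target.
LINKTO_XATTR = "trusted.glusterfs.dht.linkto"


def looks_like_pointer_file(st_mode, st_size):
    """True for a zero-length regular file whose only mode bit is the sticky bit."""
    return (
        stat.S_ISREG(st_mode)
        and stat.S_IMODE(st_mode) == stat.S_ISVTX
        and st_size == 0
    )


def find_pointer_files(brick_root):
    """Yield (path, target) for every suspected pointer file under brick_root.

    target is the raw attribute value, or None if it could not be read
    (for example when not running as root).
    """
    for dirpath, _dirnames, filenames in os.walk(brick_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not looks_like_pointer_file(st.st_mode, st.st_size):
                continue
            try:
                target = os.getxattr(path, LINKTO_XATTR, follow_symlinks=False)
            except OSError:
                target = None
            yield path, target
```

Running find_pointer_files over each brick directory on a server and checking whether a real file of the same name sits on that same server would flag the situation described above.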

I never completely pinned down the root cause of this, but various people helped me troubleshoot to the point that I could identify these files as problems.

Another occurrence was that I would find two of the "symbolic link / 0000 perm" files on one pair of mirrors, but no actual file with content on the other pair.

My ultimate solution (though painful) was this:

Begin with a fresh rsync of the data from the sources to new clean directories mounted on a Gluster native client.

1)	Stop gluster-client on all hosts in the attached list.
2)	Stop gluster volume pfs-ro1 on servers
3)	Start gluster volume pfs-ro1 on servers
4)	Start gluster-client on a single client
5)	Rename current /pfs2/some_dir/2009 to 2009.old
6)	Rename current /pfs2/some_dir/2010 to 2010.old
7)	Rename /pfs2/some_dir/test/2009 to /pfs2/some_dir/2009
8)	Rename /pfs2/some_dir/test/2010 to /pfs2/some_dir/2010
9)	Upgrade clients running the older 3.1.1 release of GlusterFS to 3.1.3 (list attached)
10)	Start gluster-client on all hosts in the first attached list
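
The rename part of the list (steps 5 through 8) could be scripted roughly as follows. This is only a sketch: the function name and the ".old" suffix parameter are illustrative, and it assumes it is run on the single client that has the volume mounted while all other clients are stopped.

```python
import os


def swap_in_fresh_copy(base_dir, staging_dir, names, suffix=".old"):
    """Move each live directory aside, then move the fresh copy into place."""
    # Check everything first so we never stop halfway through the swap.
    for name in names:
        live = os.path.join(base_dir, name)
        fresh = os.path.join(staging_dir, name)
        if not os.path.isdir(live):
            raise FileNotFoundError(f"live directory missing: {live}")
        if not os.path.isdir(fresh):
            raise FileNotFoundError(f"fresh copy missing: {fresh}")
        if os.path.exists(live + suffix):
            raise FileExistsError(f"backup already exists: {live + suffix}")

    # Steps 5-6: rename the current directories out of the way.
    for name in names:
        live = os.path.join(base_dir, name)
        os.rename(live, live + suffix)

    # Steps 7-8: rename the freshly rsynced directories into place.
    for name in names:
        os.rename(os.path.join(staging_dir, name), os.path.join(base_dir, name))
```

With the paths from the list, the call would be swap_in_fresh_copy("/pfs2/some_dir", "/pfs2/some_dir/test", ["2009", "2010"]).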

Hopefully this helps. I wish I could say definitively what the core problems were. I believe, from conversations with Jeff Darcy, Mohit, and others, that my extended attributes on the backend servers for the top level directories (e.g. /export/read-only/g1 through g10) were corrupted at some point during a gluster volume rebalance activity, probably after I had brought up the initial pair of mirrors and then added another pair.
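
To look at those extended attributes yourself, something like the sketch below can dump them for a top level brick directory so the output from the two servers of a mirror pair can be compared. The function names are illustrative. Treating any difference between the two halves of a mirror as worth a closer look is my assumption, not a documented rule, and the trusted.* attributes are only visible to root.

```python
import os


def dump_xattrs(path):
    """Return {attribute name: hex-encoded value} for one directory.

    Returns an empty dict if the attributes cannot be listed.
    """
    result = {}
    try:
        names = os.listxattr(path)
    except OSError:
        return result
    for name in names:
        try:
            result[name] = os.getxattr(path, name).hex()
        except OSError:
            result[name] = None  # listed but not readable
    return result


def differing_xattrs(a, b):
    """Names whose values differ, or exist on one side only, between two dumps."""
    return sorted(k for k in set(a) | set(b) if a.get(k) != b.get(k))
```

Run dump_xattrs on the same brick directory on each server of a pair, then pass the two results to differing_xattrs to see which attribute names disagree.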

James Burnash, Unix Engineering

-----Original Message-----
From: gluster-users-bounces at gluster.org [mailto:gluster-users-bounces at gluster.org] On Behalf Of Tomasz Chmielewski
Sent: Monday, May 09, 2011 12:30 PM
To: Mohit Anchlia
Cc: Gluster General Discussion List
Subject: [SPAM?] Re: hangs on accessing files on gluster mount (3.2.0)
Importance: Low

On 09.05.2011 18:26, Mohit Anchlia wrote:
> Are you able to read the file after you re-start gluster?

After I restart gluster, I'm able to read the files just fine.

To clarify - when it hangs, it hangs on accessing just some files.

I.e. I'm able to read /mnt/gluster/some/file/1.gif, but not 
/mnt/gluster/some/file/2.gif


> Can you try
> to read using strace for the files you see hanging and post it here?
> It might help developers to take a look.
>
> I also suggest opening a bug since it looks like a critical issue.

I'll try to do some more debugging next time I see it.

-- 
Tomasz Chmielewski
http://wpkg.org

