Files present on the backend but have become invisible from clients

On Jun 10, 2011, at 1:11 PM, Burnash, James wrote:

> Hi Amar.
>  
> Is there a projected release date for 3.1.5 and 3.2.1?
>  
> From: gluster-users-bounces at gluster.org [mailto:gluster-users-bounces at gluster.org] On Behalf Of Burnash, James
> Sent: Friday, May 27, 2011 1:42 PM
> To: 'Amar Tumballi'
> Cc: gluster-users at gluster.org
> Subject: Re: Files present on the backend but have become invisible from clients
>  
> Thank you, Amar. This is much appreciated and gives me a better understanding of the meaning of some of these attributes.
>  
> I would like to suggest that something at least on this level be added to the Gluster documentation for future use by others, as well as myself (I forget sometimes :-)
>  
> So, as far as what I can do about these incorrect Replicate attributes, it appears that the answer is "nothing until the next release"? Or will triggering a self heal on those directories specifically clean things up?
>  
> Thanks as always,
>  
> James
>  
> From: Amar Tumballi [mailto:amar at gluster.com] 
> Sent: Friday, May 27, 2011 12:23 PM
> To: Burnash, James
> Cc: Mohit Anchlia; gluster-users at gluster.org
> Subject: Re: Files present on the backend but have become invisible from clients
>  
> James,
>  
> Replies inline.
> 
> The directories are all still visible to the users, but scanning for attributes of 0sAAAAAAAAAAAAAAAA still yielded matches on the set of GlusterFS servers.
> 
> http://pastebin.com/mxvFnFj4
> 
> I tried running this command, but as you can see it wasn't happy, even though the syntax was correct:
> 
> root at jc1letgfs17:~# gluster volume rebalance pfs-ro1 fix-layout start
> Usage: volume rebalance <VOLNAME> [fix-layout|migrate-data] {start|stop|status}
> 
> I suspect this is a bug because of the "-" in my volume name. I'll test and confirm and file when I get a chance.
> 
>  
> This seems to be a bug with the 'fix-layout' CLI option itself (I assume the version is 3.1.3; it's fixed in 3.1.4+ and 3.2.0). Please use just 'rebalance <VOLNAME> start'.
>  
>  
> So I just did the standard rebalance command:
>  gluster volume rebalance pfs-ro1 start
> 
> and it trundled along for a while, and then one time when I checked its status, it had failed:
>  date; gluster volume rebalance pfs-ro1 status
>  Thu May 26 09:02:00 EDT 2011
>  rebalance failed
> 
> I re-ran it FOUR times getting a little farther with each attempt, and it eventually completed and then started doing the actual file migration part of the rebalance:
>  Thu May 26 12:22:25 EDT 2011
>  rebalance step 1: layout fix in progress: fixed layout 779
>  Thu May 26 12:23:25 EDT 2011
>  rebalance step 2: data migration in progress: rebalanced 71 files of size 136518704 (total files scanned 57702)
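The repeated manual status checks above could be scripted. This is only a sketch, assuming the status text contains "in progress" while a rebalance is running (as in the output quoted above) and that any other output means it has stopped. The function name `watch_rebalance` and the `GLUSTER` override variable are hypothetical helpers, not part of the gluster CLI.

```shell
# Poll rebalance status until it is no longer reported as "in progress".
# GLUSTER may be set to a stub for testing; it defaults to the real CLI.
watch_rebalance() {
    volname="$1"
    interval="${2:-60}"          # seconds between checks
    while :; do
        status="$(${GLUSTER:-gluster} volume rebalance "$volname" status 2>&1)"
        date
        printf '%s\n' "$status"
        case "$status" in
            *"in progress"*) sleep "$interval" ;;   # still running, check again
            *) break ;;                             # failed, finished, or unknown
        esac
    done
}
```

Calling `watch_rebalance pfs-ro1` would print a timestamp and the status once a minute, stopping at the first status line that does not say "in progress" (for example "rebalance failed").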
> 
> Now scanning for attributes of 0sAAAAAAAAAAAAAAAA yields fewer results, but some are still present:
> 
>  
> Now, doing a 'rebalance' is surely not the way to heal the 'replicate' related attributes. 'rebalance' is all about fixing the 'distribute' related 'layout's and rebalancing the data within the servers.
>  
> It could have helped in resolving some of the 'replicate' attributes, as issuing a rebalance triggers a directory traversal on the volume (which is in fact the same as doing an 'ls -lR' or 'find' on the volume).
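The traversal described here can also be run directly against a client mount, without starting a rebalance. A minimal sketch, assuming the volume is mounted on a client at a hypothetical path such as /mnt/pfs-ro1 (not a backend export directory); the function name `trigger_lookup` is made up for illustration. As noted later in this message, files already marked as split-brain are not expected to be healed this way.

```shell
# Walk every entry under a client mount and stat it, so that each file
# and directory gets looked up through the GlusterFS client stack.
trigger_lookup() {
    mount_point="$1"
    find "$mount_point" -print0 | xargs -0 stat > /dev/null
}

# Example (assumed mount point, adjust to your setup):
# trigger_lookup /mnt/pfs-ro1
```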
>  
> http://pastebin.com/x4wYq8ic
> 
> As a possible sanity check, I did this command on my Read-Write GlusterFS storage servers (2 boxes, Distributed-Replicate), and got no "bad" attributes:
>  jc1ladmin1:~/projects/gluster  loop_check ' getfattr -dm - /export/read-only/g*' jc1letgfs{13,16} | egrep "jc1letgfs|0sAAAAAAAAAAAAAAAA$|file:" | less
>  getfattr: /export/read-only/g*: No such file or directory
>  getfattr: /export/read-only/g*: No such file or directory
>  jc1letgfs13
>  jc1letgfs16
> 
> One difference between these two storage server groups: the Read-Only group of 4 servers has its backend file systems formatted as XFS, while the Read-Write group of 2 is formatted with EXT4.
> 
> Suggestions, critiques, etc gratefully solicited.
> 
>  
> Please, next time you look at the GlusterFS attributes, use '-e hex' with the 'getfattr' command. Anyway, I think the issue here is mostly due to some sort of bug which resulted in writing attributes saying a 'split-brain' happened, and if that is the attribute, the 'replicate' module doesn't heal anything and leaves the file as is (without even fixing the attribute).
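For values already captured in getfattr's default output, the "0s" prefix marks base64 encoding, and the same bytes can be converted to the "0x" hex form after the fact. A small sketch, assuming GNU coreutils (`base64`, `od`, `tr`); the helper name `xattr_b64_to_hex` is hypothetical.

```shell
# Convert a base64-encoded value as printed by getfattr (prefix "0s")
# into the hex form that 'getfattr -e hex' prints (prefix "0x").
xattr_b64_to_hex() {
    value="${1#0s}"    # strip getfattr's base64 marker
    printf '0x%s\n' "$(printf '%s' "$value" | base64 -d | od -An -v -tx1 | tr -d ' \n')"
}

xattr_b64_to_hex 0sAAAAAAAAAAAAAAAA
# prints 0x000000000000000000000000 (12 zero bytes)

# To get hex directly from the backend, as suggested above:
# getfattr -d -m - -e hex /export/read-only/g*
```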
>  
> We are currently working on fixing these metadata self-heal related issues and hope to fix many of them by 3.2.1 (and 3.1.5).


Is there a way to fix that attribute until the new version comes out?
Unmounting and re-mounting doesn't seem to work on this client, but other client nodes work fine.
[2011-06-10 18:28:08.966770] E [afr-common.c:110:afr_set_split_brain] gluster_replication-replicate-0: invalid argument: inode

$ gluster --version
glusterfs 3.1.2 built on Jan 18 2011 11:19:54
Repository revision: v3.1.1-64-gf2a067c

> Regards,
> Amar
>  
> James Burnash
> Unix Engineer.
>  
>  
> 
