Re: Hundreds of duplicate files


 



Thanks Tom and Joe,
for the fast response!

Before I started my upgrade I stopped all clients using the volume and shut down all VMs with VHDs on the volume. However, and this may be the missing piece needed to reproduce this in a lab, I did not detach an NFS shared storage mount from a XenServer pool to this volume, since detaching it is extremely risky. I also did not stop the volume. That was a bit stupid, I guess, but since I had done upgrades this way in the past without any issues, I skipped this step (a really bad habit). I'll make amends and file a proper bug report :-). I agree with you, Joe: this should never happen, even when someone ignores the advice to stop the volume. If it were also necessary to detach shared storage NFS connections to a volume, then frankly, glusterfs is unusable in a private cloud. No one can afford downtime of the whole infrastructure just for a glusterfs upgrade. Ideally, a replicated gluster volume should even be able to remain online and in use during an upgrade (at least a minor version upgrade).

I don't know whether a heal was still busy when I started the upgrade; I forgot to check. I did check the CPU activity on the gluster nodes, which was very low (in the 0.0X range via top), so I doubt it. I will add this to the bug report as a suggestion, in case they cannot reproduce the problem with an open NFS connection.

By the way, is it sufficient to do:
service glusterd stop
service glusterfsd stop
and do a:
ps aux | grep gluster
to see if everything has stopped and kill any leftovers should this be necessary?
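
The leftover check could also be scripted. What follows is only a rough sketch, not an official procedure: it reads process names from /proc (so it assumes Linux) and reports anything whose name starts with "gluster". The helper names running_processes and gluster_leftovers are made up for this example.

```python
import os


def running_processes(proc_root="/proc"):
    """Return (pid, name) pairs read from <proc_root>/<pid>/comm (Linux only)."""
    found = []
    if not os.path.isdir(proc_root):
        # No procfs on this system; nothing to report.
        return found
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as handle:
                found.append((int(entry), handle.read().strip()))
        except OSError:
            # The process exited (or is unreadable) while scanning; skip it.
            continue
    return found


def gluster_leftovers(processes):
    """Keep only the processes whose name starts with 'gluster'."""
    return [(pid, name) for pid, name in processes if name.startswith("gluster")]


if __name__ == "__main__":
    for pid, name in gluster_leftovers(running_processes()):
        print(pid, name)
```

An empty result after stopping the services would suggest nothing is left running; anything printed is a candidate for a manual kill.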

For the fix, do you agree that if I run e.g.:
find /export/* -type f -size 0 -perm 1000 -exec /bin/rm {} \;
on every node, where /export is the location of all my bricks, this will be safe, also in a replicated set-up?
Will no necessary 0-byte files be deleted, e.g. in the .glusterfs directory of every brick?
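
Before deleting anything, a dry run that only lists the matches may be worth doing. The sketch below mirrors the tests in the find command above (regular file, size 0, permission bits exactly 1000) but removes nothing. The function names are invented for this example, and /export in the usage note is simply the brick location mentioned above.

```python
import os
import stat


def is_linkfile_candidate(mode, size):
    """Mirror `find -type f -size 0 -perm 1000`: an empty regular file
    whose permission bits are exactly 1000 (sticky bit only)."""
    return stat.S_ISREG(mode) and size == 0 and stat.S_IMODE(mode) == 0o1000


def list_candidates(brick_root):
    """Walk one brick and return the matching paths without deleting anything."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(brick_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
            if is_linkfile_candidate(info.st_mode, info.st_size):
                matches.append(path)
    return sorted(matches)
```

Calling list_candidates("/export") on each node and reviewing the output, including any paths under .glusterfs, shows what the rm would touch. As far as I know, DHT link files also carry a trusted.glusterfs.dht.linkto extended attribute, so checking a few matches with getfattr as root would be a stricter test than mode and size alone.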

Thanks for your support!

Cheers,
Olav






On 18/02/15 20:51, Joe Julian wrote:

On 02/18/2015 11:43 AM, tbenzvi@xxxxxxxxxxxxxxx wrote:
Hi Olav,

I have a hunch that our problem was caused by improper unmounting of the gluster volume, and have since found that the proper order should be: kill all jobs using volume -> unmount volume on clients -> gluster volume stop -> stop gluster service (if necessary)
 
In my case, I wrote a Python script to find duplicate files on the mounted volume, then delete the corresponding link files on the bricks (making sure to also delete files in the .glusterfs directory)
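
The script itself was not posted to the list. As a minimal sketch of the detection half only, and assuming the duplicates show up as the same name listed more than once in a directory on the mounted volume, something like the following could work. The function names and the mount point argument are hypothetical.

```python
import os
from collections import Counter


def duplicate_names(entries):
    """Return the names that occur more than once in a directory listing."""
    counts = Counter(entries)
    return sorted(name for name, seen in counts.items() if seen > 1)


def find_duplicates(mount_point):
    """Map each directory under mount_point to its repeated entry names."""
    report = {}
    for dirpath, dirnames, filenames in os.walk(mount_point):
        repeated = duplicate_names(dirnames + filenames)
        if repeated:
            report[dirpath] = repeated
    return report
```

The resulting report only says which names to look for on the bricks; locating and removing the corresponding link files is a separate step.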
 
However, your find command was also suggested to me and I think it's a simpler solution. I believe removing all link files (even ones that are not causing duplicates) is fine, since on the next file access gluster will do a lookup on all bricks and recreate any link files if necessary. Hopefully a gluster expert can chime in on this point as I'm not completely sure.

You are correct.

 
Keep in mind your setup is somewhat different than mine as I have only 5 bricks with no replication.
 
Regards,
Tom
 
--------- Original Message ---------
Subject: Re: Hundreds of duplicate files
From: "Olav Peeters" <opeeters@xxxxxxxxx>
Date: 2/18/15 10:52 am
To: gluster-users@xxxxxxxxxxx, tbenzvi@xxxxxxxxxxxxxxx

Hi all,
I have this problem after upgrading from 3.5.3 to 3.6.2.
At the moment I am still waiting for a heal to finish (on a 31TB volume with 42 bricks, replicated over three nodes).

Tom,
how did you remove the duplicates?
With 42 bricks I will not be able to do this manually.
Did a:
find $brick_root -type f -size 0 -perm 1000 -exec /bin/rm {} \;
work for you?

Should this type of thing ideally not be checked and mended by a heal?

Does anyone have an idea yet how this happens in the first place? Can it be connected to upgrading?

Cheers,
Olav
 
On 01/01/15 03:07, tbenzvi@xxxxxxxxxxxxxxx wrote:
No, the files can be read on a newly mounted client! I went ahead and deleted all of the link files associated with these duplicates, and then remounted the volume. The problem is fixed!
Thanks again for the help, Joe and Vijay.
 
Tom
 
--------- Original Message ---------
Subject: Re: Hundreds of duplicate files
From: "Vijay Bellur" <vbellur@xxxxxxxxxx>
Date: 12/28/14 3:23 am
To: tbenzvi@xxxxxxxxxxxxxxx, gluster-users@xxxxxxxxxxx

On 12/28/2014 01:20 PM, tbenzvi@xxxxxxxxxxxxxxx wrote:
> Hi Vijay,
> Yes the files are still readable from the .glusterfs path.
> There is no explicit error. However, trying to read a text file in
> python simply gives me null characters:
>
> >>> open('ott_mf_itab').readlines()
> ['\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00']
>
> And reading binary files does the same
>

Is this behavior seen with a freshly mounted client too?

-Vijay

> --------- Original Message ---------
> Subject: Re: Hundreds of duplicate files
> From: "Vijay Bellur" <vbellur@xxxxxxxxxx>
> Date: 12/27/14 9:57 pm
> To: tbenzvi@xxxxxxxxxxxxxxx, gluster-users@xxxxxxxxxxx
>
> On 12/28/2014 10:13 AM, tbenzvi@xxxxxxxxxxxxxxx wrote:
> > Thanks Joe, I've read your blog post as well as your post regarding the
> > .glusterfs directory.
> > I found some unneeded duplicate files which were not being read
> > properly. I then deleted the link file from the brick. This always
> > removes the duplicate file from the listing, but the file does not
> > always become readable. If I also delete the associated file in the
> > .glusterfs directory on that brick, then some more files become
> > readable. However this solution still doesn't work for all files.
> > I know the file on the brick is not corrupt as it can be read directly
> > from the brick directory.
>
> For files that are not readable from the client, can you check if the
> file is readable from the .glusterfs/ path?
>
> What is the specific error that is seen while trying to read one such
> file from the client?
>
> Thanks,
> Vijay
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users@xxxxxxxxxxx
> http://www.gluster.org/mailman/listinfo/gluster-users
>


