Re: Self-heal with partial files

Hi Krishna/Kevan,

I tested with glusterfs--mainline--2.5--patch-515, and self-heal is
working fine without any hiccups.

regards,

On 10/6/07, Krishna Srinivas <krishna@xxxxxxxxxxxxx> wrote:
>
> Hi Kevan,
>
> This particular case is failing with older kernel+fuse versions which
> send mknod+open for a create call. I will fix the issue and let you know.
>
> Thanks
> Krishna
>
> On 10/4/07, Kevan Benson <kbenson@xxxxxxxxxxxxxxx> wrote:
> > Krishna Srinivas wrote:
> > > On 10/4/07, Kevan Benson <kbenson@xxxxxxxxxxxxxxx> wrote:
> > >
> > >> Is self heal supposed to work with partial files?  I have an issue where
> > >> self-heal isn't happening on some servers with AFR and unify in a HA
> > >> setup I developed.  Two servers, two clients, all AFR and unify done on
> > >> client side.
> > >>
> > >> If I kill a connection while a large file is being written, the
> > >> glusterfs mount waits the appropriate timeout period (10 seconds in my
> > >> case) and then finishes writing the file to the still active server.
> > >> This results in a full file on one server and a partial file on the
> > >> other (the one I stopped traffic to temporarily to simulate a
> > >> crash/network problem).  If I then enable the disabled server and read
> > >> data from the problematic file, it doesn't self-heal itself and move
> > >> the full file to the server with the partial file.
> > >>
> > >> Anything written entirely while a server is offline (i.e. the offline
> > >> server has no knowledge of it) is correctly created on read from the
> > >> file, so the problem seems to be related to files that are partially
> > >> written to one server.
> > >>
> > >> Can someone comment on the particular conditions that cause a self
> > >> heal?  Is there something I can do to force it to self heal at this
> > >> point?  (I repeat that reading data from the file does not work.)  I know
> > >> I can use rsync and some foo to fix this, but that becomes less and less
> > >> feasible as the mount size grows and the time for rsync to compare sides
> > >> lengthens.
> > >>
> > >>
> > >> _______________________________________________
> > >> Gluster-devel mailing list
> > >> Gluster-devel@xxxxxxxxxx
> > >> http://lists.nongnu.org/mailman/listinfo/gluster-devel
> > >>
> > >>
> > >
> > > Hi Kevan,
> > >
> > > It should have worked fine in your case. What version of glusterfs are
> > > you using? Just before you do the second read (or open rather) which
> > > should have triggered self-heal can you do getfattr -n trusted.afr.version <>
> > > on the partial file and also the full file in the backend and give the output?
> > >
> > > Thanks
> > > Krishna
> > >
> > >
> >
> > Glusterfs TLA 504, fuse-2.7.0-gfs4.
> >
> > The trusted.afr.version attribute doesn't exist on the partial file, but it
> > does exist on the complete file (with value "1").  From what I just
> > tested, it doesn't look like it's set until the file operation is
> > complete (it doesn't exist during writing).  Are files without this
> > attribute assumed to have a value of "0" or something to ensure that
> > they participate in self-heal correctly?
> >
> > It doesn't look like it: if I append data to the file, the partial
> > version gets assigned a trusted.afr.version=1, while the complete file's
> > trusted.afr.version is incremented to 2.  Self heal now works for that
> > file, and on a read of file data the partial file is updated with all
> > data and the trusted.afr.version is set to 2.
> >
> >
>
>
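
For anyone following the thread, the version comparison Kevan describes
above can be sketched roughly as follows. This is only an illustration in
Python, not GlusterFS code. The attribute name trusted.afr.version comes
from the thread; the helper names read_afr_version and plan_heal are made
up for this sketch, the value is assumed to be a decimal string such as "1",
and treating a missing attribute as version 0 is the hypothetical policy
Kevan asks about, not confirmed AFR behaviour.

```python
import errno
import os
from typing import Dict, List, Optional, Tuple

AFR_VERSION_XATTR = "trusted.afr.version"  # attribute name taken from the thread


def read_afr_version(path: str) -> Optional[int]:
    """Return the AFR version stored on a backend file, or None if unset.

    Assumption: the value is a decimal string such as "1". Linux only;
    reading trusted.* attributes normally requires root.
    """
    try:
        raw = os.getxattr(path, AFR_VERSION_XATTR)
    except OSError as exc:
        if exc.errno == errno.ENODATA:  # attribute not present on this copy
            return None
        raise
    return int(raw.decode().strip("\x00 \n"))


def plan_heal(versions: Dict[str, Optional[int]]) -> Tuple[Optional[str], List[str]]:
    """Pick a heal source and the stale copies from per-copy versions.

    Hypothetical policy: a missing attribute (None) counts as version 0,
    so a partial file without the attribute is always considered stale.
    """
    if not versions:
        return None, []
    effective = {copy: (v if v is not None else 0) for copy, v in versions.items()}
    newest = max(effective.values())
    source = next(copy for copy, v in effective.items() if v == newest)
    stale = [copy for copy, v in effective.items() if v < newest]
    # No stale copies means nothing to heal, so no source is reported.
    return (source if stale else None), stale
```

With the state Kevan reported (version 1 on the complete file, no attribute
on the partial one), plan_heal({"server1": 1, "server2": None}) would name
server1 as the source and server2 as stale. Run as root on each backend,
read_afr_version gives the same information as the getfattr call Krishna
asked for.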



-- 
Raghavendra G

A centipede was happy, until a toad in fun,
Said, "Pray, which leg comes after which?",
This raised his doubts to such a pitch,
He fell flat into the ditch,
Unable to know how to run.
-Anonymous

