On Dec 29, 2007 12:08 AM, Gareth Bult <gareth@xxxxxxxxxxxxx> wrote:
> Ok, a little more testing .. wait, it gets better!
>
> I now have a x10 stripe.
>
> 10 stripes on node a - 10 AFRs - 1 x10 stripe
> 10 stripes on node b - (for self-heal) (to divide the heal chunk size)
>
> Sample;
>
> a. create a 500M file
> b. take down one glusterfsd process
> c. append 2M to the file
> d. bring glusterfsd back up
> e. head -c1 on the file
>
> Problem #1;
>
> On a self-heal, it does a self-heal on every stripe, regardless of the
> fact that I only appended 2M.

It is a bug then. Only 2 AFRs should have done self-heal: the ones to
which the two 1MB chunks of the 2MB append were written. How did you
confirm that all 10 AFRs did self-heal?

>
> Problem #2;
>
> Self-heal ignores the fact that the file is sparse and copies the
> entire stripe .. so for a 500M file, the healing process actually
> copies 5GB!!

I need to check how it behaves on holes, as no one has reported this bug
before.

Thanks
Krishna

>
> Help! Bug!
>
> Here's a client config summary; (the server config is fairly obvious)
> ...
> volume stripes-stripe
>   type cluster/stripe
>   subvolumes afr-1 afr-2 afr-3 afr-4 afr-5 afr-6 afr-7 afr-8 afr-9 afr-10
>   option block-size *:1MB
> end-volume
> ...
> volume afr-1
>   type cluster/afr
>   subvolumes node1A node1B
>   option replicate *:2
>   option scheduler rr
> end-volume
> ...
> volume node1A
>   type protocol/client
>   option transport-type tcp/client
>   option remote-host nodea
>   option remote-subvolume stripes-1A
> end-volume
>
> volume node1B
>   type protocol/client
>   option transport-type tcp/client
>   option remote-host nodez
>   option remote-subvolume stripes-1B
> end-volume
>
>
> ----- Original Message -----
> From: "Csibra Gergo" <gergo@xxxxxxxxx>
> To: "Gareth Bult" <gareth@xxxxxxxxxxxxx>
> Sent: Friday, December 28, 2007 3:47:03 PM (GMT) Europe/London
> Subject: Re: Choice of Translator question
>
> Friday, December 28, 2007, 3:57:52 PM, Gareth Bult wrote:
>
> >> Oh. I don't understand this, can you explain why you need to change
> >> configs regularly?
>
> > To add new systems,
>
> Add new systems to...? To the server? Adding new clients?
>
> > install newer versions of fuse and glusterfs,
>
> Neither this nor the point above is a reason to rebuild/remirror AFR-ed
> files. If a new version of glusterfs or fuse comes out, or you need to
> add new systems anywhere in the glusterfs setup, you do a shutdown
> (unmount, kill glusterfsd - that is actually a regular shutdown),
> install the new version and restart glusterfs. The xattrs of the
> mirrored files (they contain the version information) will be the same
> before and after the version change, so the files will not be
> remirrored.
>
> > to recover from fuse kernel lockups ..
>
> Yes. If this happens during a write, then yes, in that situation the
> AFR-ed files need to be healed.
>
> > .. I've not yet seen raid/afr comments from anyone who actually
> > understands the problem, so I'm not likely to see a fix (?!)
>
> I understand this problem, but it is caused by bugs. The people at Z
> Research are working hard to fix them, and you give them a lot of
> information when you file bug reports :)
>
>
> --
> Best regards,
> Csibra Gergo                     mailto:gergo@xxxxxxxxx
>
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel@xxxxxxxxxx
> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>
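
A rough sketch (Python, purely illustrative) of the arithmetic behind the
two problems above. It assumes the stripe translator hands out fixed 1MB
chunks round-robin across the 10 subvolumes and writes each chunk at its
original offset, leaving holes in the backend files - which is what the
sparse backend files described above suggest; the helper name is made up.

    # Model of 1MB round-robin striping over 10 subvolumes (assumption:
    # chunk k lands on subvolume k % N at its original offset).
    MB = 1024 * 1024
    BLOCK_SIZE = 1 * MB        # option block-size *:1MB
    SUBVOLUMES = 10            # afr-1 .. afr-10

    def subvolumes_touched(offset, length):
        """Return the set of stripe subvolumes a write at (offset, length) hits."""
        first_chunk = offset // BLOCK_SIZE
        last_chunk = (offset + length - 1) // BLOCK_SIZE
        return {chunk % SUBVOLUMES for chunk in range(first_chunk, last_chunk + 1)}

    # The test case: a 2MB append to an existing 500MB file.
    touched = subvolumes_touched(offset=500 * MB, length=2 * MB)
    print(sorted(touched))     # -> [0, 1]: only 2 of the 10 AFRs saw writes,
                               #    so only those 2 should need self-heal.

    # The sparse-file problem: each subvolume's backend file has an apparent
    # size of ~500MB (about 50MB of data plus 450MB of holes). A self-heal
    # that ignores holes and copies the full apparent size of all 10 backend
    # files moves roughly:
    apparent_size = 500 * MB
    print(SUBVOLUMES * apparent_size / (1024 ** 3))   # ~4.9 GB, i.e. the ~5GB observed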
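
And a minimal model of the version-xattr comparison Csibra describes (a
clean unmount/upgrade/restart leaves the versions equal, so nothing is
remirrored; a replica that misses writes falls behind and gets healed).
The function and the exact comparison below are simplified assumptions,
not the real AFR implementation.

    # Simplified model of the self-heal decision described above: each
    # replica carries a version counter in an xattr that is bumped on
    # writes; replicas whose versions already match are left alone.
    def needs_self_heal(versions):
        """versions: the version counters read from each replica's xattr."""
        return len(set(versions)) > 1    # any mismatch -> heal the stale copy

    print(needs_self_heal([7, 7]))   # False: upgrade/restart, no writes missed
    print(needs_self_heal([8, 7]))   # True: one glusterfsd was down during a
                                     #       write, so its replica is stale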