Re: AFR problem with 2.0rc4

More on this:
in fact, I cannot say that self-heal works very well.
I am trying to cp some big files (10 GB). A few are OK, but the others end
up only partially copied: on the first server my file is 10 GB (OK), but on
the second it is 8 GB (not OK), and synchronisation does not occur.
I tried umounting, without success; I tried setting option favorite-child
to the first server and then umounting, without success; and I tried
deleting the file in the storage of server two, same problem, and now that
file is never seen at all.

So I changed the client config:

subvolumes brick_10.98.98.2 brick_10.98.98.1   =>   subvolumes brick_10.98.98.1 brick_10.98.98.2

and now the client mount point sees the file (the one I removed on server
two), so self-heal can begin.
(But the file in the storage of server two stays at size 0; only the empty
file gets created....)
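
(For reference, this is roughly what the replicate section of my client
volfile looks like after the change; read-subvolume and favorite-child
still point to the first server, exactly as in the config quoted below.)

volume last
type cluster/replicate
subvolumes brick_10.98.98.1 brick_10.98.98.2    # brick_10.98.98.1 now listed first
option read-subvolume brick_10.98.98.1
option favorite-child brick_10.98.98.1
end-volume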

So there seem to be quite a few weird problems here. First of all, why is
the order of the subvolumes so important?

Regards,
Nicolas Prochazka



On Thu, Mar 19, 2009 at 9:58 AM, nicolas prochazka
<prochazka.nicolas@xxxxxxxxx> wrote:
> I am trying the latest gluster from git.
> The bug is corrected, but there still seems to be a lot of weird behaviour in AFR mode.
> If I take down one of the two servers, clients either do not respond to an ls at all, or
> respond but without all the files, sometimes just one....
> I have tried with and without a lock server (set to 2, 1 or 0); the results are the same.
>
> Regards,
> Nicolas Prochazka
>
> On Wed, Mar 18, 2009 at 9:33 AM, Amar Tumballi <amar@xxxxxxxxxxx> wrote:
>> Hi Nicolas,
>>  Sure. We are in the process of internal testing; it should be out as a
>> release soon. Meanwhile, you can pull from git and test it out.
>>
>> Regards,
>>
>> On Wed, Mar 18, 2009 at 1:30 AM, nicolas prochazka
>> <prochazka.nicolas@xxxxxxxxx> wrote:
>>>
>>> Hello,
>>> I see the correction of the AFR heal bug in the git tree.
>>> Can we test this version? Is it stable enough compared to the rc releases?
>>> nicolas
>>>
>>> On Tue, Mar 17, 2009 at 9:39 PM, nicolas prochazka
>>> <prochazka.nicolas@xxxxxxxxx> wrote:
>>> > My test is:
>>> > Set up two servers in AFR mode.
>>> > Copy files to the mount point ( /mnt/vdisk ): OK, synchronisation is fine
>>> > on both servers.
>>> > Then delete (rm) all files from the storage on server 1 ( /mnt/disks/export )
>>> > and wait for resynchronisation.
>>> > With rc2 and rc4: the files show the right size (ls -l) but contain nothing
>>> > (df -b shows no disk usage) and the files are corrupt.
>>> > With rc1: everything is OK, the server resynchronises perfectly. I think that
>>> > is the right way ;)
>>> >
>>> > nicolas
>>> >
>>> > On Tue, Mar 17, 2009 at 6:49 PM, Amar Tumballi <amar@xxxxxxxxxxx> wrote:
>>> >> Hi Nicolas,
>>> >>  When you say you 'add' a server here, do you mean adding another server
>>> >> to the replicate subvolume (i.e. going from 2 to 3), or did you have one
>>> >> server down while copying data (with 2 servers), and then bring that
>>> >> server back up and trigger the AFR self-heal?
>>> >>
>>> >> Regards,
>>> >> Amar
>>> >>
>>> >> On Tue, Mar 17, 2009 at 7:22 AM, nicolas prochazka
>>> >> <prochazka.nicolas@xxxxxxxxx> wrote:
>>> >>>
>>> >>> Yes, I tried without any performance translators, but the bug persists.
>>> >>>
>>> >>> In the logs I cannot see anything interesting; the size of the file always
>>> >>> seems to be OK when it begins to synchronise.
>>> >>> As I wrote before, if I cp files during normal operation (both servers up)
>>> >>> everything is fine; the problem only appears when I try to resynchronise
>>> >>> (rm everything in the storage/posix directory on one of the servers):
>>> >>> gluster recreates the files, but empty or with corrupt data.
>>> >>>
>>> >>> I also notice that with RC1, during resynchronisation, an ls on the mount
>>> >>> point blocks until the synchronisation has finished; with RC2, ls does not
>>> >>> block.
>>> >>>
>>> >>> Regards,
>>> >>> Nicolas
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>> On Tue, Mar 17, 2009 at 2:50 PM, Gordan Bobic <gordan@xxxxxxxxxx>
>>> >>> wrote:
>>> >>> > Have you tried the later versions (rc2/rc4) without the performance
>>> >>> > translators? Does the problem persist without them? Anything
>>> >>> > interesting-looking in the logs?
>>> >>> >
>>> >>> > On Tue, 17 Mar 2009 14:46:41 +0100, nicolas prochazka
>>> >>> > <prochazka.nicolas@xxxxxxxxx> wrote:
>>> >>> >> Hello again,
>>> >>> >> so this bug does not occur with RC1.
>>> >>> >>
>>> >>> >> RC2 and RC4 contain the bug described below, but RC1 does not. Any idea?
>>> >>> >> Nicolas
>>> >>> >>
>>> >>> >> On Tue, Mar 17, 2009 at 12:55 PM, nicolas prochazka
>>> >>> >> <prochazka.nicolas@xxxxxxxxx> wrote:
>>> >>> >>> I just tried with rc2; same bug as with rc4.
>>> >>> >>> Regards,
>>> >>> >>> Nicolas
>>> >>> >>>
>>> >>> >>> On Tue, Mar 17, 2009 at 12:06 PM, Gordan Bobic <gordan@xxxxxxxxxx>
>>> >>> > wrote:
>>> >>> >>>> Can you check if it works correctly with 2.0rc2 and/or 2.0rc1?
>>> >>> >>>>
>>> >>> >>>> On Tue, 17 Mar 2009 12:04:33 +0100, nicolas prochazka
>>> >>> >>>> <prochazka.nicolas@xxxxxxxxx> wrote:
>>> >>> >>>>> Oops,
>>> >>> >>>>> in fact the same problem occurs with a simple 8-byte text file; the
>>> >>> >>>>> file seems to be corrupt.
>>> >>> >>>>>
>>> >>> >>>>> Regards,
>>> >>> >>>>> Nicolas Prochazka
>>> >>> >>>>>
>>> >>> >>>>> On Tue, Mar 17, 2009 at 11:20 AM, Gordan Bobic
>>> >>> >>>>> <gordan@xxxxxxxxxx>
>>> >>> >>>>> wrote:
>>> >>> >>>>>> Are you sure this is rc4 specific? I've seen assorted weirdness
>>> >>> >>>>>> when adding and removing servers in all versions up to and
>>> >>> >>>>>> including rc2 (rc4 seems to lock up when starting udev on it, so
>>> >>> >>>>>> I'm not using it).
>>> >>> >>>>>>
>>> >>> >>>>>> On Tue, 17 Mar 2009 11:15:30 +0100, nicolas prochazka
>>> >>> >>>>>> <prochazka.nicolas@xxxxxxxxx> wrote:
>>> >>> >>>>>>> Hello guys,
>>> >>> >>>>>>>
>>> >>> >>>>>>> Strange problem:
>>> >>> >>>>>>> with rc4, AFR synchronisation does not seem to work:
>>> >>> >>>>>>> - If I copy a file onto the gluster mount, everything is OK on
>>> >>> >>>>>>> all servers.
>>> >>> >>>>>>> - If I add a new server to gluster, this server creates my files
>>> >>> >>>>>>> (10G size); each appears on XFS as a 10G file, but it does not
>>> >>> >>>>>>> contain the original data, just a few bytes,
>>> >>> >>>>>>> and then gluster does not synchronise, perhaps because the size
>>> >>> >>>>>>> is the same.
>>> >>> >>>>>>>
>>> >>> >>>>>>> regards,
>>> >>> >>>>>>> NP
>>> >>> >>>>>>>
>>> >>> >>>>>>>
>>> >>> >>>>>>> volume brickless
>>> >>> >>>>>>> type storage/posix
>>> >>> >>>>>>> option directory /mnt/disks/export
>>> >>> >>>>>>> end-volume
>>> >>> >>>>>>>
>>> >>> >>>>>>> volume brickthread
>>> >>> >>>>>>> type features/posix-locks
>>> >>> >>>>>>> option mandatory-locks on          # enables mandatory locking on all files
>>> >>> >>>>>>> subvolumes brickless
>>> >>> >>>>>>> end-volume
>>> >>> >>>>>>>
>>> >>> >>>>>>> volume brick
>>> >>> >>>>>>> type performance/io-threads
>>> >>> >>>>>>> option thread-count 4
>>> >>> >>>>>>> subvolumes brickthread
>>> >>> >>>>>>> end-volume
>>> >>> >>>>>>>
>>> >>> >>>>>>>
>>> >>> >>>>>>> volume server
>>> >>> >>>>>>> type protocol/server
>>> >>> >>>>>>> subvolumes brick
>>> >>> >>>>>>> option transport-type tcp
>>> >>> >>>>>>> option auth.addr.brick.allow 10.98.98.*
>>> >>> >>>>>>> end-volume
>>> >>> >>>>>>>
>>> >>> >>>>>>>
>>> >>> >>>>>>>
>>> >>> >>>>>>> -------------------------------------------
>>> >>> >>>>>>>
>>> >>> >>>>>>>
>>> >>> >>>>>>>
>>> >>> >>>>>>> volume brick_10.98.98.1
>>> >>> >>>>>>> type protocol/client
>>> >>> >>>>>>> option transport-type tcp/client
>>> >>> >>>>>>> option transport-timeout 120
>>> >>> >>>>>>> option remote-host 10.98.98.1
>>> >>> >>>>>>> option remote-subvolume brick
>>> >>> >>>>>>> end-volume
>>> >>> >>>>>>>
>>> >>> >>>>>>>
>>> >>> >>>>>>> volume brick_10.98.98.2
>>> >>> >>>>>>> type protocol/client
>>> >>> >>>>>>> option transport-type tcp/client
>>> >>> >>>>>>> option transport-timeout 120
>>> >>> >>>>>>> option remote-host 10.98.98.2
>>> >>> >>>>>>> option remote-subvolume brick
>>> >>> >>>>>>> end-volume
>>> >>> >>>>>>>
>>> >>> >>>>>>>
>>> >>> >>>>>>> volume last
>>> >>> >>>>>>> type cluster/replicate
>>> >>> >>>>>>> subvolumes brick_10.98.98.1 brick_10.98.98.2
>>> >>> >>>>>>> option read-subvolume brick_10.98.98.1
>>> >>> >>>>>>> option favorite-child brick_10.98.98.1
>>> >>> >>>>>>> end-volume
>>> >>> >>>>>>> volume iothreads
>>> >>> >>>>>>> type performance/io-threads
>>> >>> >>>>>>> option thread-count 4
>>> >>> >>>>>>> subvolumes last
>>> >>> >>>>>>> end-volume
>>> >>> >>>>>>>
>>> >>> >>>>>>> volume io-cache
>>> >>> >>>>>>> type performance/io-cache
>>> >>> >>>>>>> option cache-size 2048MB             # default is 32MB
>>> >>> >>>>>>> option page-size  128KB             #128KB is default option
>>> >>> >>>>>>> option cache-timeout 2  # default is 1
>>> >>> >>>>>>> subvolumes iothreads
>>> >>> >>>>>>> end-volume
>>> >>> >>>>>>>
>>> >>> >>>>>>> volume writebehind
>>> >>> >>>>>>> type performance/write-behind
>>> >>> >>>>>>> option aggregate-size 128KB # default is 0bytes
>>> >>> >>>>>>> option window-size 512KB
>>> >>> >>>>>>> option flush-behind off      # default is 'off'
>>> >>> >>>>>>> subvolumes io-cache
>>> >>> >>>>>>> end-volume
>>> >>> >>>>>>>
>>> >>> >>>>>>>
>>> >>> >>>>>>
>>> >>> >>>>>>
>>> >>> >>>>>>
>>> >>> >>>>
>>> >>> >>>>
>>> >>> >>>>
>>> >>> >>>
>>> >>> >
>>> >>> >
>>> >>> >
>>> >>>
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Amar Tumballi
>>> >>
>>> >>
>>> >
>>>
>>
>>
>>
>> --
>> Amar Tumballi
>>
>>
>



