Re: Fwd: Replica brick not working

1. No, at the moment the old server's (storage2) volume is mounted on some other servers, so all files are created there. If I check the new brick, there are no files.


2. On storage2 server (old brick)
getfattr: Removing leading '/' from absolute path names
# file: data/data-cluster
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x0226135726f346bcb3f8cb73365ed382

On storage server (new brick)
getfattr: Removing leading '/' from absolute path names
# file: data/data-cluster
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x0226135726f346bcb3f8cb73365ed382


3.
Thread 8 (Thread 0x7fad832dd700 (LWP 30057)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x00007fad88834f3e in __afr_shd_healer_wait () from /usr/lib/x86_64-linux-gnu/glusterfs/3.7.6/xlator/cluster/replicate.so
#2  0x00007fad88834fad in afr_shd_healer_wait () from /usr/lib/x86_64-linux-gnu/glusterfs/3.7.6/xlator/cluster/replicate.so
#3  0x00007fad88835aa0 in afr_shd_index_healer () from /usr/lib/x86_64-linux-gnu/glusterfs/3.7.6/xlator/cluster/replicate.so
#4  0x00007fad8df4270a in start_thread (arg=0x7fad832dd700) at pthread_create.c:333
#5  0x00007fad8dc7882d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 7 (Thread 0x7fad83ade700 (LWP 30056)):
#0  0x00007fad8dc78e23 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84
#1  0x00007fad8e808a58 in ?? () from /usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2  0x00007fad8df4270a in start_thread (arg=0x7fad83ade700) at pthread_create.c:333
#3  0x00007fad8dc7882d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 6 (Thread 0x7fad894a5700 (LWP 30055)):
#0  0x00007fad8dc78e23 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84
#1  0x00007fad8e808a58 in ?? () from /usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2  0x00007fad8df4270a in start_thread (arg=0x7fad894a5700) at pthread_create.c:333
#3  0x00007fad8dc7882d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 5 (Thread 0x7fad8a342700 (LWP 30054)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x00007fad8e7ecd98 in syncenv_task () from /usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2  0x00007fad8e7ed970 in syncenv_processor () from /usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#3  0x00007fad8df4270a in start_thread (arg=0x7fad8a342700) at pthread_create.c:333
#4  0x00007fad8dc7882d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 4 (Thread 0x7fad8ab43700 (LWP 30053)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x00007fad8e7ecd98 in syncenv_task () from /usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2  0x00007fad8e7ed970 in syncenv_processor () from /usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#3  0x00007fad8df4270a in start_thread (arg=0x7fad8ab43700) at pthread_create.c:333
#4  0x00007fad8dc7882d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 3 (Thread 0x7fad8b344700 (LWP 30052)):
#0  do_sigwait (sig=0x7fad8b343e3c, set=<optimized out>) at ../sysdeps/unix/sysv/linux/sigwait.c:64
#1  __sigwait (set=<optimized out>, sig=0x7fad8b343e3c) at ../sysdeps/unix/sysv/linux/sigwait.c:96
#2  0x00000000004080bf in glusterfs_sigwaiter ()
#3  0x00007fad8df4270a in start_thread (arg=0x7fad8b344700) at pthread_create.c:333
#4  0x00007fad8dc7882d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 2 (Thread 0x7fad8bb45700 (LWP 30051)):
#0  0x00007fad8df4bc6d in nanosleep () at ../sysdeps/unix/syscall-template.S:84
#1  0x00007fad8e7ca744 in gf_timer_proc () from /usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2  0x00007fad8df4270a in start_thread (arg=0x7fad8bb45700) at pthread_create.c:333
#3  0x00007fad8dc7882d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 1 (Thread 0x7fad8ec66780 (LWP 30050)):
#0  0x00007fad8df439dd in pthread_join (threadid=140383309420288, thread_return=0x0) at pthread_join.c:90
#1  0x00007fad8e808eeb in ?? () from /usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2  0x0000000000405501 in main ()


- Kindest regards,

Milos Cuculovic
IT Manager

---
MDPI AG
Postfach, CH-4020 Basel, Switzerland
Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
Tel. +41 61 683 77 35
Fax +41 61 302 89 18
Email: cuculovic@xxxxxxxx
Skype: milos.cuculovic.mdpi

On 08.12.2016 16:17, Ravishankar N wrote:
On 12/08/2016 06:53 PM, Atin Mukherjee wrote:


On Thu, Dec 8, 2016 at 6:44 PM, Miloš Čučulović - MDPI
<cuculovic@xxxxxxxx> wrote:

    Ah, damn! I found the issue. On the storage server, the storage2
    IP address was wrong; I had swapped two digits in the /etc/hosts
    file, sorry for that :(

    I was able to add the brick now and started the heal, but still no
    data transfer is visible.

1. Are the files getting created on the new brick though?
2. Can you provide the output of `getfattr -d -m . -e hex
/data/data-cluster` on both bricks?
3. Is it possible to attach gdb to the self-heal daemon on the original
(old) brick and get a backtrace?
    `gdb -p <pid of the self-heal daemon on the original brick>`
     thread apply all bt  --> share this output
    quit gdb.


-Ravi

@Ravi/Pranith - can you help here?



    By doing gluster volume status, I have

    Status of volume: storage
    Gluster process                       TCP Port  RDMA Port  Online  Pid
    ------------------------------------------------------------------------------
    Brick storage2:/data/data-cluster     49152     0          Y       23101
    Brick storage:/data/data-cluster      49152     0          Y       30773
    Self-heal Daemon on localhost         N/A       N/A        Y       30050
    Self-heal Daemon on storage           N/A       N/A        Y       30792


    Any idea?

    On storage I have:
    Number of Peers: 1

    Hostname: 195.65.194.217
    Uuid: 7c988af2-9f76-4843-8e6f-d94866d57bb0
    State: Peer in Cluster (Connected)


    - Kindest regards,

    Milos Cuculovic
    IT Manager

    ---
    MDPI AG
    Postfach, CH-4020 Basel, Switzerland
    Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
    Tel. +41 61 683 77 35
    Fax +41 61 302 89 18
    Email: cuculovic@xxxxxxxx
    Skype: milos.cuculovic.mdpi

    On 08.12.2016 13:55, Atin Mukherjee wrote:

        Can you resend the attachment as a zip? I am unable to extract the
        content. We shouldn't have a 0-byte info file. What does gluster peer
        status output say?

        On Thu, Dec 8, 2016 at 4:51 PM, Miloš Čučulović - MDPI
        <cuculovic@xxxxxxxx> wrote:

            I hope you received my last email Atin, thank you!

            - Kindest regards,

            Milos Cuculovic
            IT Manager

            ---
            MDPI AG
            Postfach, CH-4020 Basel, Switzerland
            Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
            Tel. +41 61 683 77 35
            Fax +41 61 302 89 18
            Email: cuculovic@xxxxxxxx
            Skype: milos.cuculovic.mdpi

            On 08.12.2016 10:28, Atin Mukherjee wrote:


                ---------- Forwarded message ----------
                From: *Atin Mukherjee* <amukherj@xxxxxxxxxx>
                Date: Thu, Dec 8, 2016 at 11:56 AM
                Subject: Re:  Replica brick not working
                To: Ravishankar N <ravishankar@xxxxxxxxxx>
                Cc: Miloš Čučulović - MDPI <cuculovic@xxxxxxxx>,
                Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>,
                gluster-users <gluster-users@xxxxxxxxxxx>




                On Thu, Dec 8, 2016 at 11:11 AM, Ravishankar N
                <ravishankar@xxxxxxxxxx> wrote:

                    On 12/08/2016 10:43 AM, Atin Mukherjee wrote:

                        From the log snippet:

                        [2016-12-07 09:15:35.677645] I [MSGID: 106482] [glusterd-brick-ops.c:442:__glusterd_handle_add_brick] 0-management: Received add brick req
                        [2016-12-07 09:15:35.677708] I [MSGID: 106062] [glusterd-brick-ops.c:494:__glusterd_handle_add_brick] 0-management: replica-count is 2
                        [2016-12-07 09:15:35.677735] E [MSGID: 106291] [glusterd-brick-ops.c:614:__glusterd_handle_add_brick] 0-management:

                        The last log entry indicates that we hit the code path in
                        gd_addbr_validate_replica_count ():

                                if (replica_count == volinfo->replica_count) {
                                        if (!(total_bricks % volinfo->dist_leaf_count)) {
                                                ret = 1;
                                                goto out;
                                        }
                                }


                    It seems unlikely that this snippet was hit, because we
                    print the E [MSGID: 106291] message above only if ret == -1.
                    gd_addbr_validate_replica_count() returns -1 without
                    populating err_str only when volinfo->type doesn't match
                    any of the known volume types, so perhaps volinfo->type is
                    corrupted?
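
                    To make the above concrete, here is a minimal,
                    self-contained sketch (hypothetical stand-in names such as
                    validate_replica_count, volinfo_t and TYPE_*; simplified
                    checks, not the actual 3.7.6 source) of how a volinfo->type
                    that matches no known volume type can return -1 while
                    leaving err_str empty, which is consistent with the bare
                    "0-management:" error line in the log:

                    #include <stdio.h>

                    /* Illustration-only stand-ins for the glusterd types. */
                    typedef enum { TYPE_NONE, TYPE_REPLICATE, TYPE_STRIPE_REPLICATE } vol_type_t;
                    typedef struct { vol_type_t type; int replica_count; int dist_leaf_count; } volinfo_t;

                    static int
                    validate_replica_count (int replica_count, int total_bricks,
                                            const volinfo_t *volinfo,
                                            char *err_str, size_t err_len)
                    {
                            int ret = -1;

                            (void) err_str;  /* only written by the elided error paths */
                            (void) err_len;

                            /* the snippet quoted above: replica count unchanged and
                             * bricks still divide evenly -> success, nothing to report */
                            if (replica_count == volinfo->replica_count) {
                                    if (!(total_bricks % volinfo->dist_leaf_count)) {
                                            ret = 1;
                                            goto out;
                                    }
                            }

                            /* every recognised volume type either succeeds or writes a
                             * message into err_str before failing */
                            switch (volinfo->type) {
                            case TYPE_NONE:
                            case TYPE_REPLICATE:
                            case TYPE_STRIPE_REPLICATE:
                                    ret = 0;  /* real per-type checks elided */
                                    break;
                            /* no default: an unknown/corrupted type leaves ret == -1
                             * and err_str untouched, i.e. an empty error message */
                            }
                    out:
                            return ret;
                    }

                    int
                    main (void)
                    {
                            char err[128] = "";
                            volinfo_t bogus = { (vol_type_t) 42, 1, 1 };

                            int ret = validate_replica_count (2, 2, &bogus, err, sizeof (err));
                            printf ("ret=%d err_str='%s'\n", ret, err);  /* ret=-1 err_str='' */
                            return 0;
                    }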


                You are right, I missed that ret is set to 1 here in the
                above snippet.

                @Milos - Can you please provide us the volume info file from
                /var/lib/glusterd/vols/<volname>/ from all three nodes so we
                can continue the analysis?



                    -Ravi

                        @Pranith, Ravi - Milos was trying to convert a dist
                        (1 X 1) volume to a replicate (1 X 2) using add-brick
                        and hit this issue where add-brick failed. The cluster
                        is operating with 3.7.6. Could you help on what
                        scenario this code path can be hit? One straightforward
                        issue I see here is the missing err_str in this path.






                --

                ~ Atin (atinm)







        --

        ~ Atin (atinm)




--

~ Atin (atinm)


_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users



