Re: Gluster 3.12.12: performance during heal and in general

On Thu, Aug 23, 2018 at 5:28 PM, Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx> wrote:
On Wed, Aug 22, 2018 at 12:01 PM Hu Bert <revirii@xxxxxxxxxxxxxx> wrote:
Just an addition: in general there are no log messages in
/var/log/glusterfs/ (if you don't call 'gluster volume ...'), but on the
node with the lowest load I see in cli.log.1:

[2018-08-22 06:20:43.291055] I [socket.c:2474:socket_event_handler]
0-transport: EPOLLERR - disconnecting now
[2018-08-22 06:20:46.291327] I [socket.c:2474:socket_event_handler]
0-transport: EPOLLERR - disconnecting now
[2018-08-22 06:20:49.291575] I [socket.c:2474:socket_event_handler]
0-transport: EPOLLERR - disconnecting now

every 3 seconds. Looks like this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1484885 - but that should
have been fixed in the 3.12.x release, and the network is fine.

That's odd. Presuming cli.log.1 is the logrotated file, it should be showing
older log entries than cli.log, but that's not the case here.
Or maybe there's something running on the command line on the node with the
lowest load.
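If you want to rule that out, a quick check on that node could be something
like this (just a sketch, assuming any such poller would come from cron or a
monitoring agent):

  grep -r gluster /etc/cron* /var/spool/cron 2>/dev/null
  ps -ef | grep '[g]luster volume'

Anything calling the gluster CLI every few seconds would produce exactly this
kind of churn in cli.log.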



In cli.log there are only these entries:

[2018-08-22 06:19:23.428520] I [cli.c:765:main] 0-cli: Started running
gluster with version 3.12.12
[2018-08-22 06:19:23.800895] I [MSGID: 101190]
[event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started
thread with index 1
[2018-08-22 06:19:23.800978] I [socket.c:2474:socket_event_handler]
0-transport: EPOLLERR - disconnecting now
[2018-08-22 06:19:23.809366] I [input.c:31:cli_batch] 0-: Exiting with: 0

Just wondered if this could be related anyhow.

2018-08-21 8:17 GMT+02:00 Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>:
>
>
> On Tue, Aug 21, 2018 at 11:40 AM Hu Bert <revirii@xxxxxxxxxxxxxx> wrote:
>>
>> Good morning :-)
>>
>> gluster11:
>> ls -l /gluster/bricksdd1/shared/.glusterfs/indices/xattrop/
>> total 0
>> ---------- 1 root root 0 Aug 14 06:14
>> xattrop-006b65d8-9e81-4886-b380-89168ea079bd
>>
>> gluster12:
>> ls -l /gluster/bricksdd1_new/shared/.glusterfs/indices/xattrop/
>> total 0
>> ---------- 1 root root 0 Jul 17 11:24
>> xattrop-c7c6f765-ce17-4361-95fb-2fd7f31c7b82
>>
>> gluster13:
>> ls -l /gluster/bricksdd1_new/shared/.glusterfs/indices/xattrop/
>> total 0
>> ---------- 1 root root 0 Aug 16 07:54
>> xattrop-16b696a0-4214-4999-b277-0917c76c983e
>>
>>
>> And here's the output of 'perf ...', which ran for almost a minute - the
>> file grew pretty fast to a size of 17 GB and the system load went up
>> heavily. Had to wait a while until the load dropped :-)
>>
>> fyi - load at the moment:
>> load gluster11: ~90
>> load gluster12: ~10
>> load gluster13: ~50
>>
>> perf record --call-graph=dwarf -p 7897 -o
>> /tmp/perf.gluster11.bricksdd1.out
>> [ perf record: Woken up 9837 times to write data ]
>> Warning:
>> Processed 2137218 events and lost 33446 chunks!
>>
>> Check IO/CPU overload!
>>
>> [ perf record: Captured and wrote 16576.374 MB
>> /tmp/perf.gluster11.bricksdd1.out (2047760 samples) ]
>>
>> Here's an excerpt.
>>
>> +    1.93%     0.00%  glusteriotwr0    [unknown]              [k]
>> 0xffffffffffffffff
>> +    1.89%     0.00%  glusteriotwr28   [unknown]              [k]
>> 0xffffffffffffffff
>> +    1.86%     0.00%  glusteriotwr15   [unknown]              [k]
>> 0xffffffffffffffff
>> +    1.85%     0.00%  glusteriotwr63   [unknown]              [k]
>> 0xffffffffffffffff
>> +    1.83%     0.01%  glusteriotwr0    [kernel.kallsyms]      [k]
>> entry_SYSCALL_64_after_swapgs
>> +    1.82%     0.00%  glusteriotwr38   [unknown]              [k]
>> 0xffffffffffffffff
>> +    1.82%     0.01%  glusteriotwr28   [kernel.kallsyms]      [k]
>> entry_SYSCALL_64_after_swapgs
>> +    1.82%     0.00%  glusteriotwr0    [kernel.kallsyms]      [k]
>> do_syscall_64
>> +    1.81%     0.00%  glusteriotwr28   [kernel.kallsyms]      [k]
>> do_syscall_64
>> +    1.81%     0.00%  glusteriotwr15   [kernel.kallsyms]      [k]
>> entry_SYSCALL_64_after_swapgs
>> +    1.81%     0.00%  glusteriotwr36   [unknown]              [k]
>> 0xffffffffffffffff
>> +    1.80%     0.00%  glusteriotwr15   [kernel.kallsyms]      [k]
>> do_syscall_64
>> +    1.78%     0.01%  glusteriotwr63   [kernel.kallsyms]      [k]
>> entry_SYSCALL_64_after_swapgs
>> +    1.77%     0.00%  glusteriotwr63   [kernel.kallsyms]      [k]
>> do_syscall_64
>> +    1.75%     0.01%  glusteriotwr38   [kernel.kallsyms]      [k]
>> entry_SYSCALL_64_after_swapgs
>> +    1.75%     0.00%  glusteriotwr38   [kernel.kallsyms]      [k]
>> do_syscall_64
>> +    1.74%     0.00%  glusteriotwr17   [unknown]              [k]
>> 0xffffffffffffffff
>> +    1.74%     0.00%  glusteriotwr44   [unknown]              [k]
>> 0xffffffffffffffff
>> +    1.73%     0.00%  glusteriotwr6    [unknown]              [k]
>> 0xffffffffffffffff
>> +    1.73%     0.00%  glusteriotwr37   [unknown]              [k]
>> 0xffffffffffffffff
>> +    1.73%     0.01%  glusteriotwr36   [kernel.kallsyms]      [k]
>> entry_SYSCALL_64_after_swapgs
>> +    1.72%     0.00%  glusteriotwr34   [unknown]              [k]
>> 0xffffffffffffffff
>> +    1.72%     0.00%  glusteriotwr36   [kernel.kallsyms]      [k]
>> do_syscall_64
>> +    1.71%     0.00%  glusteriotwr45   [unknown]              [k]
>> 0xffffffffffffffff
>> +    1.70%     0.00%  glusteriotwr7    [unknown]              [k]
>> 0xffffffffffffffff
>> +    1.68%     0.00%  glusteriotwr15   [kernel.kallsyms]      [k]
>> sys_getdents
>> +    1.68%     0.00%  glusteriotwr15   [kernel.kallsyms]      [k] filldir
>> +    1.68%     0.00%  glusteriotwr15   libc-2.24.so           [.]
>> 0xffff80c60db8ef2b
>> +    1.68%     0.00%  glusteriotwr15   libc-2.24.so           [.]
>> readdir64
>> +    1.68%     0.00%  glusteriotwr15   index.so               [.]
>> 0xffff80c6192a1888
>> +    1.68%     0.00%  glusteriotwr15   [kernel.kallsyms]      [k]
>> iterate_dir
>> +    1.68%     0.00%  glusteriotwr15   [kernel.kallsyms]      [k]
>> ext4_htree_fill_tree
>> +    1.68%     0.00%  glusteriotwr15   [kernel.kallsyms]      [k]
>> ext4_readdir
>>
>> Or do you want to download the file /tmp/perf.gluster11.bricksdd1.out
>> and examine it yourself? If so I could send you a link.
>
>
> Thank you! Yes, a link would be great. I am not as good with the kernel side
> of things, so I will have to show this information to someone else who knows
> these things; expect a delay in response.
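>
> (Since the file is ~17 GB, compressing it before uploading should help a lot -
> e.g. something like: xz -T0 /tmp/perf.gluster11.bricksdd1.out )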
>
>>
>>
>>
>> 2018-08-21 7:13 GMT+02:00 Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>:
>> >
>> >
>> > On Tue, Aug 21, 2018 at 10:13 AM Pranith Kumar Karampuri
>> > <pkarampu@xxxxxxxxxx> wrote:
>> >>
>> >>
>> >>
>> >> On Mon, Aug 20, 2018 at 3:20 PM Hu Bert <revirii@xxxxxxxxxxxxxx> wrote:
>> >>>
>> >>> Regarding hardware the machines are identical. Intel Xeon E5-1650 v3
>> >>> Hexa-Core; 64 GB DDR4 ECC; Dell PERC H330 8 Port SAS/SATA 12 GBit/s
>> >>> RAID Controller; operating system running on a raid1, then 4 disks
>> >>> (JBOD) as bricks.
>> >>>
>> >>> OK, I ran perf for a few seconds.
>> >>>
>> >>> ------------------------
>> >>> perf record --call-graph=dwarf -p 7897 -o
>> >>> /tmp/perf.gluster11.bricksdd1.out
>> >>> ^C[ perf record: Woken up 378 times to write data ]
>> >>> Warning:
>> >>> Processed 83690 events and lost 96 chunks!
>> >>>
>> >>> Check IO/CPU overload!
>> >>>
>> >>> [ perf record: Captured and wrote 423.087 MB
>> >>> /tmp/perf.gluster11.bricksdd1.out (51744 samples) ]
>> >>> ------------------------
>> >>>
>> >>> I copied a couple of lines:
>> >>>
>> >>> +    8.10%     0.00%  glusteriotwr22   [unknown]              [k]
>> >>> 0xffffffffffffffff
>> >>> +    8.10%     0.00%  glusteriotwr22   [kernel.kallsyms]      [k]
>> >>> iterate_dir
>> >>> +    8.10%     0.00%  glusteriotwr22   [kernel.kallsyms]      [k]
>> >>> sys_getdents
>> >>> +    8.10%     0.00%  glusteriotwr22   [kernel.kallsyms]      [k]
>> >>> filldir
>> >>> +    8.10%     0.00%  glusteriotwr22   [kernel.kallsyms]      [k]
>> >>> do_syscall_64
>> >>> +    8.10%     0.00%  glusteriotwr22   [kernel.kallsyms]      [k]
>> >>> entry_SYSCALL_64_after_swapgs
>> >>> +    8.10%     0.00%  glusteriotwr22   libc-2.24.so           [.]
>> >>> 0xffff80c60db8ef2b
>> >>> +    8.10%     0.00%  glusteriotwr22   libc-2.24.so           [.]
>> >>> readdir64
>> >>> +    8.10%     0.00%  glusteriotwr22   index.so               [.]
>> >>> 0xffff80c6192a1888
>> >>> +    8.10%     0.04%  glusteriotwr22   [kernel.kallsyms]      [k]
>> >>> ext4_htree_fill_tree
>> >>> +    8.10%     0.00%  glusteriotwr22   [kernel.kallsyms]      [k]
>> >>> ext4_readdir
>> >>> +    7.95%     0.12%  glusteriotwr22   [kernel.kallsyms]      [k]
>> >>> htree_dirblock_to_tree
>> >>> +    5.78%     0.96%  glusteriotwr22   [kernel.kallsyms]      [k]
>> >>> __ext4_read_dirblock
>> >>> +    4.80%     0.02%  glusteriotwr22   [kernel.kallsyms]      [k]
>> >>> ext4_bread
>> >>> +    4.78%     0.04%  glusteriotwr22   [kernel.kallsyms]      [k]
>> >>> ext4_getblk
>> >>> +    4.72%     0.02%  glusteriotwr22   [kernel.kallsyms]      [k]
>> >>> __getblk_gfp
>> >>> +    4.57%     0.00%  glusteriotwr3    [unknown]              [k]
>> >>> 0xffffffffffffffff
>> >>> +    4.55%     0.00%  glusteriotwr3    [kernel.kallsyms]      [k]
>> >>> do_syscall_64
>> >>>
>> >>> Do you need different or additional information?
>> >>
>> >>
>> >> This looks like there are a lot of readdirs going on, which is different
>> >> from what we observed earlier. How many seconds did you run perf record
>> >> for? Would it be possible for you to do this for some more time, maybe a
>> >> minute? Just want to be sure that the data actually represents what we
>> >> are observing.
>> >
>> >
>> > I found one code path which on lookup does readdirs. Could you give me the
>> > output of ls -l <brick-path>/.glusterfs/indices/xattrop on all the three
>> > bricks? It can probably give a correlation to see if it is indeed the same
>> > issue or not.
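>> >
>> > For example, something along these lines run from any one of the nodes
>> > should do (just a sketch, assuming passwordless ssh between the gluster
>> > nodes; adjust the brick paths as needed):
>> >
>> > for h in gluster11 gluster12 gluster13; do
>> >   echo "### $h"
>> >   ssh $h 'ls -l /gluster/bricksdd1*/shared/.glusterfs/indices/xattrop/'
>> > done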
>> >
>> >>
>> >>
>> >>>
>> >>>
>> >>>
>> >>> 2018-08-20 11:20 GMT+02:00 Pranith Kumar Karampuri
>> >>> <pkarampu@xxxxxxxxxx>:
>> >>> > Even the brick which doesn't have high CPU seems to have the same number
>> >>> > of lookups, so that's not it.
>> >>> > Is there any difference at all between the machines which have high CPU
>> >>> > vs low CPU?
>> >>> > I think the only other thing I would do is to install the perf tools and
>> >>> > try to figure out the call graph which is leading to so much CPU usage.
>> >>> >
>> >>> > This affects the performance of the brick, I think, so you may have to do
>> >>> > it quickly and for less time.
>> >>> >
>> >>> > perf record --call-graph=dwarf -p   <brick-pid> -o </path/to/output>
>> >>> > then
>> >>> > perf report -i </path/to/output/given/in/the/previous/command>
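>> >>> >
>> >>> > (If you'd rather cap how long it runs instead of hitting Ctrl-C, something
>> >>> > like this should also work, e.g. for a ~30 second sample:
>> >>> >
>> >>> > perf record --call-graph=dwarf -p <brick-pid> -o </path/to/output> -- sleep 30
>> >>> >
>> >>> > perf stops recording once the sleep exits.)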
>> >>> >
>> >>> >
>> >>> > On Mon, Aug 20, 2018 at 2:40 PM Hu Bert <revirii@xxxxxxxxxxxxxx>
>> >>> > wrote:
>> >>> >>
>> >>> >> gluster volume heal shared info | grep -i number
>> >>> >> Number of entries: 0
>> >>> >> Number of entries: 0
>> >>> >> Number of entries: 0
>> >>> >> Number of entries: 0
>> >>> >> Number of entries: 0
>> >>> >> Number of entries: 0
>> >>> >> Number of entries: 0
>> >>> >> Number of entries: 0
>> >>> >> Number of entries: 0
>> >>> >> Number of entries: 0
>> >>> >> Number of entries: 0
>> >>> >> Number of entries: 0
>> >>> >>
>> >>> >> Looks good to me.
>> >>> >>
>> >>> >>
>> >>> >> 2018-08-20 10:51 GMT+02:00 Pranith Kumar Karampuri
>> >>> >> <pkarampu@xxxxxxxxxx>:
>> >>> >> > There are a lot of Lookup operations in the system, but I am not able
>> >>> >> > to find out why. Could you check the output of
>> >>> >> >
>> >>> >> > # gluster volume heal <volname> info | grep -i number
>> >>> >> >
>> >>> >> > it should print all zeros.
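>> >>> >> >
>> >>> >> > (If it's easier to check at a glance, summing them should give 0 as well,
>> >>> >> > e.g. something like:
>> >>> >> > gluster volume heal <volname> info | awk -F: '/Number of entries/ {s+=$2} END {print s}' )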
>> >>> >> >
>> >>> >> > On Fri, Aug 17, 2018 at 1:49 PM Hu Bert <revirii@xxxxxxxxxxxxxx>
>> >>> >> > wrote:
>> >>> >> >>
>> >>> >> >> I don't know what exactly you mean by workload, but the main function
>> >>> >> >> of the volume is storing (incl. writing and reading) images (from
>> >>> >> >> hundreds of bytes up to 30 MB, overall ~7 TB). The work is done by
>> >>> >> >> Apache Tomcat servers writing to / reading from the volume. Besides
>> >>> >> >> images there are some text files and binaries that are stored on the
>> >>> >> >> volume and get updated regularly (every x hours); we'll try to migrate
>> >>> >> >> the latter to local storage asap.
>> >>> >> >>
>> >>> >> >> Interestingly, it's only one process (and its threads) of the same
>> >>> >> >> brick on 2 of the gluster servers that consumes the CPU.
>> >>> >> >>
>> >>> >> >> gluster11: bricksdd1; not healed; full CPU
>> >>> >> >> gluster12: bricksdd1; got healed; normal CPU
>> >>> >> >> gluster13: bricksdd1; got healed; full CPU
>> >>> >> >>
>> >>> >> >> Besides: performance during heal (e.g. gluster12, bricksdd1) was way
>> >>> >> >> better than it is now. I've attached 2 PNGs showing the differing CPU
>> >>> >> >> usage of last week, before/after the heal.
>> >>> >> >>
>> >>> >> >> 2018-08-17 9:30 GMT+02:00 Pranith Kumar Karampuri
>> >>> >> >> <pkarampu@xxxxxxxxxx>:
>> >>> >> >> > There seem to be too many lookup operations compared to any other
>> >>> >> >> > operations. What is the workload on the volume?
>> >>> >> >> >
>> >>> >> >> > On Fri, Aug 17, 2018 at 12:47 PM Hu Bert
>> >>> >> >> > <revirii@xxxxxxxxxxxxxx>
>> >>> >> >> > wrote:
>> >>> >> >> >>
>> >>> >> >> >> I hope I got it right.
>> >>> >> >> >>
>> >>> >> >> >> gluster volume profile shared start
>> >>> >> >> >> wait 10 minutes
>> >>> >> >> >> gluster volume profile shared info
>> >>> >> >> >> gluster volume profile shared stop
>> >>> >> >> >>
>> >>> >> >> >> If that's OK, I've attached the output of the info command.
>> >>> >> >> >>
>> >>> >> >> >>
>> >>> >> >> >> 2018-08-17 8:31 GMT+02:00 Pranith Kumar Karampuri
>> >>> >> >> >> <pkarampu@xxxxxxxxxx>:
>> >>> >> >> >> > Please also do a volume profile for around 10 minutes when CPU%
>> >>> >> >> >> > is high.
>> >>> >> >> >> >
>> >>> >> >> >> > On Fri, Aug 17, 2018 at 11:56 AM Pranith Kumar Karampuri
>> >>> >> >> >> > <pkarampu@xxxxxxxxxx> wrote:
>> >>> >> >> >> >>
>> >>> >> >> >> >> As per the output, all io-threads are using a lot of CPU. It is
>> >>> >> >> >> >> better to check the volume profile to see what is leading to so
>> >>> >> >> >> >> much work for the io-threads. Please follow the documentation at
>> >>> >> >> >> >> https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Monitoring%20Workload/
>> >>> >> >> >> >> section: "Running GlusterFS Volume Profile Command"
>> >>> >> >> >> >> and attach the output of "gluster volume profile <volname> info".
>> >>> >> >> >> >>
>> >>> >> >> >> >> On Fri, Aug 17, 2018 at 11:24 AM Hu Bert
>> >>> >> >> >> >> <revirii@xxxxxxxxxxxxxx>
>> >>> >> >> >> >> wrote:
>> >>> >> >> >> >>>
>> >>> >> >> >> >>> Good morning,
>> >>> >> >> >> >>>
>> >>> >> >> >> >>> I ran the command during 100% CPU usage and attached the
>> >>> >> >> >> >>> file.
>> >>> >> >> >> >>> Hopefully it helps.
>> >>> >> >> >> >>>
>> >>> >> >> >> >>> 2018-08-17 7:33 GMT+02:00 Pranith Kumar Karampuri
>> >>> >> >> >> >>> <pkarampu@xxxxxxxxxx>:
>> >>> >> >> >> >>> > Could you do the following on one of the nodes where you are
>> >>> >> >> >> >>> > observing high CPU usage and attach that file to this thread?
>> >>> >> >> >> >>> > We can find which threads/processes are leading to the high
>> >>> >> >> >> >>> > usage. Do this for, say, 10 minutes when you see the ~100% CPU.
>> >>> >> >> >> >>> >
>> >>> >> >> >> >>> > top -bHd 5 > /tmp/top.${HOSTNAME}.txt
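>> >>> >> >> >> >>> >
>> >>> >> >> >> >>> > (If useful, a quick per-thread snapshot of just the brick process
>> >>> >> >> >> >>> > can be taken with something like: ps -Lp <brick-pid> -o tid,pcpu,comm
>> >>> >> >> >> >>> > which maps the busy thread IDs back to their thread names.)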
>> >>> >> >> >> >>> >
>> >>> >> >> >> >>> > On Wed, Aug 15, 2018 at 2:37 PM Hu Bert
>> >>> >> >> >> >>> > <revirii@xxxxxxxxxxxxxx>
>> >>> >> >> >> >>> > wrote:
>> >>> >> >> >> >>> >>
>> >>> >> >> >> >>> >> Hello again :-)
>> >>> >> >> >> >>> >>
>> >>> >> >> >> >>> >> The self-heal must have finished, as there are no log entries
>> >>> >> >> >> >>> >> in the glustershd.log files anymore. According to munin, disk
>> >>> >> >> >> >>> >> latency (average io wait) has gone down to 100 ms, and disk
>> >>> >> >> >> >>> >> utilization has gone down to ~60% - both on all servers and
>> >>> >> >> >> >>> >> hard disks.
>> >>> >> >> >> >>> >>
>> >>> >> >> >> >>> >> But now the system load on the 2 servers (which were in the
>> >>> >> >> >> >>> >> good state) fluctuates between 60 and 100; the server with the
>> >>> >> >> >> >>> >> formerly failed disk has a load of 20-30. I've uploaded some
>> >>> >> >> >> >>> >> munin graphics of the CPU usage:
>> >>> >> >> >> >>> >>
>> >>> >> >> >> >>> >> https://abload.de/img/gluster11_cpu31d3a.png
>> >>> >> >> >> >>> >> https://abload.de/img/gluster12_cpu8sem7.png
>> >>> >> >> >> >>> >> https://abload.de/img/gluster13_cpud7eni.png
>> >>> >> >> >> >>> >>
>> >>> >> >> >> >>> >> This can't be normal: 2 of the servers are under heavy load
>> >>> >> >> >> >>> >> and one not that much. Does anyone have an explanation for
>> >>> >> >> >> >>> >> this strange behaviour?
>> >>> >> >> >> >>> >>
>> >>> >> >> >> >>> >>
>> >>> >> >> >> >>> >> Thx :-)
>> >>> >> >> >> >>> >>
>> >>> >> >> >> >>> >> 2018-08-14 9:37 GMT+02:00 Hu Bert
>> >>> >> >> >> >>> >> <revirii@xxxxxxxxxxxxxx>:
>> >>> >> >> >> >>> >> > Hi there,
>> >>> >> >> >> >>> >> >
>> >>> >> >> >> >>> >> > well, it seems the heal has finally finished. Couldn't
>> >>> >> >> >> >>> >> > see/find any related log message; is there such a message
>> >>> >> >> >> >>> >> > in a specific log file?
>> >>> >> >> >> >>> >> >
>> >>> >> >> >> >>> >> > But I see the same behaviour as when the last heal
>> >>> >> >> >> >>> >> > finished: all CPU cores are consumed by brick processes -
>> >>> >> >> >> >>> >> > not only by the formerly failed bricksdd1, but by all 4
>> >>> >> >> >> >>> >> > brick processes (and their threads). Load goes up to > 100
>> >>> >> >> >> >>> >> > on the 2 servers with the not-failed brick, and
>> >>> >> >> >> >>> >> > glustershd.log gets filled with a lot of entries. Load on
>> >>> >> >> >> >>> >> > the server with the then-failed brick is not that high,
>> >>> >> >> >> >>> >> > but still ~60.
>> >>> >> >> >> >>> >> >
>> >>> >> >> >> >>> >> > Is this behaviour normal? Is there some post-heal after a
>> >>> >> >> >> >>> >> > heal has finished?
>> >>> >> >> >> >>> >> >
>> >>> >> >> >> >>> >> > thx in advance :-)
>> >>> >> >> >> >>> >
>> >>> >> >> >> >>> >
>> >>> >> >> >> >>> >
>> >>> >> >> >> >>> > --
>> >>> >> >> >> >>> > Pranith
>> >>> >> >> >> >>
>> >>> >> >> >> >>
>> >>> >> >> >> >>
>> >>> >> >> >> >> --
>> >>> >> >> >> >> Pranith
>> >>> >> >> >> >
>> >>> >> >> >> >
>> >>> >> >> >> >
>> >>> >> >> >> > --
>> >>> >> >> >> > Pranith
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > --
>> >>> >> >> > Pranith
>> >>> >> >
>> >>> >> >
>> >>> >> >
>> >>> >> > --
>> >>> >> > Pranith
>> >>> >
>> >>> >
>> >>> >
>> >>> > --
>> >>> > Pranith
>> >>
>> >>
>> >>
>> >> --
>> >> Pranith
>> >
>> >
>> >
>> > --
>> > Pranith
>
>
>
> --
> Pranith


--
Pranith



--
Milind

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users
