On Mon, Mar 4, 2019 at 4:26 PM Hu Bert <revirii@xxxxxxxxxxxxxx> wrote:
Hi Raghavendra,
at the moment iowait and cpu consumption is quite low, the main
problems appear during the weekend (high traffic, especially on
sunday), so either we have to wait until next sunday or use a time
machine ;-)
I made a screenshot of top (https://abload.de/img/top-hvvjt2.jpg) and
a text output (https://pastebin.com/TkTWnqxt), maybe that helps. Seems
like processes like glfs_fuseproc (>204h) and glfs_epoll (64h for each
process) consume a lot of CPU (uptime 24 days). Is that already
helpful?
Not much. The TIME field just says the amount of time the thread has been executing. Since its a long standing mount, we can expect such large values. But, the value itself doesn't indicate whether the thread itself was overloaded at any (some) interval(s).
Can you please collect output of following command and send back the collected data?
# top -bHd 3 > top.output
Hubert
Am Mo., 4. März 2019 um 11:31 Uhr schrieb Raghavendra Gowdappa
<rgowdapp@xxxxxxxxxx>:
>
> what is the per thread CPU usage like on these clients? With highly concurrent workloads we've seen single thread that reads requests from /dev/fuse (fuse reader thread) becoming bottleneck. Would like to know what is the cpu usage of this thread looks like (you can use top -H).
>
> On Mon, Mar 4, 2019 at 3:39 PM Hu Bert <revirii@xxxxxxxxxxxxxx> wrote:
>>
>> Good morning,
>>
>> we use gluster v5.3 (replicate with 3 servers, 2 volumes, raid10 as
>> brick) with at the moment 10 clients; 3 of them do heavy I/O
>> operations (apache tomcats, read+write of (small) images). These 3
>> clients have a quite high I/O wait (stats from yesterday) as can be
>> seen here:
>>
>> client: https://abload.de/img/client1-cpu-dayulkza.png
>> server: https://abload.de/img/server1-cpu-dayayjdq.png
>>
>> The iowait in the graphics differ a lot. I checked netstat for the
>> different clients; the other clients have 8 open connections:
>> https://pastebin.com/bSN5fXwc
>>
>> 4 for each server and each volume. The 3 clients with the heavy I/O
>> have (at the moment) according to netstat 170, 139 and 153
>> connections. An example for one client can be found here:
>> https://pastebin.com/2zfWXASZ
>>
>> gluster volume info: https://pastebin.com/13LXPhmd
>> gluster volume status: https://pastebin.com/cYFnWjUJ
>>
>> I just was wondering if the iowait is based on the clients and their
>> workflow: requesting a lot of files (up to hundreds per second),
>> opening a lot of connections and the servers aren't able to answer
>> properly. Maybe something can be tuned here?
>>
>> Especially the server|client.event-threads (both set to 4) and
>> performance.(high|normal|low|least)-prio-threads (all at default value
>> 16) and performance.io-thread-count (32) options, maybe these aren't
>> properly configured for up to 170 client connections.
>>
>> Both servers and clients have a Xeon CPU (6 cores, 12 threads), a 10
>> GBit connection and 128G (servers) respectively 256G (clients) RAM.
>> Enough power :-)
>>
>>
>> Thx for reading && best regards,
>>
>> Hubert
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users@xxxxxxxxxxx
>> https://lists.gluster.org/mailman/listinfo/gluster-users
_______________________________________________ Gluster-users mailing list Gluster-users@xxxxxxxxxxx https://lists.gluster.org/mailman/listinfo/gluster-users