Hi,
yeah, you're right, that was the wrong file size.
Here are the test results with a 2048MB file size. The RAID itself holds
1024MB of cache in RAM.
# tiotest -f 2048
Tiotest results for 4 concurrent io threads:
,----------------------------------------------------------------------.
| Item                  | Time     | Rate         | Usr CPU  | Sys CPU |
+-----------------------+----------+--------------+----------+---------+
| Write        8192 MBs |   58.6 s | 139.801 MB/s |  28.0 %  | 987.1 % |
| Random Write   16 MBs |    1.9 s |   8.435 MB/s |   0.9 %  |  29.6 % |
| Read         8192 MBs |   54.9 s | 149.176 MB/s |  16.6 %  | 171.7 % |
| Random Read    16 MBs |   10.4 s |   1.509 MB/s |   0.2 %  |   4.1 % |
`----------------------------------------------------------------------'
Tiotest latency results:
,-------------------------------------------------------------------------.
| Item         | Average latency | Maximum latency | % >2 sec | % >10 sec |
+--------------+-----------------+-----------------+----------+-----------+
| Write        |        0.108 ms |      980.220 ms |  0.00000 |   0.00000 |
| Random Write |        1.237 ms |      198.483 ms |  0.00000 |   0.00000 |
| Read         |        0.104 ms |      185.499 ms |  0.00000 |   0.00000 |
| Random Read  |       10.178 ms |      116.995 ms |  0.00000 |   0.00000 |
|--------------+-----------------+-----------------+----------+-----------|
| Total        |        0.117 ms |      980.220 ms |  0.00000 |   0.00000 |
`--------------+-----------------+-----------------+----------+-----------'
So the I/O is roughly the same on all nodes with the right file size, but
the question remains: why is the random read/write performance so bad?
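For what it's worth, the random-read rate and the average latency in the tables above are mutually consistent, assuming tiotest's default 4KB block size (an assumption, check your build): four threads each blocking ~10.178 ms per 4KB read cannot deliver much more than about 1.5 MB/s combined, so the low rate is fully explained by per-request latency. A quick back-of-the-envelope check:

```shell
# Back-of-the-envelope check (the 4KB block size is an assumption):
# aggregate rate = threads * block size / average latency
awk 'BEGIN {
    threads = 4
    block_mb = 4 / 1024           # 4KB expressed in MB
    avg_latency_s = 10.178 / 1000 # 10.178 ms from the latency table
    printf "%.3f MB/s\n", threads * block_mb / avg_latency_s
}'
```

That lands at roughly 1.5 MB/s, right where the measured 1.509 MB/s sits, i.e. every random read appears to pay a full seek.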
More info about the systems:
Each node has 2048MB RAM and dual Xeon CPUs.
The FC controllers are QLogic Corp. QLA2312.
The switch, also used for fencing, is a QLogic 5202.
The RAID itself is an easyRAID Q16+ with 16 disks, and it performs very
well under e.g. XFS.
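One thing worth double-checking against those numbers: the 2048MB test file only equals the per-node RAM and is just twice the RAID cache, so some reads may still be partially served from memory. A rough sizing sketch, using the figures above (the 2x factor is only a conservative rule of thumb, not anything tiotest mandates):

```shell
# Pick a tiotest file size that overruns both node RAM and the
# RAID controller cache, so reads must actually hit the disks.
RAM_MB=2048         # RAM per node (from above)
RAID_CACHE_MB=1024  # cache on the easyRAID Q16+ (from above)
TEST_MB=$(( (RAM_MB + RAID_CACHE_MB) * 2 ))
echo "tiotest -f $TEST_MB"
```

With these figures that suggests running tiotest -f 6144.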
Any further hints?
--
Frank Schliefer
Kovacs, Corey J. wrote:
Also, I think it might be interesting to see what happens when you use
data sizes that will overrun any caching being done. I've seen great
performance using a simple MSA1000 as long as there is a lot of cache
available on the SAN itself. As soon as I run tests with data sets larger
than the cache size, the performance falls to the floor. Unless you're
overloading the cache, you might not be getting a true metric of what's
really getting written to disk.
Maybe the slow node is getting hit by cache overhead from the SAN?
Just a thought
Corey
-----Original Message-----
From: linux-cluster-bounces@xxxxxxxxxx
[mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Patrick Caulfield
Sent: Thursday, February 09, 2006 9:18 AM
To: linux clustering
Subject: Re: Node lag
Frank Schliefer wrote:
Hi,
after setting up a four-node cluster we have one node that is way
slower than the other three.
We are using e.g. tiotest for benchmarking GFS.
Normal Node:
Tiotest results for 4 concurrent io threads:
,------------------------------------------------------------------------.
| Item                  | Time     | Rate          | Usr CPU  | Sys CPU  |
+-----------------------+----------+---------------+----------+----------+
| Write          40 MBs |    0.2 s |  227.426 MB/s |  36.4 %  |  384.4 % |
| Random Write   16 MBs |    0.1 s |  143.405 MB/s |  58.7 %  |  146.9 % |
| Read           40 MBs |    0.0 s | 2558.199 MB/s | 307.0 %  | 1228.0 % |
| Random Read    16 MBs |    0.0 s | 2685.169 MB/s | 550.0 %  | 1374.9 % |
`------------------------------------------------------------------------'
Slow Node:
Tiotest results for 4 concurrent io threads:
,------------------------------------------------------------------------.
| Item                  | Time     | Rate          | Usr CPU  | Sys CPU  |
+-----------------------+----------+---------------+----------+----------+
| Write          40 MBs |    1.4 s |   27.687 MB/s |   2.2 %  |  121.8 % |
| Random Write   16 MBs |    4.2 s |    3.695 MB/s |   0.0 %  |    7.9 % |
| Read           40 MBs |    0.0 s | 2228.288 MB/s |  89.1 %  | 1337.1 % |
| Random Read    16 MBs |    0.0 s | 2252.739 MB/s | 230.7 %  |  692.1 % |
`------------------------------------------------------------------------'
Any hints why this could happen?
Using kernel 2.6.15.2 (sorry, no RH).
It would be helpful if you could give us more information about your
installation: disk topology, lock manager in use (and which nodes are
lockservers if using GULM) and whether it matters which nodes are started
first or not.
--
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster