How do I Improve Large Sequential Read Performance to a SCSI Block Device?

How do I improve read performance for large sequential IO to a SCSI block device on Linux? In most (if not all) "out of the box" Linux distros, write performance far exceeds read performance for large sequential IO to a block device. However, read and write performance are about equal when using the character device (sg). With the character device, each command is larger and more commands are outstanding on the SCSI device at any one time.

What tuning parameters or patches would improve sequential read performance? Should I be using a different IO elevator, or none at all? Is my block device doing direct IO? How would I know?

I have not been able to find a good solution in any searches.

Here are my system details:

SCSI device:
DataDirect Networks S2A9500 Controller (FC-4) w/ 4 TB of FC disks.
LUN 0 - 4 TB w/ 512 byte block size

Computer:
Dell 2850 Server Dual Xeon 3.00 GHz
1G Memory
2 Emulex Dual LP11000 HBAs (Driver 8.0.13), only using one FC Port.
racerx:/proc/scsi # lsscsi -vvg
sysfsroot: /sys
[0:0:0:0]    disk    SEAGATE  ST336754LC       D402  /dev/sda  /dev/sg0
dir: /sys/bus/scsi/devices/0:0:0:0 [/sys/devices/pci0000:00/0000:00:02.0/0000:01:00.0/0000:02:05.0/host0/target0:0:0/0:0:0:0]
[0:0:6:0]    process PE/PV    1x6 SCSI BP      1.0   -         /dev/sg1
dir: /sys/bus/scsi/devices/0:0:6:0 [/sys/devices/pci0000:00/0000:00:02.0/0000:01:00.0/0000:02:05.0/host0/target0:0:6/0:0:6:0]
[9:0:0:0]    disk    DDN      S2A 9500         3.00  /dev/sdb  /dev/sg2
dir: /sys/bus/scsi/devices/9:0:0:0 [/sys/devices/pci0000:00/0000:00:06.0/0000:08:00.2/0000:0a:03.1/host9/target9:0:0/9:0:0:0]

OS:
Suse 9.3 x86-64 w/ updates
racerx:~ # uname -a
Linux racerx 2.6.11.4-21.10-smp #1 SMP Tue Nov 29 14:32:49 UTC 2005 x86_64 x86_64 x86_64 GNU/Linux

Emulex driver version 8.0.13 (not the latest, but good performance can be achieved)
No file system is used to test the SCSI device.

Looking at /dev/sdb parameters:

racerx:/sys/block/sdb/queue # ls
.   iosched            max_sectors_kb  read_ahead_kb
..  max_hw_sectors_kb  nr_requests     scheduler
racerx:/sys/block/sdb/queue # cat scheduler
noop [anticipatory] deadline cfq
racerx:/sys/block/sdb/queue # cat max_sectors_kb
512
racerx:/sys/block/sdb/queue # cat read_ahead_kb
128
racerx:/sys/block/sdb/queue # cat max_hw_sectors_kb
512
racerx:/sys/block/sdb/queue # cat nr_requests
128
racerx:/sys/block/sdb/queue # cd ..
racerx:/sys/block/sdb # ls
.  ..  dev  device  queue  range  removable  sdb1  size  stat
racerx:/sys/block/sdb # cat size
9175695360
racerx:/sys/block/sdb # cat range
16
racerx:/sys/block/sdb # cat stat
41800102 1120178689 707930979 242236415 14078051 1138277748 625707891 310226730 0 42758380 552574197

I figure these might be the tuning parameters I'm looking for, but there may be others as well. I have an idea of what may work, but I'd like to hear from the experts. What values should I use to increase large sequential read performance? How do I make these settings persistent?
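
For anyone who wants to compare settings before and after a change, here is a minimal sketch that prints the queue tunables listed above and writes one back. The function names dump_queue_params and set_queue_param are made up for this example, and no particular value is being recommended.

```shell
#!/bin/sh
# Print the tunables found under a block device's queue directory.
# $1 is the queue directory, e.g. /sys/block/sdb/queue from the listing above.
dump_queue_params() {
    qdir=$1
    for name in scheduler max_sectors_kb max_hw_sectors_kb read_ahead_kb nr_requests; do
        if [ -r "$qdir/$name" ]; then
            printf '%-18s %s\n' "$name" "$(cat "$qdir/$name")"
        fi
    done
}

# Write one value and read it back to confirm what is now stored.
# The value is the caller's choice; this sketch does not suggest one.
set_queue_param() {
    qdir=$1; name=$2; value=$3
    echo "$value" > "$qdir/$name" && cat "$qdir/$name"
}
```

Usage would be something like "dump_queue_params /sys/block/sdb/queue" as root. Values written under /sys are runtime state and are lost at reboot, so one option for persistence (an assumption, not something tested on this system) is to call set_queue_param from a boot-time script.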

Tests and results:

Read performance using the block device (/dev/sdb):

Using the command:
sgp_dd if=/dev/sdb of=/dev/null bs=512 bpt=4096 time=1 thr=6 count=100000000 dio=1

BTW: The dio=1 flag really does not affect the results to the block device.

I'm asking for 2M transfers.
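
The 2M figure follows from the sgp_dd arguments: bs times bpt. A small sketch of that arithmetic, which also compares it against the max_sectors_kb value of 512 shown earlier (treating that value as the per-request cap in the block layer):

```shell
#!/bin/sh
bs=512               # bytes per block (bs= on the sgp_dd command line)
bpt=4096             # blocks per transfer (bpt= on the sgp_dd command line)
max_sectors_kb=512   # value read from /sys/block/sdb/queue/max_sectors_kb

xfer_bytes=$((bs * bpt))
xfer_kib=$((xfer_bytes / 1024))
# Minimum number of block-layer requests needed to carry one transfer.
requests_needed=$((xfer_kib / max_sectors_kb))

echo "bytes per transfer: $xfer_bytes ($xfer_kib KiB)"
echo "requests of at most $max_sectors_kb KiB needed: $requests_needed"
```

This prints 2097152 bytes (2048 KiB) per transfer and 4 requests, so with these settings a single 2M command would not be expected through the block device.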

S2A 9500[1]: stats length

                        Command Length Statistics

Length     Port 1          Port 2          Port 3          Port 4
Kbytes   Reads  Writes   Reads  Writes   Reads  Writes   Reads  Writes
>    0       0       0       0       0       0       0       0       0
>   16       0       0       0       0       0       0       0       0
>   32       0       0       0       0       0       0       0       0
>   48       0       0       0       0       0       0       0       0
>   64       0       0       0       0       0       0       0       0
>   80       0       0       0       0       0       0       0       0
>   96       0       0       0       0       0       0       0       0
>  112       0       0       0       0       0       0       0       0
>  128       0       0       0       0       0       0       0       0
>  144       0       0       0       0       0       0       0       0
>  160       0       0       0       0       0       0       0       0
>  176       0       0       0       0       0       0       0       0
>  192       0       0       0       0       0       0       0       0
>  208       0       0       0       0       0       0       0       0
>  224       0       0       0       0       0       0       0       0
>  240       0       0       0       0       0       0       0       0
>  256    17F0       0       0       0       0       0       0       0
S2A 9500[1]: stats

                         System Performance Statistics
               All Ports     Port 1     Port 2     Port 3     Port 4
 Read  MB/s:      145.9      145.9        0.0        0.0        0.0
 Write MB/s:        0.0        0.0        0.0        0.0        0.0
 Total MB/s:      145.9      145.9        0.0        0.0        0.0

 Read  IO/s:        583        583          0          0          0
 Write IO/s:          0          0          0          0          0
 Total IO/s:        583        583          0          0          0

 Read Hits:       100.0%     100.0%       0.0%       0.0%       0.0%
 Prefetch Hits:   100.0%     100.0%       0.0%       0.0%       0.0%
 Prefetches:       20.0%      20.0%       0.0%       0.0%       0.0%
 Writebacks:        0.0%       0.0%       0.0%       0.0%       0.0%
 Rebuild MB/s:      0.0        0.0                   0.0
 Verify MB/s:       0.0        0.0                   0.0

                   Total      Reads     Writes
 Disk IO/s:          145        145          0
 Disk MB/s:        163.9      163.9        0.0
 Disk Pieces:       1869       1869          0
 BDB Pieces:                      0

  Cache Writeback Data:     0.0%
  Rebuild/Verify Data:      0.0%    0.0%
  Cache Data locked:        0.0%
S2A 9500[1]:

Snapshots of outstanding host IO taken on the S2A9500 show at most one small (256K) command outstanding at any point in time. There's a lot of idle time here.
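
The 256K command size can be cross-checked from the throughput and IO rate in the stats above. A rough sketch, assuming the controller's "MB" means 2^20 bytes (the helper name avg_cmd_kb is made up here):

```shell
#!/bin/sh
# Average command size in KB = (MB/s * 1024) / (IO/s)
avg_cmd_kb() {
    awk -v mbs="$1" -v ios="$2" 'BEGIN { printf "%.0f\n", mbs * 1024 / ios }'
}

# Read MB/s and Read IO/s from the block-device read test
avg_cmd_kb 145.9 583
```

This prints 256, which agrees with the command length bucket reported by the controller.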

Write performance using the block device:

Using the command:
sgp_dd if=/dev/zero of=/dev/sdb bs=512 bpt=4096 time=1 thr=6 count=100000000 dio=1

S2A 9500[1]: stats length

                        Command Length Statistics

Length     Port 1          Port 2          Port 3          Port 4
Kbytes   Reads  Writes   Reads  Writes   Reads  Writes   Reads  Writes
>    0       0       8       0       0       0       0       0       0
>   16       0       0       0       0       0       0       0       0
>   32       0       0       0       0       0       0       0       0
>   48       0       0       0       0       0       0       0       0
>   64       0       0       0       0       0       0       0       0
>   80       0       0       0       0       0       0       0       0
>   96       0       0       0       0       0       0       0       0
>  112       0       0       0       0       0       0       0       0
>  128       0       0       0       0       0       0       0       0
>  144       0       0       0       0       0       0       0       0
>  160       0       0       0       0       0       0       0       0
>  176       0       0       0       0       0       0       0       0
>  192       0       0       0       0       0       0       0       0
>  208       0       0       0       0       0       0       0       0
>  224       0       0       0       0       0       0       0       0
>  240       0       0       0       0       0       0       0       0
>  384       0       A       0       0       0       0       0       0
>  400       0       5       0       0       0       0       0       0
>  416       0       B       0       0       0       0       0       0
>  432       0       B       0       0       0       0       0       0
>  448       0      11       0       0       0       0       0       0
>  464       0      11       0       0       0       0       0       0
>  480       0      10       0       0       0       0       0       0
>  496       0      14       0       0       0       0       0       0
>  512       0    56EA       0       0       0       0       0       0
S2A 9500[1]: stats

                         System Performance Statistics
               All Ports     Port 1     Port 2     Port 3     Port 4
 Read  MB/s:        0.0        0.0        0.0        0.0        0.0
 Write MB/s:      385.9      385.9        0.0        0.0        0.0
 Total MB/s:      385.9      385.9        0.0        0.0        0.0

 Read  IO/s:          0          0          0          0          0
 Write IO/s:        772        772          0          0          0
 Total IO/s:        772        772          0          0          0

 Read Hits:         0.0%       0.0%       0.0%       0.0%       0.0%
 Prefetch Hits:     0.0%       0.0%       0.0%       0.0%       0.0%
 Prefetches:        0.0%       0.0%       0.0%       0.0%       0.0%
 Writebacks:      100.0%     100.0%       0.0%       0.0%       0.0%
 Rebuild MB/s:      0.0        0.0                   0.0
 Verify MB/s:       0.0        0.0                   0.0

                   Total      Reads     Writes
 Disk IO/s:           30          0         30
 Disk MB/s:        432.1        0.0      432.1
 Disk Pieces:      12414          0      12414
 BDB Pieces:                      0

  Cache Writeback Data:     7.4%
  Rebuild/Verify Data:      0.0%    0.0%
  Cache Data locked:        0.0%

Still did not get 2M IO, but the command sizes are larger (mostly 512K) and
there are usually 16 commands outstanding on the S2A9500 at any one time.

Read performance using the character device (/dev/sg2):

Using the command:
racerx:~ # sgp_dd if=/dev/sg2 of=/dev/null bs=512 bpt=4096 time=1 thr=6 count=100000000 dio=1
time to transfer data was 125.323676 secs, 408.54 MB/sec
100000000+0 records in
100000000+0 records out
>> Direct IO requested but incomplete 24415 times
>>> /proc/scsi/sg/allow_dio set to '0' but should be set to '1' for direct IO

Interesting message. Was I actually getting direct IO? Should I set
/proc/scsi/sg/allow_dio to 1? How do I make that persistent?
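
One way to read the "incomplete 24415 times" message, offered as a rough check rather than a diagnosis: count how many transfers the command line implies and compare.

```shell
#!/bin/sh
count=100000000   # blocks requested (count= on the sgp_dd command line)
bpt=4096          # blocks per transfer (bpt= on the sgp_dd command line)

# Round up, since the last transfer is a partial one (256 blocks).
transfers=$(( (count + bpt - 1) / bpt ))
echo "transfers issued: $transfers"
```

This prints 24415, the same number sgp_dd reports as incomplete, which would suggest that none of the transfers in this run actually used direct IO. The current setting can be read with "cat /proc/scsi/sg/allow_dio", the path named in the message.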

S2A 9500[1]: stats length

                        Command Length Statistics

Length     Port 1          Port 2          Port 3          Port 4
Kbytes   Reads  Writes   Reads  Writes   Reads  Writes   Reads  Writes
>    0       0       0       0       0       0       0       0       0
>   16       0       0       0       0       0       0       0       0
>   32       0       0       0       0       0       0       0       0
>   48       0       0       0       0       0       0       0       0
>   64       0       0       0       0       0       0       0       0
>   80       0       0       0       0       0       0       0       0
>   96       0       0       0       0       0       0       0       0
>  112       0       0       0       0       0       0       0       0
>  128       0       0       0       0       0       0       0       0
>  144       0       0       0       0       0       0       0       0
>  160       0       0       0       0       0       0       0       0
>  176       0       0       0       0       0       0       0       0
>  192       0       0       0       0       0       0       0       0
>  208       0       0       0       0       0       0       0       0
>  224       0       0       0       0       0       0       0       0
>  240       0       0       0       0       0       0       0       0
> 2048     B34       0       0       0       0       0       0       0
S2A 9500[1]: stats

                         System Performance Statistics
               All Ports     Port 1     Port 2     Port 3     Port 4
 Read  MB/s:      389.9      389.9        0.0        0.0        0.0
 Write MB/s:        0.0        0.0        0.0        0.0        0.0
 Total MB/s:      389.9      389.9        0.0        0.0        0.0

 Read  IO/s:        194        194          0          0          0
 Write IO/s:          0          0          0          0          0
 Total IO/s:        194        194          0          0          0

 Read Hits:       100.0%     100.0%       0.0%       0.0%       0.0%
 Prefetch Hits:   100.0%     100.0%       0.0%       0.0%       0.0%
 Prefetches:       50.0%      50.0%       0.0%       0.0%       0.0%
 Writebacks:        0.0%       0.0%       0.0%       0.0%       0.0%
 Rebuild MB/s:      0.0        0.0                   0.0
 Verify MB/s:       0.0        0.0                   0.0

                   Total      Reads     Writes
 Disk IO/s:          194        194          0
 Disk MB/s:        438.5      438.5        0.0
 Disk Pieces:       6306       6306          0
 BDB Pieces:                      0

  Cache Writeback Data:     0.0%
  Rebuild/Verify Data:      0.0%    0.0%
  Cache Data locked:        0.0%

We got 2M reads, and the S2A9500 shows between 5 and 6 2M commands outstanding at any time.
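
The 408.54 MB/sec from sgp_dd and the 389.9 MB/s from the controller look different, but they appear to be the same rate in different units. A sketch of the conversion, assuming sgp_dd counts 10^6 bytes per MB and the controller counts 2^20 (the helper name to_mib_per_s is made up here):

```shell
#!/bin/sh
# Recompute the sgp_dd rate from count, bs and elapsed time, in 10^6-byte MB/s.
awk 'BEGIN { printf "%.2f\n", 100000000 * 512 / 125.323676 / 1000000 }'

# Convert a 10^6-byte MB/s figure to 2^20-byte MB/s.
to_mib_per_s() {
    awk -v mb="$1" 'BEGIN { printf "%.1f\n", mb * 1000000 / 1048576 }'
}
to_mib_per_s 408.54
```

The first line prints 408.54, matching sgp_dd, and the second prints 389.6, which is close to the controller's instantaneous reading.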

Write performance using the character device (/dev/sg2):

Using the command:
racerx:~ # sgp_dd if=/dev/zero of=/dev/sg2 bs=512 bpt=4096 time=1 thr=6 count=100000000 dio=1
time to transfer data was 125.809450 secs, 406.96 MB/sec
100000000+0 records in
100000000+0 records out
>> Direct IO requested but incomplete 24415 times
>>> /proc/scsi/sg/allow_dio set to '0' but should be set to '1' for direct IO

S2A 9500[1]: stats length

                        Command Length Statistics

Length     Port 1          Port 2          Port 3          Port 4
Kbytes   Reads  Writes   Reads  Writes   Reads  Writes   Reads  Writes
>    0       0       0       0       0       0       0       0       0
>   16       0       0       0       0       0       0       0       0
>   32       0       0       0       0       0       0       0       0
>   48       0       0       0       0       0       0       0       0
>   64       0       0       0       0       0       0       0       0
>   80       0       0       0       0       0       0       0       0
>   96       0       0       0       0       0       0       0       0
>  112       0       0       0       0       0       0       0       0
>  128       0       0       0       0       0       0       0       0
>  144       0       0       0       0       0       0       0       0
>  160       0       0       0       0       0       0       0       0
>  176       0       0       0       0       0       0       0       0
>  192       0       0       0       0       0       0       0       0
>  208       0       0       0       0       0       0       0       0
>  224       0       0       0       0       0       0       0       0
>  240       0       0       0       0       0       0       0       0
> 2048       0     877       0       0       0       0       0       0
S2A 9500[1]: stats

                         System Performance Statistics
               All Ports     Port 1     Port 2     Port 3     Port 4
 Read  MB/s:        0.0        0.0        0.0        0.0        0.0
 Write MB/s:      387.8      387.8        0.0        0.0        0.0
 Total MB/s:      387.8      387.8        0.0        0.0        0.0

 Read  IO/s:          0          0          0          0          0
 Write IO/s:        194        194          0          0          0
 Total IO/s:        194        194          0          0          0

 Read Hits:         0.0%       0.0%       0.0%       0.0%       0.0%
 Prefetch Hits:     0.0%       0.0%       0.0%       0.0%       0.0%
 Prefetches:        0.0%       0.0%       0.0%       0.0%       0.0%
 Writebacks:      100.0%     100.0%       0.0%       0.0%       0.0%
 Rebuild MB/s:      0.0        0.0                   0.0
 Verify MB/s:       0.0        0.0                   0.0

                   Total      Reads     Writes
 Disk IO/s:           30          0         30
 Disk MB/s:        437.5        0.0      437.5
 Disk Pieces:       4932          0       4932
 BDB Pieces:                      0

  Cache Writeback Data:     8.1%
  Rebuild/Verify Data:      0.0%    0.0%
  Cache Data locked:        0.0%

Same as the reads: 2M IO and between 5 and 6 commands outstanding on the S2A9500 at any time.
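
To put the four tests side by side, the amount of data in flight at the controller can be estimated as commands outstanding times command size, using the figures observed above. A small sketch (the helper name in_flight_kib is made up here):

```shell
#!/bin/sh
# $1 = commands outstanding, $2 = KiB per command
in_flight_kib() {
    echo $(( $1 * $2 ))
}

echo "block read : $(in_flight_kib 1 256) KiB"
echo "block write: $(in_flight_kib 16 512) KiB"
echo "sg read/write (5 to 6 commands): $(in_flight_kib 5 2048) to $(in_flight_kib 6 2048) KiB"
```

This gives 256 KiB for block reads against 8192 KiB for block writes and 10240 to 12288 KiB for sg, which lines up with the block read case being the slow one.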

Any ideas would be appreciated,
Martin Schlining
mschlining@xxxxxxxxxxxxxxxxx









