I mentioned this idea a few weeks ago on this list: namely to allow
a sg pass-through request to use the mmap-ed reserve buffer associated
with another sg file descriptor.

In my experience mmap-ed IO using sg's reserve buffer mapped into
user space is faster than direct IO schemes. However one shortcoming
is that if you try to copy between two devices using this technique
then you end up with two separate mmap-ed buffers in the user space
program. The user space program then needs to copy between the two
buffers, which would defeat much of the advantage of the mmap-ed IO.
You could (and sgm_dd in sg3_utils does) use mmap-ed IO on the read
side and direct IO on the write side (or vice versa).

I used the sg driver as found in lk 2.6.21-rc4 as a baseline (and
I don't think sg has changed since 2.6.19). A gzipped diff is attached.
There is also some test code (a modified sgm_dd) in the sg3_utils-1.24
beta on the www.torque.net/sg site. Here is an example of a disk to
disk copy:

    sgm_dd if=/dev/sg0 of=/dev/sg1 oflag=smmap bs=512

The new flag is 'oflag=smmap' which instructs the write SG_IO on
/dev/sg1 to set SG_FLAG_SHARED_MMAP_IO and to pass the mmap-ed buffer
used for /dev/sg0 in dxferp. [Add a 'verbose=1' option and it will
indicate how many times shared mmap IO was requested and how many
times it was actually done.]

Features:
  - allow both sides of a copy-like operation to dma into and out of
    the same user space buffer
  - minimal per command overhead (i.e. building of scatter gather
    lists and pinning pages)
  - could copy a single source to multiple destinations efficiently
  - if the shared reserve buffer is unavailable (or not big enough)
    then fall back to indirect IO transparently
  - new info bit SG_INFO_SHARED_MMAP_IO indicates whether shared
    mmap-ed IO was done

Restrictions (enforced by the sg driver):
  - confined to file descriptors in the same process
  - there can be only one user of a reserve buffer at a time
  - low_dma is honoured

Complexity - it does have a few more corner cases than usual. For
example in the above sgm_dd invocation: closing /dev/sg0 while
/dev/sg1 is sharing its mmap-ed reserve buffer ...

Here are some timings copying between two ramdisks. It is assumed
the 'bs=8k' given to dd is equivalent to 'bs=512 bpt=16' given to
sgm_dd.

# lsscsi -g
[4:0:0:0]  disk  Linux  scsi_debug  1.82  /dev/sda  /dev/sg0
[5:0:0:0]  disk  Linux  scsi_ses    1.06  /dev/sdb  /dev/sg1

# ./dd_tsts.sh
Usage: dd_tsts.sh <ifile> <ofile> <times> <bs>

# ./dd_tsts.sh /dev/sda /dev/sdb 50 8k

Indirect IO with dd
dd if=/dev/sda of=/dev/sdb bs=8k
real    0m7.448s
user    0m0.080s
sys     0m7.046s

Direct IO with dd
dd if=/dev/sda iflag=direct of=/dev/sdb oflag=direct bs=8k
real    0m4.529s
user    0m0.114s
sys     0m3.799s

# ./sg_dd_tsts.sh /dev/sg0 /dev/sg1 50 16

Indirect IO with sg_dd
sg_dd if=/dev/sg0 of=/dev/sg1 bs=512 bpt=16
real    0m6.304s
user    0m0.171s
sys     0m5.268s

Direct IO with sg_dd
sg_dd if=/dev/sg0 iflag=dio of=/dev/sg1 oflag=dio bs=512 bpt=16
real    0m4.246s
user    0m0.135s
sys     0m3.395s

Mmap read, indirect IO write with sgm_dd
sgm_dd if=/dev/sg0 of=/dev/sg1 bs=512 bpt=16
real    0m4.023s
user    0m0.127s
sys     0m3.259s

Mmap read, direct IO write with sgm_dd
sgm_dd if=/dev/sg0 of=/dev/sg1 oflag=dio bs=512 bpt=16
real    0m4.057s
user    0m0.164s
sys     0m3.264s

Mmap read, shared mmap write with sgm_dd
sgm_dd if=/dev/sg0 of=/dev/sg1 oflag=smmap bs=512 bpt=16
real    0m3.871s
user    0m0.131s
sys     0m3.111s

Don't expect drastic improvements in real IO unless it is in the
gigabyte per second range.

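
As a rough aid for reading the timings above, here is a small Python sketch (not part of the patch or of sg3_utils) that turns the 'real' times into a speed-up factor relative to indirect IO with dd. The function and variable names are illustrative only; the numbers are the ones quoted in the runs, and comparing dd with sg_dd/sgm_dd relies on the stated assumption that 'bs=8k' is equivalent to 'bs=512 bpt=16'.

```python
def parse_time(text):
    """Convert a shell `time` value such as '0m7.448s' to seconds."""
    minutes, seconds = text.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# 'real' times copied from the runs quoted in this post.
REAL_TIMES = {
    "dd, indirect IO": "0m7.448s",
    "dd, direct IO": "0m4.529s",
    "sg_dd, indirect IO": "0m6.304s",
    "sg_dd, direct IO": "0m4.246s",
    "sgm_dd, mmap read + indirect write": "0m4.023s",
    "sgm_dd, mmap read + direct write": "0m4.057s",
    "sgm_dd, mmap read + shared mmap write": "0m3.871s",
}


def speedups(times, baseline="dd, indirect IO"):
    """Return baseline_time / run_time for every run (higher is faster)."""
    base = parse_time(times[baseline])
    return {name: base / parse_time(value) for name, value in times.items()}


if __name__ == "__main__":
    # Print the runs from slowest to fastest.
    ranked = sorted(speedups(REAL_TIMES).items(), key=lambda kv: kv[1])
    for name, factor in ranked:
        print(f"{name:40s} {factor:.2f}x")
```

On these figures the shared mmap write is a little over 1.9 times faster than indirect dd, but only around 4% faster than mmap read with an indirect write, which fits the remark that drastic gains should not be expected at these transfer rates.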
Doug Gilbert
Attachment:
sg2621rc4smm2.diff.gz
Description: GNU Zip compressed data