Re: Why is scsi_request_fn called every 4 milliseconds?

On Fri, 2011-01-28 at 10:42 +0800, BingJiun Luo wrote:
> On Thu, Jan 27, 2011 at 10:43 PM, James Bottomley
> <James.Bottomley@xxxxxxx> wrote:
> > On Thu, 2011-01-27 at 22:04 +0800, BingJiun Luo wrote:
> >> I want to measure SATA AHCI host controller read performance.  I open
> >> /dev/sda and use the user-space function
> >> read(int fildes, void *buf, size_t nbyte) to read 2048 times,
> >> 64 KBytes each time, 128 MBytes in total.
> >>
> >> I measured the time from one step before the write to the CI register
> >> inside ahci_qc_issue() until ahci_port_intr() is called in interrupt
> >> context.  It takes about 1 millisecond to complete one 256 KByte READ
> >> DMA EXT command, and about 15 microseconds are spent on the call to
> >> scsi_done().
> >>
> >> However, why is scsi_request_fn only called about 4 milliseconds later
> >> to pass the next I/O request to the hardware?  It takes less time if the
> >> READ DMA command covers fewer sectors.
> >
> > I'm not sure I parse the question, but I think you're asking why we
> > chain the next issue from the softirq in SCSI?  That's because most SCSI
> > devices are tagged and the bus is the bottleneck, so after processing
> > the completion, we need to get the next command out ASAP to keep the bus
> > utilised to capacity.
> 
> I observed that each time scsi_request_fn is called, scsi_dispatch_cmd is
> called only once and then returns.  It means that only one I/O request is
> available to be processed by the host controller.

Either you're untagged, or you don't have enough I/O then.
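One way to check the "untagged" possibility from user space is to read the queue depth that the kernel reports for the device. The following is a minimal sketch, assuming a Linux system where the disk appears under /sys/block/<disk>/device/queue_depth; the function name read_queue_depth is made up for illustration and is not part of any kernel or library API. A reported depth of 1 would be consistent with untagged operation, though it does not prove it.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read the queue depth reported in sysfs for a disk
 * such as "sda".  Returns the depth, or -1 if the attribute is missing or
 * cannot be parsed. */
int read_queue_depth(const char *disk)
{
    char path[256];
    int depth = -1;

    assert(disk != NULL);

    int n = snprintf(path, sizeof(path),
                     "/sys/block/%s/device/queue_depth", disk);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;              /* name too long for the buffer */

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;              /* no such disk or no such attribute */

    if (fscanf(f, "%d", &depth) != 1)
        depth = -1;             /* attribute did not contain a number */
    fclose(f);
    return depth;
}
```

Calling read_queue_depth("sda") and printing the result is enough for a quick check; a return of -1 only means the attribute could not be read on that system.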

> About 4 milliseconds later, scsi_request_fn is called again.  Why does it
> take so long, when the previous command already completed in only about
> 1 millisecond, including the call to scsi_done()?  The host controller is
> idle for about 3 milliseconds, with nothing to do.

No idea ... it's either something to do with the setup on the
architecture or it's simply that the I/O load isn't generating multiple
commands.  On an x86 it's microseconds to reissue from block softirq.
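To put the reported idle time into throughput terms, here is a small arithmetic sketch. It uses only the figures quoted in this thread (one 256 KByte command, about 1 ms to complete, about 4 ms between calls to scsi_request_fn) and assumes exactly one command is in flight per cycle; the function name mib_per_second is made up for illustration.

```c
#include <assert.h>

/* Effective throughput in MiB/s when one command of bytes_per_cmd bytes
 * is issued every period_us microseconds. */
double mib_per_second(double bytes_per_cmd, double period_us)
{
    assert(period_us > 0.0);
    return bytes_per_cmd * 1e6 / period_us / (1024.0 * 1024.0);
}
```

With these assumptions, mib_per_second(256 * 1024, 1000) gives 250 MiB/s if commands were issued back to back, and mib_per_second(256 * 1024, 4000) gives 62.5 MiB/s for the observed 4 ms cycle, so the gap would cost roughly three quarters of the achievable rate.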

> >
> >> My questions are:
> >> 1. Is it the time needed by the upper layers (block layer or virtual
> >> file system layer) to prepare one 256 KB READ DMA EXT command?  Or is it
> >> the time to copy data from kernel-space memory to user-space memory
> >> after the data is read back from the hard drive, which delays passing
> >> the next command to SCSI?
> >
> > Everything in SCSI is done with zero copy (as in we DMA straight to the
> > pagecache page, which is then attached to userspace).
> >
> Yes, I know it is zero copy at SCSI, but I am not sure about the upper
> layers (VFS or anything else).
>
> It is unlikely to be zero copy between kernel-space and user-space memory
> buffers, right?  Because whether the data is read back from disk or is
> already available in the page cache, it is located in kernel-space memory,

It depends.  Glibc can play clever tricks where it services read() via
mmapped buffers.  That's zero copy.

> and this data has to be copied to a user-space address.  None of this work
> is done in the SCSI layer but somewhere higher than SCSI; I just don't
> know where.

No ... the page can simply be placed into an empty userspace mapping ...
that's what we mostly try to do.
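For comparison, the explicit way to get page cache pages mapped into a process without a copy into a separate user buffer is mmap(). The sketch below does not show what glibc or the kernel does for read(); it only illustrates the mapping route on a regular file. The function name sum_bytes_mmap is made up for illustration, and a block device such as /dev/sda would need its size obtained differently, because fstat() reports a size of 0 for it.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: sum the bytes of a regular file through a read-only
 * shared mapping instead of read().  Returns the sum, or -1 on error
 * (an empty file is also treated as an error). */
long sum_bytes_mmap(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }

    size_t len = (size_t)st.st_size;
    unsigned char *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return -1;

    long sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += p[i];            /* touches the mapped pages directly */

    munmap(p, len);
    return sum;
}
```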

> >> I know some architectures do not perform well enough at memcpy or
> >> something like that.
> >>
> >> 2. If I do not mount /dev/sda as any file system, what is the first
> >> kernel function called after the read() function from user space?  Is it
> >> located in the VFS, or does it go directly to the block layer?
> >
> > I think you need to trace this for yourself ... it's complex because
> > read doesn't go to the device, it goes via the page cache, which is also
> > how the VFS operates.  If the pages are all current in the cache, a
> > read() doesn't have to trouble the disk.
> >
> I am pretty sure almost all READ DMA commands go to the disk, because
> I captured them with a Catalyst analyzer.  So, if all requests must go to
> the disk, does it mean the data is not available in the page cache?

Yes ... page cache is checked first before fetching from storage.
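A rough user-space check of whether data is already in the page cache is to map the file and ask mincore() which pages are resident. This is a sketch under assumptions: it targets a regular file on Linux, the function name resident_pages is made up for illustration, and mincore() may be restricted or unavailable in some environments, in which case the function returns -1.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: count how many pages of a regular file are resident
 * in memory.  On success past the mmap() step, *total_pages receives the
 * number of pages the file spans.  Returns the resident count, or -1 on
 * error (an empty file is also treated as an error). */
long resident_pages(const char *path, long *total_pages)
{
    assert(path != NULL && total_pages != NULL);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }

    size_t len = (size_t)st.st_size;
    size_t npages = (len + (size_t)page - 1) / (size_t)page;

    void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    if (map == MAP_FAILED)
        return -1;
    *total_pages = (long)npages;

    unsigned char *vec = malloc(npages);
    long resident = -1;
    if (vec != NULL && mincore(map, len, vec) == 0) {
        resident = 0;
        for (size_t i = 0; i < npages; i++)
            resident += vec[i] & 1;   /* low bit set: page is resident */
    }

    free(vec);
    munmap(map, len);
    return resident;
}
```

If the resident count is well below the total before a read test and close to it afterwards, the reads had to go to storage, which would match what a bus analyzer shows.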

James


> >> Because I want to keep track of the time spent in the layers higher
> >> than SCSI.
> >>
> >> 3. When scsi_done() is called, which function processes the completed
> >> command and passes the data to user space?  I think there might be
> >> somewhere inside the code that copies this data from a kernel-space
> >> memory address to a user-space memory address.
> >
> > scsi_done doesn't do anything about completion, it triggers the block
> > softirq to schedule a completion for us when all interrupts are
> > processed.
> >
> > James
> >
> >
> >

