On Mon, 2008-03-03 at 08:59 -0600, James Bottomley wrote:
> On Mon, 2008-03-03 at 16:17 +0800, Ke Wei wrote:
> > On Mon, Mar 3, 2008 at 8:42 AM, James Bottomley
> > <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > On Fri, 2008-02-29 at 12:01 -0600, James Bottomley wrote:
> > > > I noticed that the current marvell sas driver wasn't performing very
> > > > well. It turns out that it's setting can_queue not in the SCSI host,
> > > > but in its own internal data structure, meaning it's always operating
> > > > with a global queue depth of one. This patch raises it to what the code
> > > > seemed to be intending ... although I think can_queue should be
> > > > MVS_CHIP_SLOT_SZ - 1 (without the divide by two)?
> > > >
> > > > The good news is that with this change, I'm getting a respectable
> > > > throughput on the fio hammer test; plus zapping random phy resets across
> > > > the disk triggers error handler recovery correctly (so far).
> > > >
> > > > I'm having less happy results with a SATAPI DVD ... it looks like the
> > > > initial IDENTIFY goes across just fine, but that we stall on the other
> > > > SCSI commands ... I'm still investigating this one.
> > >
> > > Actually, I've run into another problem with this patch applied. It
> > > looks like NCQ fails with ATA disks. What I see is that I/O goes fine
> > > until I get more than one command outstanding to the device, then the
> > > device stops responding. I can keep the I/O flowing if I clamp the
> > > device queue depth at 1. SAS disks seem to be fine ... I can get
> > > multiple outstanding commands to them correctly serviced.
> >
> > Yes, I have to say that testing failed when I plugged SATA and SAS
> > disk. Sometimes "insmod mvsas" will cause the system to hang.
> > Only look good if can_queue is set to 1. I will investigate this case.
>
> Thanks. For the NCQ case, it does look like turning NCQ off makes the
> disk work fine, so I'd suspect some issue with NCQ handling.
>
> > > I'm having less happy results with a SATAPI DVD ... it looks like the
> > > initial IDENTIFY goes across just fine, but that we stall on the other
> > > SCSI commands ... I'm still investigating this one.
> >
> > I think we need set BLIST_NOREPORTLUN or some other flags (see
> > scsi_devinfo.h) about new some ATAPI device.When calling
> > scsi_report_lun_scan , it will bypass REPORT_LUNS command.
>
> It doesn't seem to be anything the DVD does ... it works fine with the
> aic94xx controller doing SATAPI (it sends the correct reply to REPORT
> LUNS). It looks like the first hang comes at around the second or third
> Test Unit Ready.
>
> Traces seem to show IDENTIFY_PACKET, INQUIRY, INQUIRY, TUR, TUR (hang)
> and then every following command hangs, but I'll try to instrument more
> accurate tracing.

OK, I instrumented more ... you're right, the first failing command is
REPORT_LUNS. The failure isn't because the DVD doesn't accept the
command, but because the command gets errored and we fail to report the
error data back.

What I see is the mvsas driver returning RXQ_ERR, so the device is
trying to terminate the transaction with an error code. Unfortunately,
when it sees this code, mvsas does nothing at all, leaving the request
to time out and be aborted (even though it already finished).

I can plumb it in ... it looks like what we should also be doing is
calling mvs_slot_complete(), but this still isn't quite correct ... it
just sets SAM_STAT_CHECK_COND ... it needs to collect the ATA error code
somehow.

James
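
To make the RXQ_ERR handling described above concrete, here is a minimal
user-space sketch. It is not mvsas code: every name in it (model_slot,
model_result, model_slot_complete, MODEL_RXQ_ERR and its bit position)
is hypothetical and chosen only for illustration. The one value taken
from the standard is 0x02 for the SCSI CHECK CONDITION status. It shows
the two things the completion path would need to do for an errored
receive entry: complete the request straight away instead of leaving it
to time out, and hand the ATA status and error registers back along
with the CHECK CONDITION status.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model only; none of these names exist in the mvsas driver. */
#define MODEL_RXQ_ERR     (1u << 0) /* assumed bit position, illustration only */
#define MODEL_STAT_GOOD   0x00      /* SCSI status GOOD */
#define MODEL_STAT_CHECK  0x02      /* SCSI status CHECK CONDITION */

/* One outstanding command slot, as seen by the model. */
struct model_slot {
	bool    busy;       /* command still outstanding */
	uint8_t ata_status; /* ATA status register left by the device */
	uint8_t ata_error;  /* ATA error register left by the device */
};

/* What gets reported back to the upper layer. */
struct model_result {
	bool    completed;
	uint8_t scsi_status;
	uint8_t ata_status;
	uint8_t ata_error;
};

/*
 * Complete a slot for one receive-queue entry. An errored entry is
 * completed immediately (so the request never has to time out) and the
 * ATA registers are copied into the result next to CHECK CONDITION.
 */
void model_slot_complete(struct model_slot *slot, uint32_t rx_desc,
			 struct model_result *res)
{
	assert(slot != NULL && res != NULL);

	if (rx_desc & MODEL_RXQ_ERR) {
		res->scsi_status = MODEL_STAT_CHECK;
		res->ata_status  = slot->ata_status;
		res->ata_error   = slot->ata_error;
	} else {
		res->scsi_status = MODEL_STAT_GOOD;
		res->ata_status  = 0;
		res->ata_error   = 0;
	}

	res->completed = true;
	slot->busy = false; /* slot can be reused */
}
```

In this model, a slot holding ATA status 0x51 (DRDY, DSC and ERR set)
and ATA error 0x04 (ABRT), completed with MODEL_RXQ_ERR set, comes back
as CHECK CONDITION with both register values preserved. How the real
driver would obtain those registers from the hardware is exactly the
open question, and this sketch does not answer it.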