Jens Axboe wrote:
> On Sun, Apr 29 2007, James Bottomley wrote:
>> On Sun, 2007-04-29 at 18:48 +0300, Boaz Harrosh wrote:
>>> FUJITA Tomonori wrote:
>>>> From: Boaz Harrosh <bharrosh@xxxxxxxxxxx>
>>>> Subject: [PATCH 4/4] bidi support: bidirectional request
>>>> Date: Sun, 15 Apr 2007 20:33:28 +0300
>>>>
>>>>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
>>>>> index 645d24b..16a02ee 100644
>>>>> --- a/include/linux/blkdev.h
>>>>> +++ b/include/linux/blkdev.h
>>>>> @@ -322,6 +322,7 @@ struct request {
>>>>>  	void *end_io_data;
>>>>>
>>>>>  	struct request_io_part uni;
>>>>> +	struct request_io_part bidi_read;
>>>>>  };
>>>> Would be more straightforward to have:
>>>>
>>>> struct request_io_part in;
>>>> struct request_io_part out;
>>>>
>>> Yes I wish I could do that. For bidi supporting drivers this is the most logical.
>>> But for the 99.9% of uni-directional drivers, calling rq_uni(), and being some what on
>>> the hotish paths, this means we will need a pointer to a uni request_io_part.
>>> This is bad because:
>>> 1st- There is no defined stage in a request life where to definitely set that pointer,
>>> specially in the preparation stages.
>>> 2nd- hacks like scsi_error.c/scsi_send_eh_cmnd() will not work at all. Now this is a
>>> very bad spot already, and I have a short term fix for it in the SCSI-bidi patches
>>> (not sent yet) but a more long term solution is needed. Once such hacks are
>>> cleaned up we can do what you say. This is exactly why I use the access functions
>>> rq_uni/rq_io/rq_in/rq_out and not open code access.
>> I'm still not really convinced about this approach. The primary job of
>> the block layer is to manage and merge READ and WRITE requests. It
>> serves a beautiful secondary function of queueing for arbitrary requests
>> it doesn't understand (REQ_TYPE_BLOCK_PC or REQ_TYPE_SPECIAL ... or
>> indeed any non REQ_TYPE_FS).
>>
>> bidirectional requests fall into the latter category (there's nothing
>> really we can do to merge them ... they're just transported by the block
>> layer). The only unusual feature is that they carry two bios. I think
>> the drivers that actually support bidirectional will be a rarity, so it
>> might even be advisable to add it to the queue capability (refuse
>> bidirectional requests at the top rather than perturbing all the drivers
>> to process them).
>>
>> So, what about REQ_TYPE_BIDIRECTIONAL rather than REQ_BIDI? That will
>> remove it from the standard path and put it on the special command type
>> path where we can process it specially. Additionally, if you take this
>> approach, you can probably simply chain the second bio through
>> req->special as an additional request in the stream. The only thing
>> that would then need modification would be the dequeue of the block
>> driver (it would have to dequeue both requests and prepare them) and
>> that needs to be done only for drivers handling bidirectional requests.
>
> I agree, I'm really not crazy about shuffling the entire request setup
> around just for something as exotic as bidirection commands. How about
> just keeping it simple - have a second request linked off the first one
> for the second data phase? So keep it completely seperate, not just
> overload ->special for 2nd bio list.
>
> So basically just add a struct request pointer, so you can do rq =
> rq->next_rq or something for the next data phase. I bet this would be a
> LOT less invasive as well, and we can get by with a few helpers to
> support it.
>
> And it should definitely be a request type.
>

I'm a bit confused, since what you both suggest is very similar to what we
proposed back in October 2006, and the impression we got then was that it
would be better to support bidirectional block requests natively (though, to
be honest, James, you wanted a linked request all along).

Before we go down that route again, how do you see bidi support being done
at the SCSI mid-layer? Again, we prefer to support it officially, using two
struct scsi_cmnd_buff instances in struct scsi_cmnd, and not as a one-off
feature that relies on special-purpose state and logic (e.g. a linked
struct scsi_cmnd for the bidi_read sg list).

I'm attaching the patch we sent back then for your reference. (For some
reason I couldn't find the original post in any of the linux-scsi archives.)

Regards,
Benny

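To make the linked-request alternative discussed above a bit more concrete, here is a rough user-space sketch. It is not kernel code: struct mock_request, mock_rq_is_bidi() and mock_rq_total_len() are hypothetical names made up for illustration, and only the next_rq field name is taken from Jens's suggestion. It assumes a bidirectional command is simply a request with a second request chained to it for the second data phase.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the block layer's struct request. */
struct mock_request {
	unsigned int data_len;        /* bytes carried by this data phase */
	struct mock_request *next_rq; /* second data phase, NULL if uni-directional */
};

/* In this sketch a request is bidirectional iff a second phase is linked. */
static int mock_rq_is_bidi(const struct mock_request *rq)
{
	assert(rq != NULL);
	return rq->next_rq != NULL;
}

/* Sum the payload over every linked data phase. */
static unsigned int mock_rq_total_len(const struct mock_request *rq)
{
	unsigned int total = 0;

	for (; rq; rq = rq->next_rq)
		total += rq->data_len;
	return total;
}
```

A driver that does not handle bidirectional commands would never look at next_rq in such a scheme, which is the "less invasive" property being argued for; the cost is that the mid-layer has to carry the second mapping somewhere, which is the question raised above.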
Support for scsi variable length CDBs and bidirectional commands

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>

==== //depot/pub/linux/block/scsi_ioctl.c#1 - /home/bharrosh/p4.local/pub/linux/block/scsi_ioctl.c ====
diff -Nup /tmp/tmp.12171.0 /home/bharrosh/p4.local/pub/linux/block/scsi_ioctl.c -L a/block/scsi_ioctl.c -L b/block/scsi_ioctl.c
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -32,10 +32,13 @@
 #include <scsi/scsi_ioctl.h>
 #include <scsi/scsi_cmnd.h>
 
-/* Command group 3 is reserved and should never be used. */
+/*
+ * Command group 3 is used by variable length CDBs with
+ * opcode VARLEN_CDB.
+ */
 const unsigned char scsi_command_size[8] =
 {
-	6, 10, 10, 12,
+	6, 10, 10, 16,
 	16, 12, 10, 10
 };
 
==== //depot/pub/linux/drivers/scsi/scsi.c#2 - /home/bharrosh/p4.local/pub/linux/drivers/scsi/scsi.c ====
diff -Nup /tmp/tmp.12171.1 /home/bharrosh/p4.local/pub/linux/drivers/scsi/scsi.c -L a/drivers/scsi/scsi.c -L b/drivers/scsi/scsi.c
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -107,11 +107,14 @@ const char *const scsi_device_types[MAX_
 	"Optical Device ",
 	"Medium Changer ",
 	"Communications ",
-	"Unknown ",
-	"Unknown ",
+	"ASC IT8 ",
+	"ASC IT8 ",
 	"RAID ",
 	"Enclosure ",
 	"Direct-Access-RBC",
+	"Optical card ",
+	"Bridge controller",
+	"Object storage ",
 };
 EXPORT_SYMBOL(scsi_device_types);
 
@@ -456,6 +459,7 @@ int scsi_dispatch_cmd(struct scsi_cmnd *
 	unsigned long flags = 0;
 	unsigned long timeout;
 	int rtn = 0;
+	int cdb_size;
 
 	/* check if the device is still usable */
 	if (unlikely(cmd->device->sdev_state == SDEV_DEL)) {
@@ -537,9 +541,11 @@ int scsi_dispatch_cmd(struct scsi_cmnd *
 	 * Before we queue this command, check if the command
 	 * length exceeds what the host adapter can handle.
 	 */
-	if (CDB_SIZE(cmd) > cmd->device->host->max_cmd_len) {
-		SCSI_LOG_MLQUEUE(3,
-				printk("queuecommand : command too long.\n"));
+	cdb_size = cmd->varlen_cdb ? cmd->varlen_cdb_len : CDB_SIZE(cmd);
+	if (cdb_size > cmd->device->host->max_cmd_len) {
+		SCSI_LOG_MLQUEUE(0,
+			printk("queuecommand : command too long. cdb_size(%d) host->max_cmd_len(%d)\n",
+				cdb_size, cmd->device->host->max_cmd_len));
 		cmd->result = (DID_ABORT << 16);
 
 		scsi_done(cmd);
==== //depot/pub/linux/drivers/scsi/scsi_debug.c#2 - /home/bharrosh/p4.local/pub/linux/drivers/scsi/scsi_debug.c ====
diff -Nup /tmp/tmp.12171.2 /home/bharrosh/p4.local/pub/linux/drivers/scsi/scsi_debug.c -L a/drivers/scsi/scsi_debug.c -L b/drivers/scsi/scsi_debug.c
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -1794,8 +1794,11 @@ static int scsi_debug_slave_configure(st
 	if (SCSI_DEBUG_OPT_NOISE & scsi_debug_opts)
 		printk(KERN_INFO "scsi_debug: slave_configure <%u %u %u %u>\n",
 		       sdp->host->host_no, sdp->channel, sdp->id, sdp->lun);
-	if (sdp->host->max_cmd_len != SCSI_DEBUG_MAX_CMD_LEN)
+	if (sdp->host->max_cmd_len < SCSI_DEBUG_MAX_CMD_LEN) {
+		printk(KERN_INFO "scsi_debug: max_cmd_len(%d) < SCSI_DEBUG_MAX_CMD_LEN\n",
+			sdp->host->max_cmd_len);
 		sdp->host->max_cmd_len = SCSI_DEBUG_MAX_CMD_LEN;
+	}
 	devip = devInfoReg(sdp);
 	sdp->hostdata = devip;
 	if (sdp->host->cmd_per_lun)
==== //depot/pub/linux/drivers/scsi/scsi_lib.c#2 - /home/bharrosh/p4.local/pub/linux/drivers/scsi/scsi_lib.c ====
diff -Nup /tmp/tmp.12171.3 /home/bharrosh/p4.local/pub/linux/drivers/scsi/scsi_lib.c -L a/drivers/scsi/scsi_lib.c -L b/drivers/scsi/scsi_lib.c
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -169,7 +169,7 @@ int scsi_queue_insert(struct scsi_cmnd *
  * @buffer: data buffer
  * @bufflen: len of buffer
  * @sense: optional sense buffer
- * @timeout: request timeout in seconds
+ * @timeout: request timeout in jiffies
  * @retries: number of times to retry request
  * @flags: or into request flags;
  *
@@ -198,6 +198,11 @@ int scsi_execute(struct scsi_device *sde
 	req->timeout = timeout;
 	req->flags |= flags | REQ_BLOCK_PC | REQ_SPECIAL | REQ_QUIET;
 
+	/* This code path does not yet support VARLEN_CDB because we use
+	 * scsi_io_context for the allocation of the extra cmnd space (>16)
+	 */
+	BUG_ON(req->cmd[0] == VARLEN_CDB);
+
 	/*
 	 * head injection *required* here otherwise quiesce won't work
 	 */
@@ -235,6 +240,10 @@ int scsi_execute_req(struct scsi_device
 EXPORT_SYMBOL(scsi_execute_req);
 
 struct scsi_io_context {
+	unsigned short varlen_cdb_len;
+	unsigned char varlen_cdb[SCSI_MAX_VARLEN_CDB_LEN];
+	struct request *bidi_read_req; /* holds the read mapping */
+
 	void *data;
 	void (*done)(void *data, char *sense, int result, int resid);
 	char sense[SCSI_SENSE_BUFFERSIZE];
@@ -249,6 +258,10 @@ static void scsi_end_async(struct reques
 	if (sioc->done)
 		sioc->done(sioc->data, sioc->sense, req->errors, req->data_len);
 
+	if (sioc->bidi_read_req) {
+		__blk_put_request(sioc->bidi_read_req->q, sioc->bidi_read_req);
+	}
+
 	kmem_cache_free(scsi_io_context_cache, sioc);
 	__blk_put_request(req->q, req);
 }
@@ -375,40 +388,81 @@ free_bios:
  * @buffer: data buffer (this can be a kernel buffer or scatterlist)
  * @bufflen: len of buffer
  * @use_sg: if buffer is a scatterlist this is the number of elements
- * @timeout: request timeout in seconds
+ * @timeout: request timeout in jiffies
  * @retries: number of times to retry request
- * @flags: or into request flags
+ * @gfp: gfp allocation flags
  **/
 int scsi_execute_async(struct scsi_device *sdev, const unsigned char *cmd,
 		       int cmd_len, int data_direction, void *buffer, unsigned bufflen,
 		       int use_sg, int timeout, int retries, void *privdata,
 		       void (*done)(void *, char *, int, int), gfp_t gfp)
 {
+	struct scsi_cmnd_buff buff;
+
+	buff.use_sg = use_sg;
+	buff.request_buffer = buffer;
+	buff.request_bufflen = bufflen;
+
+	return scsi_execute_bidi_async(
+		sdev, cmd, cmd_len, data_direction,
+		&buff, NULL,
+		timeout, retries, privdata, done, gfp );
+}
+EXPORT_SYMBOL_GPL(scsi_execute_async);
+
+/**
+ * scsi_execute_bidi_async - insert bidi request, don't wait
+ * @sdev: scsi device
+ * @cmd: scsi command
+ * @cmd_len: length of scsi cdb
+ * @data_direction: data direction
+ * @bidi_write_buff.buffer: data buffer (this can be a kernel buffer or scatterlist)
+ * @bidi_write_buff.bufflen: len of buffer
+ * @bidi_write_buff.use_sg: if buffer is a scatterlist this is the number of elements
+ * @bidi_read_buff: same as above bidi_write_buff but for the bidi read part
+ * @timeout: request timeout in jiffies
+ * @retries: number of times to retry request
+ * @privdata: user data passed to done function
+ * @done: pointer to done function called at io completion.
+ *	signature: void done(void *user_data, char *sence, int errors, int data_bytes_advanced)
+ * @gfp: gfp allocation flags
+ **/
+int scsi_execute_bidi_async(struct scsi_device *sdev,
+	const unsigned char *cmd, int cmd_len, int data_direction,
+	struct scsi_cmnd_buff *buff, struct scsi_cmnd_buff *bidi_read_buff,
+	int timeout, int retries, void *privdata,
+	void (*done)(void *, char *, int, int),
+	gfp_t gfp)
+{
 	struct request *req;
 	struct scsi_io_context *sioc;
 	int err = 0;
-	int write = (data_direction == DMA_TO_DEVICE);
+	int write = ((data_direction == DMA_TO_DEVICE) || (data_direction == DMA_BIDIRECTIONAL));
 
 	sioc = kmem_cache_alloc(scsi_io_context_cache, gfp);
 	if (!sioc)
 		return DRIVER_ERROR << 24;
 	memset(sioc, 0, sizeof(*sioc));
 
-	req = blk_get_request(sdev->request_queue, write, gfp);
+	req = blk_get_request(sdev->request_queue, write ? WRITE : READ, gfp);
 	if (!req)
 		goto free_sense;
 	req->flags |= REQ_BLOCK_PC | REQ_QUIET;
 
-	if (use_sg)
-		err = scsi_req_map_sg(req, buffer, use_sg, bufflen, gfp);
-	else if (bufflen)
-		err = blk_rq_map_kern(req->q, req, buffer, bufflen, gfp);
+	if (buff->use_sg)
+		err = scsi_req_map_sg(req, buff->request_buffer, buff->use_sg, buff->request_bufflen, gfp);
+	else if (buff->request_bufflen)
+		err = blk_rq_map_kern(req->q, req, buff->request_buffer, buff->request_bufflen, gfp);
 
 	if (err)
 		goto free_req;
 
-	req->cmd_len = cmd_len;
+	req->cmd_len = min(cmd_len, MAX_COMMAND_SIZE);
 	memcpy(req->cmd, cmd, req->cmd_len);
+	BUG_ON( (cmd[0]==VARLEN_CDB) && (scsi_varlen_cdb_length((void*)cmd) > cmd_len) );
+	BUG_ON(sizeof(sioc->varlen_cdb) < cmd_len);
+	sioc->varlen_cdb_len = cmd_len;
+	memcpy(sioc->varlen_cdb, cmd, cmd_len);
 	req->sense = sioc->sense;
 	req->sense_len = 0;
 	req->timeout = timeout;
@@ -418,6 +472,32 @@ int scsi_execute_async(struct scsi_devic
 	sioc->data = privdata;
 	sioc->done = done;
 
+	/* bidi handling */
+	BUG_ON( ((data_direction == DMA_BIDIRECTIONAL) && !bidi_read_buff) );
+	BUG_ON( ((data_direction != DMA_BIDIRECTIONAL) && bidi_read_buff) );
+	if (bidi_read_buff) {
+		struct request *bidi_req = blk_get_request(sdev->request_queue, READ, gfp);
+		if (!bidi_req)
+			goto free_req;
+		bidi_req->flags |= REQ_BLOCK_PC | REQ_QUIET;
+		/* map in the read data */
+		if (bidi_read_buff->use_sg) {
+			err = scsi_req_map_sg(bidi_req, bidi_read_buff->request_buffer,
+					bidi_read_buff->use_sg,
+					bidi_read_buff->request_bufflen, gfp);
+		} else {
+			BUG_ON(!bidi_read_buff->request_bufflen);
+			err = blk_rq_map_kern(bidi_req->q, bidi_req, bidi_read_buff->request_buffer,
+					bidi_read_buff->request_bufflen, gfp);
+		}
+
+		if (err) {
+			blk_put_request(bidi_req);
+			goto free_req;
+		}
+		sioc->bidi_read_req = bidi_req;
+	}
+
 	blk_execute_rq_nowait(req->q, NULL, req, 1, scsi_end_async);
 	return 0;
 
@@ -427,7 +507,63 @@ free_sense:
 	kfree(sioc);
 	return DRIVER_ERROR << 24;
 }
-EXPORT_SYMBOL_GPL(scsi_execute_async);
+EXPORT_SYMBOL_GPL(scsi_execute_bidi_async);
+
+struct scsi_execute_bidi_done_t {
+	struct completion *waiting;
+	char* sense;
+	int errors;
+} ;
+
+static void scsi_execute_bidi_done(void *user_data, char *sense, int errors, int data_len)
+{
+	struct scsi_execute_bidi_done_t *sebd = user_data;
+	sebd->errors = errors;
+	if (sebd->sense) {
+		memcpy(sebd->sense, sense, SCSI_SENSE_BUFFERSIZE);
+	}
+	complete(sebd->waiting);
+}
+
+/**
+ * scsi_execute_bidi - insert a bidi request
+ * @sdev: scsi device
+ * @cmd: scsi command
+ * @cmd_len: length of scsi cdb
+ * @data_direction: data direction
+ * @bidi_write_buff.buffer: data buffer (this can be a kernel buffer or scatterlist)
+ * @bidi_write_buff.bufflen: len of buffer
+ * @bidi_write_buff.use_sg: if buffer is a scatterlist this is the number of elements
+ * @bidi_read_buff: same as above bidi_write_buff but for the bidi read part. can be NULL
+ * @sense: optional sense buffer
+ * @timeout: request timeout in jiffies
+ * @retries: number of times to retry request
+ **/
+int scsi_execute_bidi(struct scsi_device *sdev,
+	const unsigned char *cmd, int cmd_len, int data_direction,
+	struct scsi_cmnd_buff *buff, struct scsi_cmnd_buff *bidi_read_buff,
+	char* sense, int timeout, int retries)
+{
+	int ret;
+	struct scsi_execute_bidi_done_t sebd;
+	DECLARE_COMPLETION_ONSTACK(wait);
+	sebd.sense = sense;
+	sebd.waiting = &wait;
+
+	ret = scsi_execute_bidi_async(sdev,
+			cmd, cmd_len, data_direction,
+			buff, bidi_read_buff,
+			timeout, retries,
+			&sebd, scsi_execute_bidi_done,
+			__GFP_WAIT);
+	if (ret)
+		return ret;
+
+	wait_for_completion(&wait);
+
+	return sebd.errors;
+}
+EXPORT_SYMBOL_GPL(scsi_execute_bidi);
 
 /*
  * Function: scsi_init_cmd_errh()
@@ -657,6 +793,8 @@ static struct scsi_cmnd *scsi_end_reques
 	if (end_that_request_chunk(req, uptodate, bytes)) {
 		int leftover = (req->hard_nr_sectors << 9);
 
+		BUG_ON( cmd->sc_data_direction == DMA_BIDIRECTIONAL );
+
 		if (blk_pc_request(req))
 			leftover = req->data_len;
 
@@ -679,6 +817,13 @@ static struct scsi_cmnd *scsi_end_reques
 
 	add_disk_randomness(req->rq_disk);
 
+	/* end bidi read chunk */
+	if (is_bidi_cmnd(cmd)) {
+		/* read was good the error is carried on main req (even for a read error) */
+		end_that_request_chunk( cmd->bidi_read_sgl.request, 1, cmd->bidi_read_sgl.request->data_len);
+		end_that_request_last(cmd->bidi_read_sgl.request, 1);
+	}
+
 	spin_lock_irqsave(q->queue_lock, flags);
 	if (blk_rq_tagged(req))
 		blk_queue_end_tag(q, req);
@@ -693,34 +838,34 @@ static struct scsi_cmnd *scsi_end_reques
 	return NULL;
 }
 
-static struct scatterlist *scsi_alloc_sgtable(struct scsi_cmnd *cmd, gfp_t gfp_mask)
+static struct scatterlist *scsi_alloc_sgtable(struct scsi_cmnd_sgl *scsgl, gfp_t gfp_mask)
 {
 	struct scsi_host_sg_pool *sgp;
 	struct scatterlist *sgl;
 
-	BUG_ON(!cmd->use_sg);
+	BUG_ON(!scsgl->use_sg);
 
-	switch (cmd->use_sg) {
+	switch (scsgl->use_sg) {
 	case 1 ... 8:
-		cmd->sglist_len = 0;
+		scsgl->sglist_len = 0;
 		break;
 	case 9 ... 16:
-		cmd->sglist_len = 1;
+		scsgl->sglist_len = 1;
 		break;
 	case 17 ... 32:
-		cmd->sglist_len = 2;
+		scsgl->sglist_len = 2;
 		break;
 #if (SCSI_MAX_PHYS_SEGMENTS > 32)
 	case 33 ... 64:
-		cmd->sglist_len = 3;
+		scsgl->sglist_len = 3;
 		break;
 #if (SCSI_MAX_PHYS_SEGMENTS > 64)
 	case 65 ... 128:
-		cmd->sglist_len = 4;
+		scsgl->sglist_len = 4;
 		break;
 #if (SCSI_MAX_PHYS_SEGMENTS > 128)
 	case 129 ... 256:
-		cmd->sglist_len = 5;
+		scsgl->sglist_len = 5;
 		break;
 #endif
 #endif
@@ -729,7 +874,7 @@ static struct scatterlist *scsi_alloc_sg
 		return NULL;
 	}
 
-	sgp = scsi_sg_pools + cmd->sglist_len;
+	sgp = scsi_sg_pools + scsgl->sglist_len;
 	sgl = mempool_alloc(sgp->pool, gfp_mask);
 	return sgl;
 }
@@ -772,6 +917,13 @@ static void scsi_release_buffers(struct
 	 */
 	cmd->request_buffer = NULL;
 	cmd->request_bufflen = 0;
+
+	if (is_bidi_cmnd(cmd)) {
+		if (cmd->bidi_read_sgl.use_sg)
+			scsi_free_sgtable(cmd->bidi_read_sgl.request_buffer, cmd->bidi_read_sgl.sglist_len);
+		cmd->bidi_read_sgl.request_buffer = NULL ;
+		cmd->bidi_read_sgl.request_bufflen = 0 ;
+	}
 }
 
 /*
@@ -859,6 +1011,8 @@ void scsi_io_completion(struct scsi_cmnd
 	if (scsi_end_request(cmd, 1, good_bytes, result == 0) == NULL)
 		return;
 
+	BUG_ON(is_bidi_cmnd(cmd));
+
 	/* good_bytes = 0, or (inclusive) there were leftovers and
 	 * result = 0, so scsi_end_request couldn't retry.
 	 */
@@ -968,80 +1122,129 @@ void scsi_io_completion(struct scsi_cmnd
 EXPORT_SYMBOL(scsi_io_completion);
 
 /*
- * Function: scsi_init_io()
+ * Function: scsi_init_sgl()
  *
- * Purpose: SCSI I/O initialize function.
+ * Purpose: SCSI I/O initialize helper.
+ *          maps the request buffers into the given scsgl.
  *
- * Arguments: cmd - Command descriptor we wish to initialize
+ * Arguments: scsgl - Command sgl we wish to initialize
  *
  * Returns: 0 on success
  *		BLKPREP_DEFER if the failure is retryable
  *		BLKPREP_KILL if the failure is fatal
  */
-static int scsi_init_io(struct scsi_cmnd *cmd)
+static int scsi_init_sgl(struct scsi_cmnd_sgl *scsgl)
 {
-	struct request *req = cmd->request;
+	struct request *req = scsgl->request;
 	struct scatterlist *sgpnt;
 	int count;
 
 	/*
-	 * if this is a rq->data based REQ_BLOCK_PC, setup for a non-sg xfer
-	 */
-	if ((req->flags & REQ_BLOCK_PC) && !req->bio) {
-		cmd->request_bufflen = req->data_len;
-		cmd->request_buffer = req->data;
-		req->buffer = req->data;
-		cmd->use_sg = 0;
-		return 0;
-	}
-
-	/*
 	 * we used to not use scatter-gather for single segment request,
 	 * but now we do (it makes highmem I/O easier to support without
 	 * kmapping pages)
 	 */
-	cmd->use_sg = req->nr_phys_segments;
+	scsgl->use_sg = req->nr_phys_segments;
 
 	/*
 	 * if sg table allocation fails, requeue request later.
 	 */
-	sgpnt = scsi_alloc_sgtable(cmd, GFP_ATOMIC);
+	sgpnt = scsi_alloc_sgtable(scsgl, GFP_ATOMIC);
 	if (unlikely(!sgpnt)) {
 		scsi_unprep_request(req);
 		return BLKPREP_DEFER;
 	}
 
-	cmd->request_buffer = (char *) sgpnt;
-	cmd->request_bufflen = req->nr_sectors << 9;
+	scsgl->request_buffer = (char *) sgpnt;
+	scsgl->request_bufflen = req->nr_sectors << 9;
 	if (blk_pc_request(req))
-		cmd->request_bufflen = req->data_len;
+		scsgl->request_bufflen = req->data_len;
 	req->buffer = NULL;
 
 	/*
 	 * Next, walk the list, and fill in the addresses and sizes of
 	 * each segment.
 	 */
-	count = blk_rq_map_sg(req->q, req, cmd->request_buffer);
+	count = blk_rq_map_sg(req->q, req, scsgl->request_buffer);
 
 	/*
 	 * mapped well, send it off
 	 */
-	if (likely(count <= cmd->use_sg)) {
-		cmd->use_sg = count;
+	if (likely(count <= scsgl->use_sg)) {
+		scsgl->use_sg = count;
 		return 0;
 	}
 
 	printk(KERN_ERR "Incorrect number of segments after building list\n");
-	printk(KERN_ERR "counted %d, received %d\n", count, cmd->use_sg);
+	printk(KERN_ERR "counted %d, received %d\n", count, scsgl->use_sg);
 	printk(KERN_ERR "req nr_sec %lu, cur_nr_sec %u\n", req->nr_sectors,
 			req->current_nr_sectors);
 
-	/* release the command and kill it */
-	scsi_release_buffers(cmd);
-	scsi_put_command(cmd);
 	return BLKPREP_KILL;
 }
 
+/*
+ * Function: scsi_init_io()
+ *
+ * Purpose: SCSI I/O initialize function. for both main and optional bidi buffer
+ *
+ * Arguments: cmd - Command descriptor we wish to initialize
+ *
+ * Returns: 0 on success
+ *		BLKPREP_DEFER if the failure is retryable
+ *		BLKPREP_KILL if the failure is fatal
+ */
+static int scsi_init_io(struct scsi_cmnd *cmd)
+{
+	struct scsi_cmnd_sgl main_sgl;
+	struct request *req = cmd->request;
+	struct scsi_io_context* sioc = req->end_io_data;
+	int error;
+
+	/*
+	 * if this is a rq->data based REQ_BLOCK_PC, setup for a non-sg xfer
+	 */
+	if ((req->flags & REQ_BLOCK_PC) && !req->bio) {
+		cmd->request_bufflen = req->data_len;
+		cmd->request_buffer = req->data;
+		req->buffer = req->data;
+		cmd->use_sg = 0;
+		return 0;
+	}
+
+	main_sgl.request = cmd->request;
+
+	error = scsi_init_sgl(&main_sgl);
+	if (error)
+		goto err_exit;
+
+	cmd->request_bufflen = main_sgl.request_bufflen;
+	cmd->request_buffer = main_sgl.request_buffer;
+	cmd->use_sg = main_sgl.use_sg;
+	cmd->sglist_len = main_sgl.sglist_len;
+
+	if (sioc) {
+		if (sioc->bidi_read_req) {
+			cmd->bidi_read_sgl.request = sioc->bidi_read_req;
+			error = scsi_init_sgl(&cmd->bidi_read_sgl);
+			if (error) {
+				scsi_release_buffers(cmd);
+				goto err_exit;
+			}
+		}
+	}
+
+	return 0 ;
+
+err_exit:
+	if (error == BLKPREP_KILL) {
+		/* release the command and kill it */
+		scsi_release_buffers(cmd);
+		scsi_put_command(cmd);
+	}
+	return error;
+}
+
 static int scsi_issue_flush_fn(request_queue_t *q, struct gendisk *disk,
 			       sector_t *error_sector)
 {
@@ -1073,6 +1276,7 @@ static void scsi_blk_pc_done(struct scsi
 static void scsi_setup_blk_pc_cmnd(struct scsi_cmnd *cmd)
 {
 	struct request *req = cmd->request;
+	struct scsi_io_context* sioc = req->end_io_data;
 
 	BUG_ON(sizeof(req->cmd) > sizeof(cmd->cmnd));
 	memcpy(cmd->cmnd, req->cmd, sizeof(cmd->cmnd));
@@ -1088,6 +1292,19 @@ static void scsi_setup_blk_pc_cmnd(struc
 	cmd->allowed = req->retries;
 	cmd->timeout_per_command = req->timeout;
 	cmd->done = scsi_blk_pc_done;
+
+	if (sioc) {
+		/* setup variable length cdb pointer */
+		if (sioc->varlen_cdb[0] == VARLEN_CDB) {
+			cmd->varlen_cdb_len = sioc->varlen_cdb_len;
+			cmd->varlen_cdb = sioc->varlen_cdb;
+		}
+
+		/* fix for bi-directional DMA flag */
+		if (sioc->bidi_read_req) {
+			cmd->sc_data_direction = DMA_BIDIRECTIONAL;
+		}
+	}
 }
 
 static int scsi_prep_fn(struct request_queue *q, struct request *req)
==== //depot/pub/linux/drivers/scsi/scsi_scan.c#2 - /home/bharrosh/p4.local/pub/linux/drivers/scsi/scsi_scan.c ====
diff -Nup /tmp/tmp.12171.4 /home/bharrosh/p4.local/pub/linux/drivers/scsi/scsi_scan.c -L a/drivers/scsi/scsi_scan.c -L b/drivers/scsi/scsi_scan.c
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -674,6 +674,7 @@ static int scsi_add_lun(struct scsi_devi
 	case TYPE_COMM:
 	case TYPE_RAID:
 	case TYPE_RBC:
+	case TYPE_OSD:
 		sdev->writeable = 1;
 		break;
 	case TYPE_WORM:
==== //depot/pub/linux/include/scsi/scsi.h#1 - /home/bharrosh/p4.local/pub/linux/include/scsi/scsi.h ====
diff -Nup /tmp/tmp.12171.5 /home/bharrosh/p4.local/pub/linux/include/scsi/scsi.h -L a/include/scsi/scsi.h -L b/include/scsi/scsi.h
--- a/include/scsi/scsi.h
+++ b/include/scsi/scsi.h
@@ -28,7 +28,7 @@ extern const unsigned char scsi_command_
  * SCSI device types
  */
 
-#define MAX_SCSI_DEVICE_CODE 15
+#define MAX_SCSI_DEVICE_CODE 18
 extern const char *const scsi_device_types[MAX_SCSI_DEVICE_CODE];
 
 /*
@@ -103,6 +103,7 @@ extern const char *const scsi_device_typ
 #define MODE_SENSE_10 0x5a
 #define PERSISTENT_RESERVE_IN 0x5e
 #define PERSISTENT_RESERVE_OUT 0x5f
+#define VARLEN_CDB 0x7f
 #define REPORT_LUNS 0xa0
 #define MOVE_MEDIUM 0xa5
 #define EXCHANGE_MEDIUM 0xa6
@@ -223,6 +224,7 @@ static inline int scsi_status_is_good(in
 #define TYPE_RAID 0x0c
 #define TYPE_ENCLOSURE 0x0d /* Enclosure Services Device */
 #define TYPE_RBC 0x0e
+#define TYPE_OSD 0x11
 #define TYPE_NO_LUN 0x7f
 
 /*
==== //depot/pub/linux/include/scsi/scsi_cmnd.h#2 - /home/bharrosh/p4.local/pub/linux/include/scsi/scsi_cmnd.h ====
diff -Nup /tmp/tmp.12171.6 /home/bharrosh/p4.local/pub/linux/include/scsi/scsi_cmnd.h -L a/include/scsi/scsi_cmnd.h -L b/include/scsi/scsi_cmnd.h
--- a/include/scsi/scsi_cmnd.h
+++ b/include/scsi/scsi_cmnd.h
@@ -5,11 +5,52 @@
 #include <linux/list.h>
 #include <linux/types.h>
 #include <linux/timer.h>
+#include <scsi/scsi_device.h>
 
 struct request;
 struct scatterlist;
 struct scsi_device;
 
+#define SCSI_MAX_VARLEN_CDB_LEN 260
+
+/* defined in T10 SCSI Primary Commands-3 */
+struct scsi_varlen_cdb_hdr {
+	unsigned char opcode; /* opcode always == VARLEN_CDB */
+	unsigned char control;
+	unsigned char misc[5];
+	unsigned char additional_cdb_length; /* total cdb length - 8 */
+	unsigned char service_action[2];
+	/* service specific Data follows */
+};
+
+static inline int
+scsi_varlen_cdb_length(void *hdr)
+{
+	return ((struct scsi_varlen_cdb_hdr*)hdr)->additional_cdb_length + 8;
+}
+
+/*
+ * This structure maps data buffers into a scatter-gather list for DMA purposes.
+ * Embedded in struct scsi_cmnd.
+ *
+ * FIXME: We currently embed this structure in scsi_cmnd only for
+ * bidi read buffers. Buffers for uni-directional commands and write
+ * buffers of bidi commands are mapped in a backward compatible way by an
+ * equivalent set of fields, scattered in the scsi_cmnd.
+ * These should be incorporated into an instance of scsi_cmn_sgl.
+ * This will require a major rework of most scsi LLDDs.
+ *
+ * We need a pointer to the request structure for the req->bio mapping.
+ * when struct request supports bidi transfers this pointer should go away.
+ */
+struct scsi_cmnd_sgl {
+	unsigned short use_sg; /* Number of pieces of scatter-gather */
+	unsigned short sglist_len; /* size of malloc'd scatter-gather list */
+	void *request_buffer; /* Actual requested buffer */
+	unsigned request_bufflen; /* Actual request size */
+
+	struct request *request; /* The request we are working on from block layer*/
+};
 
 /* embedded in scsi_cmnd */
 struct scsi_pointer {
@@ -57,12 +98,15 @@ struct scsi_cmnd {
 	int allowed;
 	int timeout_per_command;
 
-	unsigned char cmd_len;
+	unsigned char cmd_len; /* fixed cdb command length (<= 16) */
+	unsigned short varlen_cdb_len; /* length of varlen_cdb buffer */
 	enum dma_data_direction sc_data_direction;
 
 	/* These elements define the operation we are about to perform */
 #define MAX_COMMAND_SIZE 16
 	unsigned char cmnd[MAX_COMMAND_SIZE];
+	unsigned char *varlen_cdb; /* an optional variable-length cdb.
+	                              first 16 bytes are copied also into cmnd[] */
 	unsigned request_bufflen; /* Actual request size */
 
 	struct timer_list eh_timeout; /* Used to time out the command. */
@@ -116,8 +160,16 @@ struct scsi_cmnd {
 
 	unsigned char tag; /* SCSI-II queued command tag */
 	unsigned long pid; /* Process ID, starts at 0. Unique per host. */
+
+	/*
+	 * map read buffers of bi-directional commands
+	 * bidi_read_sgl.request != NULL iff sc_data_direction==DMA_BIDIRECTIONAL
+	 */
+	struct scsi_cmnd_sgl bidi_read_sgl;
 };
 
+#define is_bidi_cmnd(cmd) (cmd->bidi_read_sgl.request != NULL)
+
 /*
  * These are the values that scsi_cmd->state can take.
  */
@@ -131,6 +183,30 @@ struct scsi_cmnd {
 #define SCSI_STATE_BHQUEUE 0x100a
 #define SCSI_STATE_MLQUEUE 0x100b
 
+/*
+ * these inline helpers take into account the scsi cmnd's data direction
+ * to correctly find sg maps for uni-directional and bi-directional commands
+ */
+static inline void
+scsi_get_out_buff(struct scsi_cmnd *sc, struct scsi_cmnd_buff *scb)
+{
+	scb->use_sg = sc->use_sg;
+	scb->request_buffer = sc->request_buffer;
+	scb->request_bufflen = sc->request_bufflen;
+}
+
+static inline void
+scsi_get_in_buff(struct scsi_cmnd *sc, struct scsi_cmnd_buff *scb)
+{
+	if (!is_bidi_cmnd(sc)) {
+		scsi_get_out_buff(sc, scb);
+		return;
+	}
+
+	scb->use_sg = sc->bidi_read_sgl.use_sg;
+	scb->request_buffer = sc->bidi_read_sgl.request_buffer;
+	scb->request_bufflen = sc->bidi_read_sgl.request_bufflen;
+}
 
 extern struct scsi_cmnd *scsi_get_command(struct scsi_device *, gfp_t);
 extern void scsi_put_command(struct scsi_cmnd *);
==== //depot/pub/linux/include/scsi/scsi_device.h#1 - /home/bharrosh/p4.local/pub/linux/include/scsi/scsi_device.h ====
diff -Nup /tmp/tmp.12171.7 /home/bharrosh/p4.local/pub/linux/include/scsi/scsi_device.h -L a/include/scsi/scsi_device.h -L b/include/scsi/scsi_device.h
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -198,6 +198,13 @@ static inline struct scsi_target *scsi_t
 #define starget_printk(prefix, starget, fmt, a...) \
 	dev_printk(prefix, &(starget)->dev, fmt, ##a)
 
+/* used by APIs as a shorthand for <use_sg, buffer, bufflen> */
+struct scsi_cmnd_buff {
+	unsigned short use_sg; /* Number of pieces of scatter-gather */
+	void *request_buffer; /* if use_sg==0 requested buffer else an sg list */
+	unsigned request_bufflen; /* Actual request size */
+};
+
 extern struct scsi_device *__scsi_add_device(struct Scsi_Host *,
 		uint, uint, uint, void *hostdata);
 extern int scsi_add_device(struct Scsi_Host *host, uint channel,
@@ -297,6 +304,18 @@ extern int scsi_execute_async(struct scs
 			      int timeout, int retries, void *privdata,
 			      void (*done)(void *, char *, int, int),
 			      gfp_t gfp);
+extern int scsi_execute_bidi_async(struct scsi_device *sdev,
+	const unsigned char *cmd, int cmd_len, int data_direction,
+	struct scsi_cmnd_buff *buff,
+	struct scsi_cmnd_buff *bidi_read_buff,
+	int timeout, int retries, void *privdata,
+	void (*done)(void *, char *, int, int),
+	gfp_t gfp);
+extern int scsi_execute_bidi(struct scsi_device *sdev,
+	const unsigned char *cmd, int cmd_len, int data_direction,
+	struct scsi_cmnd_buff *buff,
+	struct scsi_cmnd_buff *bidi_read_buff,
+	char* sense, int timeout, int retries);
 
 static inline void scsi_device_reprobe(struct scsi_device *sdev)
 {
==== //depot/pub/linux/include/scsi/scsi_host.h#2 - /home/bharrosh/p4.local/pub/linux/include/scsi/scsi_host.h ====
diff -Nup /tmp/tmp.12171.8 /home/bharrosh/p4.local/pub/linux/include/scsi/scsi_host.h -L a/include/scsi/scsi_host.h -L b/include/scsi/scsi_host.h
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -500,13 +500,11 @@ struct Scsi_Host {
 	/*
 	 * The maximum length of SCSI commands that this host can accept.
 	 * Probably 12 for most host adapters, but could be 16 for others.
+	 * or 260 if the driver supports variable length cdbs.
 	 * For drivers that don't set this field, a value of 12 is
-	 * assumed. I am leaving this as a number rather than a bit
-	 * because you never know what subsequent SCSI standards might do
-	 * (i.e. could there be a 20 byte or a 24-byte command a few years
-	 * down the road?).
+	 * assumed.
 	 */
-	unsigned char max_cmd_len;
+	unsigned short max_cmd_len;
 
 	int this_id;
 	int can_queue;
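
For reference, the CDB sizing logic the patch relies on can be tried out in user space with the sketch below. It assumes the fixed size is looked up by the opcode's group code (its top three bits) in the patched scsi_command_size table, and that for opcode 0x7f byte 7 of the CDB holds the additional length (total length minus 8), as in struct scsi_varlen_cdb_hdr above. The names cdb_group_size, cdb_fixed_size() and cdb_total_size() are made up for this illustration and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

#define VARLEN_CDB 0x7f

/* Same values as the patched scsi_command_size[]: group 3 now reports 16. */
static const unsigned char cdb_group_size[8] = {
	6, 10, 10, 16,
	16, 12, 10, 10
};

/* Fixed CDB size, selected by the group code in the top three opcode bits. */
static int cdb_fixed_size(unsigned char opcode)
{
	return cdb_group_size[(opcode >> 5) & 7];
}

/*
 * Total CDB length. For VARLEN_CDB, byte 7 is additional_cdb_length
 * (opcode, control and misc[5] occupy bytes 0..6), so total = byte 7 + 8.
 */
static int cdb_total_size(const unsigned char *cdb, size_t buf_len)
{
	assert(cdb != NULL && buf_len >= 1);
	if (cdb[0] != VARLEN_CDB)
		return cdb_fixed_size(cdb[0]);
	assert(buf_len >= 8);
	return cdb[7] + 8;
}
```

With this layout the largest encodable CDB is 255 + 8 = 263 bytes, slightly above the SCSI_MAX_VARLEN_CDB_LEN of 260 used in the patch, which is why the BUG_ON() size checks in scsi_execute_bidi_async() matter.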