On Sun, Jan 20 2008 at 22:59 +0200, James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
> On Sun, 2008-01-20 at 21:01 +0100, Jens Axboe wrote:
>> On Sun, Jan 20 2008, Jens Axboe wrote:
>>> On Sun, Jan 20 2008, Boaz Harrosh wrote:
>>>> On Sun, Jan 20 2008 at 21:29 +0200, Jens Axboe <jens.axboe@xxxxxxxxxx> wrote:
>>>>> On Sun, Jan 20 2008, James Bottomley wrote:
>>>>>> On Sun, 2008-01-20 at 21:18 +0200, Boaz Harrosh wrote:
>>>>>>> On Tue, Jan 15 2008 at 19:52 +0200, James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
>>>>>>>> this patch depends on the sg branch of the block tree
>>>>>>>>
>>>>>>>> James
>>>>>>>>
>>>>>>>> ---
>>>>>>>> From: James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>
>>>>>>>> Date: Tue, 15 Jan 2008 11:11:46 -0600
>>>>>>>> Subject: remove use_sg_chaining
>>>>>>>>
>>>>>>>> With the sg table code, every SCSI driver is now either chain capable
>>>>>>>> or broken, so there's no need to have a check in the host template.
>>>>>>>>
>>>>>>>> Also tidy up the code by moving the scatterlist size defines into the
>>>>>>>> SCSI includes and permit the last entry of the scatterlist pools not
>>>>>>>> to be a power of two.
>>>>>>>> ---
>>>>>>> I have a theoretical problem that BUGed me from the beginning.
>>>>>>>
>>>>>>> Could it happen that a memory-critical IO (one that is needed to free
>>>>>>> memory) be collected into a large sg-chained IO, and the multiple
>>>>>>> sg-pool allocations fail, thus deadlocking on out-of-memory? Is there
>>>>>>> a mechanism in place that will split large IOs into smaller chunks in
>>>>>>> the event of an out-of-memory condition in prep_fn?
>>>>>>>
>>>>>>> Is it possible to call blk_rq_map_sg() with less than what is present
>>>>>>> in the request, to map only the starting portion?
>>>>>> Obviously, that's why I was worrying about mempool size and default
>>>>>> blocks a while ago.
>>>>>>
>>>>>> However, the deadlock only occurs if the device is swap or backing a
>>>>>> filesystem with memory mapped files. The use cases for this are really
>>>>>> tapes and other entities that need huge buffers. That's why we're
>>>>>> keeping the system sector size at 1024 unless you alter it through sysfs
>>>>>> (here gun, there foot ...)
>>>>> Alternatively (and much safer, imho), we allow blk_rq_map_sg() return
>>>>> smaller than nr_phys_segments and just ensure that the request is
>>>>> continued nicely through the normal 'request if residual' logic.
>>>>>
>>>> That's a great idea. I will Q it on my todo list. Thanks
>>> ok good, thanks :-)
>> btw, the above is full of typos, my apologies. it should read "requeue
>> if residual", but I guess you already guessed as much.
>
> Something like ...
>
> It looks to me like it would make sense to have something like a
> BLKPREP_SGALLOCFAIL return so the block layer can do this for us ...
> Alternatively, we'll have to find a way of adjusting the sector count as
> it goes into the ULD prep functions.
>
> James

By luck this is no problem, because it happens exactly before the ULD
actually prepares the command. sd and sr are already doing these
adjustments based on bufflen. For BLOCK_PC we will need to fail, with
perhaps a new BLKPREP_SGALLOCFAIL like you said, and let the initiator
take care of it.

Boaz
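
To make the "map less than the whole request, then requeue the residual"
idea concrete, here is a toy userspace model in C. It is only a sketch and
not kernel code: every name in it (toy_pool, toy_request, toy_map_partial,
toy_run_request, SEGS_PER_CHUNK) is made up for illustration, and the
assumption that a completed pass hands its sg chunks straight back to the
pool is a simplification of how a mempool reserve behaves.

```c
#include <assert.h>

/* Toy model only: none of these names are kernel APIs. */

#define SEGS_PER_CHUNK 4        /* hypothetical sg entries per pool allocation */

struct toy_pool {
    int free_chunks;            /* chunks the pool can still hand out */
};

struct toy_request {
    int total_segs;             /* segments the request needs mapped */
    int done_segs;              /* segments already transferred */
};

/* Take one chunk from the pool; returns 1 on success, 0 when it is empty. */
static int toy_pool_alloc(struct toy_pool *pool)
{
    if (pool->free_chunks == 0)
        return 0;
    pool->free_chunks--;
    return 1;
}

/*
 * Map as many of the remaining segments as the pool allows.
 * Returns the number of segments mapped, which may be smaller than what
 * the request still needs, or 0 if not even one chunk was available.
 * *chunks_used reports how many chunks must be returned on completion.
 */
static int toy_map_partial(struct toy_pool *pool,
                           const struct toy_request *rq, int *chunks_used)
{
    int remaining = rq->total_segs - rq->done_segs;
    int mapped = 0;

    *chunks_used = 0;
    while (mapped < remaining && toy_pool_alloc(pool)) {
        int take = remaining - mapped;

        if (take > SEGS_PER_CHUNK)
            take = SEGS_PER_CHUNK;
        mapped += take;
        (*chunks_used)++;
    }
    return mapped;
}

/*
 * Drive a request to completion, requeueing the residual after each
 * partial pass. Returns the number of passes needed, or -1 if a pass
 * could not map anything (no forward progress is possible).
 */
static int toy_run_request(struct toy_pool *pool, struct toy_request *rq)
{
    int passes = 0;

    while (rq->done_segs < rq->total_segs) {
        int chunks;
        int mapped = toy_map_partial(pool, rq, &chunks);

        if (mapped == 0)
            return -1;                  /* nothing mapped: must fail */
        rq->done_segs += mapped;        /* the mapped part "completes" */
        pool->free_chunks += chunks;    /* completion frees its chunks */
        passes++;
    }
    assert(rq->done_segs == rq->total_segs);
    return passes;
}
```

With a pool of 2 chunks and a 20-segment request, toy_run_request() maps
8, 8 and then 4 segments, so it finishes in three passes instead of
failing outright. The -1 case, where not even one chunk can be had, is
roughly where a return like the proposed BLKPREP_SGALLOCFAIL would come in.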