On Sun, 2008-01-20 at 21:01 +0100, Jens Axboe wrote:
> On Sun, Jan 20 2008, Jens Axboe wrote:
> > On Sun, Jan 20 2008, Boaz Harrosh wrote:
> > > On Sun, Jan 20 2008 at 21:29 +0200, Jens Axboe <jens.axboe@xxxxxxxxxx> wrote:
> > > > On Sun, Jan 20 2008, James Bottomley wrote:
> > > >> On Sun, 2008-01-20 at 21:18 +0200, Boaz Harrosh wrote:
> > > >>> On Tue, Jan 15 2008 at 19:52 +0200, James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
> > > >>>> this patch depends on the sg branch of the block tree
> > > >>>>
> > > >>>> James
> > > >>>>
> > > >>>> ---
> > > >>>> From: James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>
> > > >>>> Date: Tue, 15 Jan 2008 11:11:46 -0600
> > > >>>> Subject: remove use_sg_chaining
> > > >>>>
> > > >>>> With the sg table code, every SCSI driver is now either chain capable
> > > >>>> or broken, so there's no need to have a check in the host template.
> > > >>>>
> > > >>>> Also tidy up the code by moving the scatterlist size defines into the
> > > >>>> SCSI includes and permit the last entry of the scatterlist pools not
> > > >>>> to be a power of two.
> > > >>>> ---
> > > >>> I have a theoretical problem that BUGed me from the beginning.
> > > >>>
> > > >>> Could it happen that a memory critical IO, (that is needed to free
> > > >>> memory), be collected into an sg-chained large IO, and the allocation
> > > >>> of the multiple sg-pool-allocations fail, thous dead locking on
> > > >>> out-of-memory? Is there a mechanism in place that will split large IO's
> > > >>> into smaller chunks in the event of out-of-memory condition in prep_fn?
> > > >>>
> > > >>> Is it possible to call blk_rq_map_sg() with less then what is present
> > > >>> at request to only map the starting portion?
> > > >> Obviously, that's why I was worrying about mempool size and default
> > > >> blocks a while ago.
> > > >>
> > > >> However, the deadlock only occurs if the device is swap or backing a
> > > >> filesystem with memory mapped files. The use cases for this are really
> > > >> tapes and other entities that need huge buffers. That's why we're
> > > >> keeping the system sector size at 1024 unless you alter it through sysfs
> > > >> (here gun, there foot ...)
> > > >
> > > > Alternatively (and much safer, imho), we allow blk_rq_map_sg() return
> > > > smaller than nr_phys_segments and just ensure that the request is
> > > > continued nicely through the normal 'request if residual' logic.
> > > >
> > > Thats a grate Idea. I will Q it on my todo list. Thanks
> >
> > ok good, thanks :-)
>
> btw, the above is full of typos, my apologies. it should read "requeue
> if residual", but I guess you already guessed as much. Something like ...

It looks to me like it would make sense to have something like a
BLKPREP_SGALLOCFAIL return so the block layer can do this for us ...
Alternatively, we'll have to find a way of adjusting the sector count as
it goes into the ULD prep functions.

James
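
For illustration, the "requeue if residual" idea from the thread can be
modelled in user space. This is only a toy sketch under stated
assumptions, not kernel code and not the patch discussed here:
model_request, model_map_sg() and model_run_request() are hypothetical
names, and the fixed pool_avail value stands in for however many sg
entries the pool could supply on each pass.

```c
/* User-space model only: none of these names are kernel APIs. */
#include <assert.h>

/* Hypothetical stand-in for a request: segments not yet transferred. */
struct model_request {
    unsigned int nr_segments;   /* segments still to be mapped */
    unsigned int passes;        /* how many times the request was issued */
};

/*
 * Hypothetical stand-in for a mapping call that may return fewer
 * entries than the request holds when the sg pool is short.
 * pool_avail is the number of sg entries the pool can supply now.
 */
static unsigned int model_map_sg(const struct model_request *rq,
                                 unsigned int pool_avail)
{
    return rq->nr_segments < pool_avail ? rq->nr_segments : pool_avail;
}

/*
 * Issue the request repeatedly, each pass transferring only what could
 * be mapped, and requeue the residual until nothing is left.
 * Returns 0 on success, -1 if no progress is possible (pool empty).
 */
static int model_run_request(struct model_request *rq,
                             unsigned int pool_avail)
{
    while (rq->nr_segments > 0) {
        unsigned int mapped = model_map_sg(rq, pool_avail);

        if (mapped == 0)
            return -1;              /* cannot make forward progress */

        assert(mapped <= rq->nr_segments);
        rq->nr_segments -= mapped;  /* complete the mapped portion */
        rq->passes++;               /* residual, if any, goes round again */
    }
    return 0;
}
```

In this model a 300-segment request against a pool that supplies 128
entries per pass finishes in three passes (128 + 128 + 44), while an
empty pool is reported as a failure instead of looping forever. The
point of the sketch is that forward progress needs only one successful
partial mapping per pass, not one allocation large enough for the whole
request.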