Seokmann Ju wrote: > On Oct 27, 2008, at 1:20 AM, Boaz Harrosh wrote: > >> FUJITA Tomonori wrote: >>> On Sun, 26 Oct 2008 11:38:04 +0200 >>> Boaz Harrosh <bharrosh@xxxxxxxxxxx> wrote: >>> >>>> FUJITA Tomonori wrote: >>>>> CC'ed Jens, >>>>> >>>>> On Wed, 22 Oct 2008 19:27:35 -0700 >>>>> Seokmann Ju <seokmann.ju@xxxxxxxxxx> wrote: >>>>> >>>>>> And it seems like that the panic is happening due to the fact that >>>>>> blk_delete_timer() is not called upon having completion of the >>>>>> service. >>>>>> In other words, the block layer calls blk_add_timer() prior to >>>>>> dispatch the service but, it doesn't call blk_delete_timer() >>>>>> when it >>>>>> returned. >>>>> Yeah, we need to call blk_delete_timer somewhere. >>>>> >>>>> >>>>>> Just for heck of it, I've tried out by adding blk_delete_timer() >>>>>> in >>>>>> the ~/block/blk-exec.c:blk_end_sync_rq() and it seems fixes the >>>>>> problem. >>>>> I think blk_end_sync_rq() is not the good place. From the >>>>> perspective >>>>> of bsg, we need to handle both blk_execute_rq_nowait and >>>>> blk_execute_rq. >>>>> >>>>> >>>>>> Seems like that there are APIs in the block layer that are call >>>>>> the >>>>>> blk_delete_timer(), including, >>>>>> - blk_end_io() >>>>>> - __blk_end_request() >>>>>> >>>>>> Could you guide me what is right way to fix the problem? >>>>> Exporting blk_delete_timer is one option, but it doesn't look very >>>>> nice (since the block layer doesn't export any details about its >>>>> timer >>>>> infrastructure), I think. Modifying blk_end_io() to make it >>>>> usable for >>>>> requests via something like bsg might be better. >>>>> >>>>> Anyway, we need to ask Jens. >>>>> >>>>> Jens, fc people have working on fc pass through support via bsg, >>>>> which >>>>> hooks bsg's request queue on fc transport objects (We did the >>>>> similar >>>>> thing for sas transport). >>>>> >>>>> We want the timeout feature for fc pass through and I think that >>>>> it's >>>>> nice to use the block layer timeout feature for it. But the users >>>>> of >>>>> bsg request queue don't need (or call) APIs such as >>>>> end_that_request_last to call blk_delete_timer internally. How >>>>> should >>>>> these users call blk_delete_timer? >>>> TOMO Hi >>>> If a command is queued by bsg to a scsi device, which is posible. >>>> Then >>>> blk_end_request() is called by scsi-ml. So it does work. >>> It doesn't work for bsg's scsi transport pass through stuff such as >>> SMP (sas management protocol, we already support) and FC. Virtually, >>> they don't use scsi-ml. >>> >> Right, I know that, that's why I say. >> >>>> I think that all block-queue consumers should call one of >>>> blk_end_request(), >>> This is kinda what I suggested in the previous mail but as I wrote, >>> some of them don't now. >>> >> I think they should, specially if they're going to use the timer. >> The way I see it they must. It's kind of a block layer API thing. >> Someone calls blk_execute_xx then eventually someone needs to call >> blk_end_request. You could call it from bsg but only temporary until >> all are fixed. (because you will need an ugly check to see if request >> was not already ended) > I made following changes but, it seems not helpful for the issue. > It, eventually, got failed to call blk_delete_timer() as ~/block/blk- > core.c:__end_that_request_first() returns non-zero. > Inside of the __end_that_reqeust_first(), it detected 'nbytes' is > bigger than 'nr_bytes' in case of bidi (where req->next_rq is not NULL). > I'm not sure whether we need to have chains of function calls > initiated by the blk_end_request() or blk_end_bidi_request(). > Would it create any problems if we directly call 'blk_delete_timer()'? > Dear Seokmann. You miss understud me. What I'm saying is that you must call blk_end_bidi_request at the FC end, just after you have finished to consume the request, and before you return it upstream. it can be some thing like: + blk_end_bidi_request(rq, 0, blk_rq_bytes(rq), + rq->next_rq ? blk_rq_bytes(rq->next_rq) : 0); In this case __end_that_reqeust_first should never return non-zero. > Seokmann > --- > diff -Naurp a/blk-exec.c b/blk-exec.c > --- a/blk-exec.c 2008-10-27 09:33:14.000000000 -0700 > +++ b/blk-exec.c 2008-10-27 09:34:08.000000000 -0700 > @@ -22,6 +22,13 @@ static void blk_end_sync_rq(struct reque > { > struct completion *waiting = rq->end_io_data; > > + if (rq->next_rq) { > + blk_end_bidi_request(rq, error, rq->data_len, > + rq->next_rq->data_len); > + // blk_end_request(rq); > + // blk_delete_timer(rq); > + } > + > rq->end_io_data = NULL; > __blk_put_request(rq->q, rq); > --- Please don't change blk_end_sync_rq at blk-exec.c it is too delicate here. At this stage rq->data_len is already holding the residual count in most cases, and touching it would be a bug >> >>>> there are lots to choose from. We don't need >>>> a new API. It will work with or without data, and it does what >>>> you want. >> Boaz > Do you have this code on a public git-web somewhere? I need to look at the complete code, if you need that I advise you where in the FC code to call blk_end_bidi_request() Thanks Boaz -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html