On Wed, 2013-11-06 at 07:54 +0100, Hannes Reinecke wrote: > On 11/05/2013 05:02 PM, Frank Mayhar wrote: > > This is the patch submitted by Jun'ichi Nomura, originally based on > > Mike's patch with some small changes by me. Jun'ichi's description > > follows, along with my changes: > > > > On Tue, 2013-11-05 at 07:18 -0800, Frank Mayhar wrote: > >> On Fri, 2013-11-01 at 04:17 +0000, Junichi Nomura wrote: > >>> I slightly modified the patch: > >>> - fixed the timeout handler to correctly find > >>> clone request and "struct multipath" > >>> - the timeout handler just disables "queue_if_no_path" > >>> instead of killing the request directly > >>> - "dmsetup status" to show the parameter > >>> - changed an interface between dm core and target > >>> - added some debugging printk (you can remove them) > >>> and checked the timeout occurs, at least. > >>> > >>> I'm not sure if this feature is good or not though. > >>> (The timer behavior is not intuitive, I think) > >> Thanks! I integrated your new patch and tested it. Sure enough, it > >> seems to work. I've made a few tweaks (added a module tunable and > >> support for setting the timer in multipath_message(), removed your debug > >> printks) and will submit the modified patch for discussion shortly. > > > > Comments? > > > Yeah. Seems to be my eternal fate; initiating fixes and not getting > mentioned at all. > Sigh. > > I dimly remember having sent the original patch for the blk timeout > function ... hence a short notice would've been nice. Sorry, I did ding you early on (and I think Mike dinged you as well), but you were apparently busy with other things at the time. I also sent email last Thursday, quoted below: On Wed, 2013-10-30 at 11:43 -0400, Mike Snitzer wrote: > On Wed, Oct 30 2013 at 11:08am -0400, > Frank Mayhar <fmayhar@xxxxxxxxxx> wrote: > > > On Tue, 2013-10-29 at 21:02 -0400, Mike Snitzer wrote: > > > Any interest in this or should I just table it for >= v3.14? > > > > Sorry, I've been busy putting out another fire. Yes, there's definitely > > still interest. I grabbed your revised patch and tested with it. > > Unfortunately the timeout doesn't actually fire when requests are queued > > due to queue_if_no_path; IIRC the block request queue timeout logic > > wasn't triggering. I planned to look into it more deeply figure out why > > but I had to spend all last week fixing a nasty race and hadn't gotten > > back to it yet. > > OK, Hannes, any idea why this might be happening? The patch in question > is here: https://patchwork.kernel.org/patch/3070391/ I got to this today and so far the most interesting I see is that the cloned request that's queued in multipath has no queue associated with it when it's queued; a printk reveals: [ 517.610042] map_io: queueing rq ffff8801150e0070 q (null) When it's eventually dequeued, it gets a queue from the destination device (in the pgpath) via bdev_get_queue(). Because of this and from just looking at the code, blk_start_request() (and therefore blk_add_timer()) isn't being called for those requests, so there's never a chance that the timeout would happen. Does this make sense? Or am I totally off-base? -- Frank Mayhar 310-460-4042 -- dm-devel mailing list dm-devel@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/dm-devel