Re: [PATCH 3/3] ide: use per-device request queue locks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[ Once again, sorry for the long delay. ]
On Monday 24 November 2008, Elias Oltmanns wrote:> Bartlomiej Zolnierkiewicz <bzolnier@xxxxxxxxx> wrote:> > * Move hack for flush requests from choose_drive() to do_ide_request().> >> > * Add ide_plug_device() helper and convert core IDE code from using> >   per-hwgroup lock as a request lock to use the ->queue_lock instead.> >> > * Remove no longer needed:> >   - choose_drive() function> >   - WAKEUP() macro> >   - 'sleeping' flag from ide_hwif_t> >   - 'service_{start,time}' fields from ide_drive_t> >> > This patch results in much simpler and more maintainable code> > (besides being a scalability improvement).> >> > Cc: Elias Oltmanns <eo@xxxxxxxxxxxxxx>> > Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@xxxxxxxxx>> > ---> > newer version> > Eventually, I got round to having a look at this one (I reviewed the two> small ones at the time). Apologies for the delay.> > >> >  drivers/ide/ide-io.c    | 213 +++++++++++++++---------------------------------> >  drivers/ide/ide-park.c  |   13 +-> >  drivers/ide/ide-probe.c |    3 > >  include/linux/ide.h     |    4 > >  4 files changed, 79 insertions(+), 154 deletions(-)> >> > Index: b/drivers/ide/ide-io.c> > ===================================================================> > --- a/drivers/ide/ide-io.c> > +++ b/drivers/ide/ide-io.c> [...]> > @@ -780,61 +704,40 @@ repeat:	> >   */> >  void do_ide_request(struct request_queue *q)> >  {> > -	ide_drive_t	*orig_drive = q->queuedata;> > -	ide_hwgroup_t	*hwgroup = orig_drive->hwif->hwgroup;> > -	ide_drive_t	*drive;> > -	ide_hwif_t	*hwif;> > +	ide_drive_t	*drive = q->queuedata;> > +	ide_hwif_t	*hwif = drive->hwif;> > +	ide_hwgroup_t	*hwgroup = hwif->hwgroup;> >  	struct request	*rq;> >  	ide_startstop_t	startstop;> >  > > -	/* caller must own hwgroup->lock */> > -	BUG_ON(!irqs_disabled());> > -> > -	while (!ide_lock_hwgroup(hwgroup)) {> > -		drive = choose_drive(hwgroup);> > -		if (drive == NULL) {> > -			int sleeping = 0;> > -			unsigned long sleep = 0; /* shut up, gcc */> > -			hwgroup->rq = NULL;> > -			drive = hwgroup->drive;> > -			do {> > -				if ((drive->dev_flags & IDE_DFLAG_SLEEPING) &&> > -				    (sleeping == 0 ||> > -				     time_before(drive->sleep, sleep))) {> > -					sleeping = 1;> > -					sleep = drive->sleep;> > -				}> > -			} while ((drive = drive->next) != hwgroup->drive);> > -			if (sleeping) {> > +	/*> > +	 * drive is doing pre-flush, ordered write, post-flush sequence. even> > +	 * though that is 3 requests, it must be seen as a single transaction.> > +	 * we must not preempt this drive until that is complete> > +	 */> > +	if (blk_queue_flushing(q))> >  		/*> > -		 * Take a short snooze, and then wake up this hwgroup again.> > -		 * This gives other hwgroups on the same a chance to> > -		 * play fairly with us, just in case there are big differences> > -		 * in relative throughputs.. don't want to hog the cpu too much.> > +		 * small race where queue could get replugged during> > +		 * the 3-request flush cycle, just yank the plug since> > +		 * we want it to finish asap> >  		 */> > -				if (time_before(sleep, jiffies + WAIT_MIN_SLEEP))> > -					sleep = jiffies + WAIT_MIN_SLEEP;> > -#if 1> > -				if (timer_pending(&hwgroup->timer))> > -					printk(KERN_CRIT "ide_set_handler: timer already active\n");> > -#endif> > -				/* so that ide_timer_expiry knows what to do */> > -				hwgroup->sleeping = 1;> > -				hwgroup->req_gen_timer = hwgroup->req_gen;> > -				mod_timer(&hwgroup->timer, sleep);> > -				/* we purposely leave hwgroup locked> > -				 * while sleeping */> > -			} else> > -				ide_unlock_hwgroup(hwgroup);> > +		blk_remove_plug(q);> > I'm not at all convinced that this works as expected. First of all, I> think we can safely assume that the plug is removed when block layer> calls into the ->request_fn(). Secondly, since the ide layer doesn't> call the ->request_fn() on it's own accord, I rather suspect that this> check can be dropped altogether. On the other hand, I'm not sure I agree
I suspect that this is leftover from the old code and I also think thatit can be removed completely.  However mixing too many real code changesin a single patch is not a good practice and the removal can be handledindependently of the discussed patch.
If you would send me a patch with the above change I will be happy tointegrate it into pata tree (patch can be against current Linus' tree orlinux-next, either one is completely fine with me).
> with the rest of the patch anyway, see below.> > >  > > -			/* no more work for this hwgroup (for now) */> > -			goto plug_device;> > -		}> > +	spin_unlock_irq(q->queue_lock);> > +	spin_lock_irq(&hwgroup->lock);> >  > > -		if (drive != orig_drive)> > -			goto plug_device;> > +	/* caller must own hwgroup->lock */> > +	BUG_ON(!irqs_disabled());> > Does this check really make any sense? The lock is being taken just two> lines before, after all.
Indeed, removed.
> > -		hwif = drive->hwif;> > +	if (!ide_lock_hwgroup(hwgroup)) {> > +		hwgroup->rq = NULL;> > +> > +		if (drive->dev_flags & IDE_DFLAG_SLEEPING) {> > +			if (time_before(drive->sleep, jiffies)) {> > +				ide_unlock_hwgroup(hwgroup);> > +				goto plug_device;> > +			}> > +		}> >  > >  		if (hwif != hwgroup->hwif) {> >  			/*> > @@ -847,16 +750,20 @@ void do_ide_request(struct request_queue> >  		hwgroup->hwif = hwif;> >  		hwgroup->drive = drive;> >  		drive->dev_flags &= ~(IDE_DFLAG_SLEEPING | IDE_DFLAG_PARKED);> > -		drive->service_start = jiffies;> >  > > +		spin_unlock_irq(&hwgroup->lock);> > +		spin_lock_irq(q->queue_lock);> >  		/*> >  		 * we know that the queue isn't empty, but this can happen> >  		 * if the q->prep_rq_fn() decides to kill a request> >  		 */> >  		rq = elv_next_request(drive->queue);> > +		spin_unlock_irq(q->queue_lock);> > +		spin_lock_irq(&hwgroup->lock);> > +> >  		if (!rq) {> >  			ide_unlock_hwgroup(hwgroup);> > -			break;> > +			goto out;> >  		}> >  > >  		/*> > @@ -888,15 +795,22 @@ void do_ide_request(struct request_queue> >  > >  		if (startstop == ide_stopped) {> >  			ide_unlock_hwgroup(hwgroup);> > -			if (!elv_queue_empty(orig_drive->queue))> > -				blk_plug_device(orig_drive->queue);> > +			/* give other devices a chance */> > +			goto plug_device;> > This, I think, is very wrong. The rule of thumb for a ->requst_fn()> should be to take as many requests off the queue as possible. Besides,> this may easily preempt an ordered sequence as mentioned earlier. In my> opinion, we really need a loop of sorts (while or goto).
Thanks for noticing this.  Fixed.
[ The original idea behind "goto plug_device" is in the comment but as  I read the code again it doesn't make any sense now... ]
> >  		}> > -	}> > +	} else> > +		goto plug_device;> > +out:> > +	spin_unlock_irq(&hwgroup->lock);> > +	spin_lock_irq(q->queue_lock);> >  	return;> >  > >  plug_device:> > -	if (!elv_queue_empty(orig_drive->queue))> > -		blk_plug_device(orig_drive->queue);> > +	spin_unlock_irq(&hwgroup->lock);> > +	spin_lock_irq(q->queue_lock);> > +> > +	if (!elv_queue_empty(q))> > +		blk_plug_device(q);> >  }> >  > >  /*> [...]> > @@ -974,10 +899,12 @@ out:> >  void ide_timer_expiry (unsigned long data)> >  {> >  	ide_hwgroup_t	*hwgroup = (ide_hwgroup_t *) data;> > +	ide_drive_t	*uninitialized_var(drive);> >  	ide_handler_t	*handler;> >  	ide_expiry_t	*expiry;> >  	unsigned long	flags;> >  	unsigned long	wait = -1;> > +	int		plug_device = 0;> >  > >  	spin_lock_irqsave(&hwgroup->lock, flags);> >  > > @@ -989,12 +916,8 @@ void ide_timer_expiry (unsigned long dat> >  		 * or we were "sleeping" to give other devices a chance.> >  		 * Either way, we don't really want to complain about anything.> >  		 */> > Not that important, I suppose, but you might want to update that comment> while you're at it.
I think that despite code changes it stays valid so there is no urgent needto touch it..
> > -		if (hwgroup->sleeping) {> > -			hwgroup->sleeping = 0;> > -			ide_unlock_hwgroup(hwgroup);> > -		}> >  	} else {> > -		ide_drive_t *drive = hwgroup->drive;> > +		drive = hwgroup->drive;> >  		if (!drive) {> >  			printk(KERN_ERR "ide_timer_expiry: hwgroup->drive was NULL\n");> >  			hwgroup->handler = NULL;> > Finally, some more general blathering on the topic at hand: A while ago,> I spent some thought on the possibilities of giving the block layer a> notion of linked device queues as an equivalent hwgroups in ide,> scsi_hosts or ata_ports and let it take care of time / bandwidth> distribution among the queues belonging to one group. This is, as I> understand, pretty much what your code is relying on since you have> chucked out choose_drive(). However, this turned out not to be too easy
This is the right way to go and I has always advocated for it.  Howeverafter seeing how libata got away with ignoring the issue altogetherI'm no longer sure of this.  I haven't seen any bug reports which wouldindicate that simplified approach has any really negative consequences.
[ Still would be great to have the control over bandwitch of "queue-group"  at the block layer level since we could also use it for distributing the  available PCI[-E] bus bandwitch. ]
> and I'm not quite sure whether we really want to go down that road. One> major problem is that there is no straight forward way for the block> layer to know, whether a ->request_fn() has actually taken a request off> the queue and if not (or less than queue_depth anyway), whether it's> just the device that couldn't take any more or the controller instead.> On the whole, it seems not exactly trivial to get it right and it would> probably be a good idea to consult Jens and perhaps James before
I think that having more information returned by ->request_fn() could bebeneficial to the block layer (independently whether we end up addingsupport for "queue-groups" to the block layer or not) but this definitelyneeds to be verified with Jens & James.
> embarking on such a venture. Short of that, I think that ide layer has> to keep an appropriate equivalent of choose_drive() and also the while> loop in the do_ide_request() function.
Thank you for your review.  v1->v2 interdiff below.
v2:* Fixes/improvements based on review from Elias: - take as many requests off the queue as possible - remove now redundant BUG_ON()
diff -u b/drivers/ide/ide-io.c b/drivers/ide/ide-io.c--- b/drivers/ide/ide-io.c+++ b/drivers/ide/ide-io.c@@ -726,10 +726,8 @@ 	spin_unlock_irq(q->queue_lock); 	spin_lock_irq(&hwgroup->lock); -	/* caller must own hwgroup->lock */-	BUG_ON(!irqs_disabled());- 	if (!ide_lock_hwgroup(hwgroup)) {+repeat: 		hwgroup->rq = NULL;  		if (drive->dev_flags & IDE_DFLAG_SLEEPING) {@@ -793,11 +791,8 @@ 		startstop = start_request(drive, rq); 		spin_lock_irq(&hwgroup->lock); -		if (startstop == ide_stopped) {-			ide_unlock_hwgroup(hwgroup);-			/* give other devices a chance */-			goto plug_device;-		}+		if (startstop == ide_stopped)+			goto repeat; 	} else 		goto plug_device; out:ÿôèº{.nÇ+?·?®?­?+%?Ëÿ±éݶ¥?wÿº{.nÇ+?·¥?{±þ'^þ)í?æèw*jg¬±¨¶????Ý¢jÿ¾«þG«?éÿ¢¸¢·¦j:+v?¨?wèjØm¶?ÿþø¯ù®w¥þ?àþf£¢·h??â?úÿ?Ù¥


[Index of Archives]     [Linux Filesystems]     [Linux SCSI]     [Linux RAID]     [Git]     [Kernel Newbies]     [Linux Newbie]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Samba]     [Device Mapper]

  Powered by Linux