Re: [patch 3/6] 2.6.18: sb1250-mac: Phylib IRQ handling fixes

Andy Fleming <afleming@xxxxxxxxxxxxx> · Fri, 20 Oct 2006 16:40:20 -0500

On Oct 3, 2006, at 10:18, Maciej W. Rozycki wrote:

Hello,

 This patch fixes a couple of problems discovered with interrupt  
handling
in the phylib core, namely:

1. The driver uses timer and workqueue calls, but does not include
   <linux/timer.h> nor <linux/workqueue.h>.


Good catch.



2. The driver uses schedule_work() for handling interrupts, but  
does not
   make sure any pending work scheduled thus has been completed before
   driver's structures get freed from memory.  This is especially
   important as interrupts may keep arriving if the line is shared  
with
   another PHY.

   The solution is to ignore phy_interrupt() calls if the reported  
device
   has already been halted and calling flush_scheduled_work() from
   phy_stop_interrupts() (but guarded with current_is_keventd() in  
case
   the function has been called through keventd from the MAC device's
   close call to avoid a deadlock on the netlink lock).


I've been trying to figure out this problem since you posted this,  
and I'm not sure I understand it fully (And I apologize profusely for  
the horror that is the PHY interrupt handling code.  I'd love to  
rewrite it if there's some cleaner way.).  But let me see if I can  
follow the chain of reasoning that led to this patch, and see if we  
can figure out a solution that doesn't involve creating a work queue  
just for bringing down the PHY.

1) Invoking phy_stop is meant to stop the system from looking for PHY  
status updates.  Currently, another PHY sharing the interrupt can  
cause the HALTED PHY to reenable interrupts.  Checking for HALTED in  
the interrupt handler fixes this, but it's incorrect.  The  
phy_interrupt handler does not grab the lock, and so you could get this:

phy_stop
	lock
	clear any pending interrupts
	disable interrupts on this PHY
---> interrupt from another PHY causes this PHY's interrupt handler  
to be invoked
	HALTED isn't set, so phy_change is scheduled
<--- set HALTED, unlock

scheduled work is done:
	interrupt is reenabled

Sadly, I think the only way to fix this problem is to have phy_change  
check for HALTED, and react appropriately.

2) The PHY lib doesn't clear out remaining work in the work queue  
when it's bringing down a PHY.  This is clearly wrong, but I'm  
confused how *any* driver does this?  It seems to me that any network  
driver which has a work queue is going to be unable to flush the  
pending work when it is brought down.  So what's the solution to this?

I'm not too enthusiastic about requiring the ethernet drivers to call  
phy_disconnect in a separate thread after "close" is called.   
Assuming there's not some sort of "squash work queue" function that  
can be invoked with rtnl_lock held, I think phy_disconnect should  
schedule itself to flush the queue.  This would also require that  
mdiobus_unregister hold off on freeing phydevs if any of the phys  
were still waiting for pending flush_pending calls to finish.  Which  
would, in turn, require mdiobus_unregister to schedule cleaning up  
memory for some later time.

I'm not enthusiastic about that implementation, either, but it  
maintains the abstractions I consider important for this code.  The  
ethernet driver should not need to know what structures the PHY lib  
uses to implement its interrupt handling, and how to work around  
their failings, IMHO.

And now, to be more specific, a few comments on the patch, directly:


 Please consider.

  Maciej

Signed-off-by: Maciej W. Rozycki <macro@xxxxxxxxxxxxxx>

patch-mips-2.6.18-20060920-phy-irq-16
diff -up --recursive --new-file linux-mips-2.6.18-20060920.macro/ 
drivers/net/phy/phy.c linux-mips-2.6.18-20060920/drivers/net/phy/phy.c

--- linux-mips-2.6.18-20060920.macro/drivers/net/phy/phy.c	 
2006-08-05 04:58:46.000000000 +0000
+++ linux-mips-2.6.18-20060920/drivers/net/phy/phy.c	2006-10-03  
14:19:21.000000000 +0000
@@ -7,6 +7,7 @@
  * Author: Andy Fleming
  *
  * Copyright (c) 2004 Freescale Semiconductor, Inc.
+ * Copyright (c) 2006  Maciej W. Rozycki
  *
  * This program is free software; you can redistribute  it and/or  
modify it
  * under  the terms of  the GNU General  Public License as  
published by the
@@ -32,6 +33,8 @@
 #include <linux/mii.h>
 #include <linux/ethtool.h>
 #include <linux/phy.h>
+#include <linux/timer.h>
+#include <linux/workqueue.h>

 #include <asm/io.h>
 #include <asm/irq.h>
@@ -484,6 +487,9 @@ static irqreturn_t phy_interrupt(int irq
 {
 	struct phy_device *phydev = phy_dat;

+	if (PHY_HALTED == phydev->state)
+		return IRQ_NONE;		/* It can't be ours.  */
+


As I mentioned, this doesn't protect it, since it doesn't grab the  
lock.  And it can't grab the lock, or we'd have to disable interrupts  
while doing phy transactions.  And we can't do that, because one  
design goal is to allow some bus drivers to use interrupts to signal  
that the transaction has completed.  Admittedly, this is still quite  
broken right now.  I'm looking into using semaphores, on the theory  
that I can sleep when I grab them.  But that would still prevent  
taking the semaphore in the interrupt controller.  This needs to be  
moved to phy_change (which you have done, anyway), and we just have  
to let the actual handler figure out whether it's safe to do anything.


 	/* The MDIO bus is not allowed to be written in interrupt
 	 * context, so we need to disable the irq here.  A work
 	 * queue will write the PHY to disable and clear the
@@ -577,6 +583,13 @@ int phy_stop_interrupts(struct phy_devic
 	if (err)
 		phy_error(phydev);

+	/*
+	 * Finish any pending work; we might have been scheduled
+	 * to be called from keventd ourselves, though.
+	 */
+	if (!current_is_keventd())
+		flush_scheduled_work();
+


And this is what is making you move your call to phy_disconnect to a  
work queue function, right?  Does this make it so phy_stop_interrupts  
(and anything that calls it) can't be called with rtnl_lock held?   
I'd like to avoid that requirement, if at all possible.


 	free_irq(phydev->irq, phydev);

 	return err;
@@ -603,7 +616,8 @@ static void phy_change(void *data)
 	enable_irq(phydev->irq);

 	/* Reenable interrupts */
-	err = phy_config_interrupt(phydev, PHY_INTERRUPT_ENABLED);
+	if (PHY_HALTED != phydev->state)
+		err = phy_config_interrupt(phydev, PHY_INTERRUPT_ENABLED);


Eek!  You're checking the phydev->state outside of a lock.  That's  
very unsafe.



 	if (err)
 		goto irq_enable_err;
@@ -624,18 +638,24 @@ void phy_stop(struct phy_device *phydev)
 	if (PHY_HALTED == phydev->state)
 		goto out_unlock;

-	if (phydev->irq != PHY_POLL) {
-		/* Clear any pending interrupts */
-		phy_clear_interrupt(phydev);
+	phydev->state = PHY_HALTED;

+	if (phydev->irq != PHY_POLL) {
 		/* Disable PHY Interrupts */
 		phy_config_interrupt(phydev, PHY_INTERRUPT_DISABLED);
-	}

-	phydev->state = PHY_HALTED;
+		/* Clear any pending interrupts */
+		phy_clear_interrupt(phydev);
+	}


This all seems good.  I was just thinking a few minutes ago that  
disabling the interrupts should be done before clearing the  
interrupts.  And setting HALTED first should mean that we can move  
the unlock call up.



 out_unlock:
 	spin_unlock(&phydev->lock);
+
+	/*
+	 * Cannot call flush_scheduled_work() here as desired because
+	 * of rtnl_lock(), but PHY_HALTED shall guarantee phy_change()
+	 * will not reenable interrupts.
+	 */


Yeah, I don't think we need the comment, necessarily; phy_change will  
be protected (as long as the HALTED check is moved inside a lock).

Andy