[+ cc: gregkh and bottom post] On Thu, 23 Jan 2014 00:53:35 +0300 Dan Carpenter <dan.carpenter@xxxxxxxxxx> wrote: > Ok. I've added the smatch mailing list. And I'm top posting because > I am a bad person. > > It would be better if I could look at an existing patch for one of these > things. > > regards, > dan carpenter > > On Wed, Jan 22, 2014 at 01:28:29PM -0500, Joe Lawrence wrote: > > Hi Dan, > > > > (Feel free to CC the smatch list if you think it's appropriate.) > > > > Part of the work we do at Stratus is hardening various device drivers > > for our HW platform against surprise PCI hotplug removal. Even though > > it has been documented in Linux Device Drivers for years, we still see > > GregKH occasionally explaining to folks that on surprise hotplug, PCI > > drivers "will suddenly start reading all 0xff and will then need to > > abort whatever it was doing." > > > > For example, the code provided in section 14.7.1. Dynamic Devices of > > LDD: > > > > result = readl(ptr); > > if (result = = ~(u32)0) /* card removed */ > > return -ENODEV; > > > > Currently we manually inspect code drops for unsafe reads of PCI > > space. We have essentially two cases of PCI reads: > > > > Case 1 - Read data doesn't matter - driver reads from the device, then > > writes something back. Although the device has gone missing, we > > usually consider this as benign as writing to missing HW has no effect. > > > > Case 2 - Read data affects control flow or kernel/driver data - driver > > reads data and then bases execution on the result, or read data is > > copied into a kernel/driver data structure. Typically we check these > > reads and if necessary, twiddle enough state to get the driver into its > > error handler. > > > > Currently we manually inspect new code drops for potentially > > hotplug-unsafe reads. I was wondering if you thought it would be worth > > adding a check in smatch, even as an optional flag. There is probably > > already an existing pattern that we could mimic to add this. As I > > don't hack on the smatch code itself, I was wondering if you could > > point that out. (I can go off and try implementing / testing.) > > > > A means of automating this code audit would be helpful to us here at > > Stratus and perhaps the kernel community in general. > > > > Thanks for any suggestions, > > > > -- Joe Hi Dan, First, I should have mentioned that Stratus inspects PCI reads for two reasons: 1 - Handle hotplug removed devices, ensure data/program flow integrity. (As mentioned in my first mail.) 2 - Find missing devices as soon as possible so that we can failover to a devices' pair to provide availability. (Think of ethernet interface bonding and MD RAID1 pairing of HBAs.) To ensure reason (2), Stratus adds "hw_gone" flags and other checks to expedite device removal, but I want to set that aside to keep it simple. With that mind, our changes probably go farther than what the linux kernel community at large would be interested in. However, it may still be beneficial for driver writers to have a static code analyzer looking over their shoulder for reason (1). In the ixgbe driver, we verify a reasonable return value from readl like so (in pseudo-diff format): #define IXGBE_READ_REG(a, reg) readl((a)->hw_addr + (reg)) #define IXGBE_STATUS 0x00008 /* General Register */ #define IXGBE_EEC_FLUDONE 0x04000000 /* Flash update done */ static s32 ixgbe_poll_flash_update_done_X540(struct ixgbe_hw *hw) { u32 i; u32 reg; s32 status = IXGBE_ERR_EEPROM; for (i = 0; i < IXGBE_FLUDONE_ATTEMPTS; i++) { reg = IXGBE_READ_REG(hw, IXGBE_EEC); + if (unlikely(reg == 0xffffffff) && + IXGBE_READ_REG(hw, IXGBE_STATUS) == 0xffffffff) + break; if (reg & IXGBE_EEC_FLUDONE) { status = 0; break; } udelay(5); } return status; } Without checking for a missing device read value of 0xffffffff, the logic could follow the (reg & IXGBE_EEC_FLUDONE) branch and ixgbe_poll_flash_update_done_X540's caller would assume success. For some reads, a value of ~0 may be valid, so we couple the check with a read of something like a general status register (where ~0 really means no device) to be sure. On the other hand, there are PCI reads that Stratus doesn't verify, for their side effect doesn't affect reason (1) or reason (2). Again in the ixgbe driver: static s32 ixgbe_blink_led_start_X540(struct ixgbe_hw *hw, u32 index) { u32 macc_reg; u32 ledctl_reg; ixgbe_link_speed speed; bool link_up; /* * Link should be up in order for the blink bit in the LED control * register to work. Force link and speed in the MAC if link is down. * This will be reversed when we stop the blinking. */ hw->mac.ops.check_link(hw, &speed, &link_up, false); if (!link_up) { macc_reg = IXGBE_READ_REG(hw, IXGBE_MACC); macc_reg |= IXGBE_MACC_FLU | IXGBE_MACC_FSV_10G | IXGBE_MACC_FS; IXGBE_WRITE_REG(hw, IXGBE_MACC, macc_reg); } /* Set the LED to LINK_UP + BLINK. */ ledctl_reg = IXGBE_READ_REG(hw, IXGBE_LEDCTL); ledctl_reg &= ~IXGBE_LED_MODE_MASK(index); ledctl_reg |= IXGBE_LED_BLINK(index); IXGBE_WRITE_REG(hw, IXGBE_LEDCTL, ledctl_reg); IXGBE_WRITE_FLUSH(hw); return 0; } The reads into macc_reg and ledctl_reg we don't verify since they are only used to calculate new values written back to the same device. If this device was hotplug removed, the writes would go nowhere (this is okay). Some other read or a device watchdog would catch the missing device and handle accordingly. In our modified drivers, Stratus annotates every "checked" read and every "don't care" read. The "checked" reads will try to as gracefully as possible invoke an error status/handler to escape its call chain without causing any collateral damage. The "don't care" reads are marked so that we know it's been audited. While giving smatch a spin on our drivers, it occurred to me that it might be employed to help us catch any new hotplug-unsafe PCI reads that we may have missed. Depending upon how aggressive the check is, it may be useful for general upstream use as well. Does this sound like something suitable for smatch? Maybe there is an existing pattern that we could base this on. Regards, -- Joe -- To unsubscribe from this list: send the line "unsubscribe smatch" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html