On Fri, May 12, 2023 at 10:15:18AM +0800, Rongguang Wei wrote:
> From: Rongguang Wei <weirongguang@xxxxxxxxxx>
>
> pciehp's behavior is incorrect if the Attention Button is pressed
> on an unoccupied slot.
>
> When a Presence Detect Changed event has occurred, the slot status
> in either BLINKINGOFF_STATE or OFF_STATE, turn it off unconditionally.
> But if the slot status is in BLINKINGON_STATE and the slot is currently
> empty, the slot status was staying in BLINKINGON_STATE.

Thanks for the patch!  I don't quite follow the events here.  I think
the current behavior is this (tell me if I'm going wrong):

  - Slot is empty (OFF_STATE).

  - User presses Attention Button.  pciehp_handle_button_press() sets
    state to BLINKINGON_STATE, sets power indicator to blinking,
    schedules pciehp_queue_pushbutton_work() to turn on power after 5
    seconds.

  - When pciehp_queue_pushbutton_work() runs 5 seconds later, it
    synthesizes a PCI_EXP_SLTSTA_PDC event and wakes the IRQ thread.

  - The IRQ thread (pciehp_ist()) calls
    pciehp_handle_presence_or_link_change(), which does nothing since
    the slot is in BLINKINGON_STATE, the slot is empty, and the link
    is not active.

  - Slot incorrectly remains in BLINKINGON_STATE and power indicator
    remains blinking.

And this patch changes pciehp_handle_presence_or_link_change() so that
if the slot is empty, the link is not active, and the slot is in
BLINKINGON_STATE, we put it in OFF_STATE, cancel the delayed work, and
turn off the power indicator.

After this patch, the user experience is this:

  - Slot is empty (OFF_STATE).

  - User presses Attention Button.

  - Power indicator blinks for 5 seconds.

  - Power indicator turns off.

which definitely seems better.

I'm curious why we want the 5 seconds of blinking power indicator at
all.  We can't really do anything in response to an Attention Button
on an empty slot, so could we just ignore it completely in
pciehp_handle_button_press()?

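To make the sequence above concrete, here is a minimal user-space
sketch of the transitions as I understand them.  It is not the driver
code: struct slot_model, the IND_* values, the model_*() functions, and
the "patched" flag are all made up for illustration, and only the
OFF_STATE and BLINKINGON_STATE paths discussed in this thread are
modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the pciehp slot states named in this thread. */
enum slot_state { OFF_STATE, BLINKINGON_STATE, ON_STATE };

/* Hypothetical power indicator values; not the PCI_EXP_SLTCTL_* macros. */
enum pwr_indicator { IND_OFF, IND_BLINK, IND_ON };

/* Hypothetical model of one slot; not the kernel's struct controller. */
struct slot_model {
	enum slot_state state;
	enum pwr_indicator power_ind;
	bool work_pending;	/* stands in for the delayed button_work */
	bool card_present;	/* stands in for pciehp_card_present() > 0 */
	bool link_active;	/* stands in for pciehp_check_link_active() > 0 */
};

/* Models an Attention Button press. */
void model_button_press(struct slot_model *s)
{
	assert(s != NULL);

	switch (s->state) {
	case OFF_STATE:
		/* Start the 5-second abort interval. */
		s->state = BLINKINGON_STATE;
		s->power_ind = IND_BLINK;
		s->work_pending = true;
		break;
	case BLINKINGON_STATE:
		/* A second press cancels the pending power-on. */
		s->state = OFF_STATE;
		s->power_ind = IND_OFF;
		s->work_pending = false;
		break;
	case ON_STATE:
		/* Power-off path is not modeled here. */
		break;
	}
}

/*
 * Models the presence/link change handler.  With patched == false an
 * empty slot in BLINKINGON_STATE is left untouched; with
 * patched == true it is returned to OFF_STATE.
 */
void model_presence_or_link_change(struct slot_model *s, bool patched)
{
	assert(s != NULL);

	if (!s->card_present && !s->link_active) {
		if (patched && s->state == BLINKINGON_STATE) {
			s->state = OFF_STATE;
			s->work_pending = false;
			s->power_ind = IND_OFF;
		}
		return;
	}

	/* Occupied slot: bring it up (greatly simplified). */
	if (s->state == OFF_STATE || s->state == BLINKINGON_STATE) {
		s->state = ON_STATE;
		s->work_pending = false;
		s->power_ind = IND_ON;
	}
}
```

Starting from an empty slot in OFF_STATE, one model_button_press()
followed by model_presence_or_link_change(s, false) leaves the model in
BLINKINGON_STATE with the indicator blinking; the same sequence with
"true" ends in OFF_STATE with the indicator off.
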
IIUC, this patch leads to messages like these, which are slightly
confusing because we say we're powering up the slot, then later decide
"oops, there's nothing here, never mind" (or, I guess the user could
push the button, *then* insert the card, and we would power it up,
which seems a little sketchy):

  [ 0.000] pcieport 0000:00:01.5: pciehp: Slot(0-5): Attention button pressed
  [ 0.001] pcieport 0000:00:01.5: pciehp: Slot(0-5): Powering on due to button press
  [ 5.001] pcieport 0000:00:01.5: pciehp: Slot(0-5): Card not present

Is there a spec that covers the user experience of this case?  The
closest I could find are SHPC r1.0, sec 2.5, and PCIe r6.0, sec
6.7.1.5.  Both mention the 5-second abort interval with the power
indicator blinking, but they implicitly assume the slot is occupied.
Neither mentions the empty slot case.

> The message print like this:
>    pcieport 0000:00:01.5: pciehp: Slot(0-5): Attention button pressed
>    pcieport 0000:00:01.5: pciehp: Slot(0-5) Powering on due to button press
>    pcieport 0000:00:01.5: pciehp: Slot(0-5): Attention button pressed
>    pcieport 0000:00:01.5: pciehp: Slot(0-5): Button cancel
>    pcieport 0000:00:01.5: pciehp: Slot(0-5): Action canceled due to button press
>
> It cause the next Attention Button Pressed event become Button cancel
> and missing the Presence Detect Changed event with this button press
> though this button presses event is occurred after 5s.

It seems like the problem ("empty slot staying in BLINKINGON_STATE
forever after one Attention Button event") only requires one button
press.  If so, why do we talk about the *next* button press here?

> According to the Commit d331710ea78f ("PCI: pciehp: Become resilient
> to missed events"), if the slot is currently occupied, turn it on and
> if the slot is empty, it need to set in OFF_STATE rather than stay in
> current status when pciehp_handle_presence_or_link_change() bails out.
>
> Fixes: d331710ea78f ("PCI: pciehp: Become resilient to missed events")
> Link: https://lore.kernel.org/linux-pci/20230403054619.19163-1-clementwei90@xxxxxxx/
> Link: https://lore.kernel.org/linux-pci/20230421025641.655991-1-clementwei90@xxxxxxx/
> Suggested-by: Lukas Wunner <lukas@xxxxxxxxx>
> Signed-off-by: Rongguang Wei <weirongguang@xxxxxxxxxx>
> ---
>  drivers/pci/hotplug/pciehp_ctrl.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
> index 529c34808440..32baba1b7f13 100644
> --- a/drivers/pci/hotplug/pciehp_ctrl.c
> +++ b/drivers/pci/hotplug/pciehp_ctrl.c
> @@ -256,6 +256,14 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
>  	present = pciehp_card_present(ctrl);
>  	link_active = pciehp_check_link_active(ctrl);
>  	if (present <= 0 && link_active <= 0) {
> +		if (ctrl->state == BLINKINGON_STATE) {
> +			ctrl->state = OFF_STATE;
> +			cancel_delayed_work(&ctrl->button_work);
> +			pciehp_set_indicators(ctrl, PCI_EXP_SLTCTL_PWR_IND_OFF,
> +					      INDICATOR_NOOP);
> +			ctrl_info(ctrl, "Slot(%s): Card not present\n",
> +				  slot_name(ctrl));
> +		}
>  		mutex_unlock(&ctrl->state_lock);
>  		return;
>  	}