> > > > > > > > On Wed, Aug 17, 2016 at 10:37:13PM +0000, Patel, Mayurkumar wrote: > > > > > > On Wed, Aug 17, 2016 at 10:54:12AM -0700, Rajat Jain wrote: > > > > > > > > > > > We need to do something about that *in addition * to the > > > > > > > above patch to cover the > > > > > > > whole story. However I think there still will be a room for some > > > > > > > interrupt misses because we are > > > > > > > collecting the interrupts in intr_loc, and theoretically we could be > > > > > > > in a situation where in the pcie_isr, the > > > > > > > > > > > > > > do { > > > > > > > ... > > > > > > > } while(detected) > > > > > > > > > > > > > > loop gets a removal->insertion->removal all while in the same > > > > > > > invocation of pcie_isr(). > > > > > > > If this happens, the intr_loc will have recorded a single insertion > > > > > > > and a single removal, and > > > > > > > the final result will depend on the order in which we decide to > > > > > > > process the events in intr_loc. > > > > > > > > > > > > I don't quite understand how that "do { .. } while (detected)" loop > > > > > > works or why it's done that way. Collecting interrupt status bits in > > > > > > an ISR is obviously a very common task; it seems like there should be > > > > > > a standard, idiomatic way of doing it, but I don't know it. > > > > > > > > > > > > > Or, may be we can make the calls to pciehp_queue_interrupt_event() > > > > > > > before clearing the > > > > > > > RW1C in the slot status register (in the loop)? > > > > > > > > > > > > Yeah, it seems like we should read PCI_EXP_SLTSTA once, queue up any > > > > > > events related to it, then clear the relevant SLTSTA bits. > > > > > > > > > > > > > > > > Do you mean to remove the do {...} while loop and just > > > > > read PCI_EXP_SLTSTA once in ISR , queue the work and clear interrupts? > > > > > > > > I don't know if removing the loop is the right thing or not. We need > > > > to understand why the loop is there in the first place and make sure > > > > removing it wouldn't break things. > > > > > > > > > > > > > The pcie base spec. 3.0 says from chapter 6.7.3.1. Slot Events: > > > > > > A Downstream Port with hot-plug capabilities monitors the slot it controls for the slot events listed > > > above. When one of these slot events is detected, the Port indicates that the event has occurred by > > > setting the status field associated with the event. At that point, the event is pending until software > > > clears the status field. > > > Once a slot event is pending on a particular slot, all subsequent events of that type are ignored on > > > that slot until the event is cleared. The Port must continue to monitor the slot for all other slot > > > event types and report them as they occur. > > > > > > > > > So does it mean that the port would continue providing MSI if there has been > > > any other events occurred apart from the event which is not cleared? If that > > > is the case then it's not sure why the loop is still needed. > > > > I'd think that the loop is needed because we don't want to handle just > > one event on one interrupt. We want to handle as many events as we can > > (to keep the interrupt latency overhead low), that have happened > > during the ISR invocation. And hence the loop. > > > > Hmm, Right. I created a proposal patch > (=>[v1,2/2] PCI: pciehp: Rework hotplug interrupt routine) > to remove the do {...} while loop which > I guess then should be rejected, right? As we decide to keep this loop? > > > > > > > Also with having a loop, pcie_isr() is reading the PCI_EXP_SLTSTA again and taking action just based on > > > latest event happens. That may mean removal->insertion happens fast enough then > > > removal could be overwritten by insertion and pciehp_get_adapter_status () > > > will return insertion only and we may miss the removal event to be processed for the loop. > > > > Current. This is the problem that Bjorn highlighted above. I think the > > current code works well when events of different kinds have happened > > (due to the loop). > > > > In case of similar kind of events (insertion / removal), I think there > > are 2 problems: (1) we use the transient values of slot status which > > may not be valid at that time. (2) because we ack the event before we > > queue it up, it leaves chance for error. I think we'd need more > > careful examination in order to fix these issues. > > > > I also see 3rd problem too with this Loop. > For example, if we are in the loop, > If only slot removal comes -> detect = PCI_EXP_SLTSTA_PDC -> intr_lock |= detected > -> detected==true so clear PDC interrupt bit => So we have a chance to detect > another insertion event. > > We are in the same loop & if insertion comes again -> so detected = PCI_EXP_SLTSTA_PDC > -> then detected &= ~intr_loc -> detected == false => This means > PCI_EXP_SLTSTA would not be cleared for insertion event happened > and there we could get a problem that No further PDC events can get triggered by HW. > > So if we want to keep the loop, I think clearing PCI_EXP_SLTSTA should be shifted > outside of the loop to clear Slot events(but not based on detected) to ensure that loop > just looks for the events Which has not occurred, also before clearing SLTSTA outside > the loop, we can update link and slot_status to trigger the right events for DLLSC and PDC types. > > What do you think about following proposal? > > @@ -542,7 +542,7 @@ static irqreturn_t pcie_isr(int irq, void *dev_id) > struct pci_bus *subordinate = pdev->subordinate; > struct pci_dev *dev; > struct slot *slot = ctrl->slot; > - u16 detected, intr_loc; > + u16 detected, intr_loc, slot_status; > u8 present; > bool link; > > @@ -553,26 +553,30 @@ static irqreturn_t pcie_isr(int irq, void *dev_id) > */ > intr_loc = 0; > do { > - pcie_capability_read_word(pdev, PCI_EXP_SLTSTA, &detected); > - if (detected == (u16) ~0) { > + pcie_capability_read_word(pdev, PCI_EXP_SLTSTA, &slot_status); > + if (slot_status == (u16) ~0) { > ctrl_info(ctrl, "%s: no response from device\n", > __func__); > return IRQ_HANDLED; > } > > - detected &= (PCI_EXP_SLTSTA_ABP | PCI_EXP_SLTSTA_PFD | > + detected = slot_status & (PCI_EXP_SLTSTA_ABP | > + PCI_EXP_SLTSTA_PFD | > PCI_EXP_SLTSTA_PDC | > PCI_EXP_SLTSTA_CC | PCI_EXP_SLTSTA_DLLSC); > detected &= ~intr_loc; > intr_loc |= detected; > if (!intr_loc) > return IRQ_NONE; > - if (detected) > - pcie_capability_write_word(pdev, PCI_EXP_SLTSTA, > - intr_loc); > } while (detected); > > ctrl_dbg(ctrl, "pending interrupts %#06x from Slot Status\n", intr_loc); > + if (intr_loc & PCI_EXP_SLTSTA_PDC) > + present = !!(slot_status & PCI_EXP_SLTSTA_PDS); > + if (intr_loc & PCI_EXP_SLTSTA_DLLSC) > + link = pciehp_check_link_active(ctrl); > + > + pcie_capability_write_word(pdev, PCI_EXP_SLTSTA, intr_loc); > > /* Check Command Complete Interrupt Pending */ > if (intr_loc & PCI_EXP_SLTSTA_CC) { > @@ -603,7 +607,6 @@ static irqreturn_t pcie_isr(int irq, void *dev_id) > > /* Check Presence Detect Changed */ > if (intr_loc & PCI_EXP_SLTSTA_PDC) { > - pciehp_get_adapter_status(slot, &present); > ctrl_info(ctrl, "Card %spresent on Slot(%s)\n", > present ? "" : "not ", slot_name(slot)); > pciehp_queue_interrupt_event(slot, present ? INT_PRESENCE_ON : > @@ -618,7 +621,6 @@ static irqreturn_t pcie_isr(int irq, void *dev_id) > } > > if (intr_loc & PCI_EXP_SLTSTA_DLLSC) { > - link = pciehp_check_link_active(ctrl); > ctrl_info(ctrl, "slot(%s): Link %s event\n", > slot_name(slot), link ? "Up" : "Down"); > pciehp_queue_interrupt_event(slot, link ? INT_LINK_UP : > > Hi Rajat/Bjorn Could you please comment, if the proposal can be worked out or we should consider removing do {..} while at all as Keith, Busch commented as that's not really performance path or is there any other way/suggestion you think of? > > > > > > If we don't clear the Slot event quick enough then according to spec. statement above > > > that event could get ignored and SW may never get notified. > > > > > > > > > > > > > But I do think that in the resulting code, the connection between > > > > > > > > (1) the events we learn from the interrupt and > > > > (2) the queued work items > > > > > > > > needs to be crystal clear. Right now it's a bit muddy because of > > > > things like the case you fixed: a work item that goes back and looks > > > > at PCI_EXP_SLTSTA after it's been cleared and the hardware may have > > > > already set bits for new events. > > > > > > > > > > I have made a prototype patch in my follow up reply and tested ok on my existing setup > > > on which I caught the previous issue although I cant say it will work on any HW. > > > Clearing events after queuing events gave some problems and did not work properly on my > > > existing HW where I test very fast insertion and removal events, in which > > > case only "removal" event comes and "insertion" does not occur even if HW gets powered. > > > > Intel Deutschland GmbH Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany Tel: +49 89 99 8853-0, www.intel.de Managing Directors: Christin Eisenschmid, Christian Lamprechter Chairperson of the Supervisory Board: Nicole Lau Registered Office: Munich Commercial Register: Amtsgericht Muenchen HRB 186928 ��.n��������+%������w��{.n�����{���"�)��jg��������ݢj����G�������j:+v���w�m������w�������h�����٥