On Wed, Nov 16, 2016 at 12:11:58PM -0600, Bjorn Helgaas wrote:
> Hi Johannes,
>
> On Wed, Nov 02, 2016 at 04:35:52PM -0600, Johannes Thumshirn wrote:
> > The Read Completion Boundary (RCB) bit must only be set on a device or
> > endpoint if it is set on the root complex.
>
> I propose the following slightly modified patch. The interesting
> difference is that your patch only touches the _HPX "OR" mask, so it
> refrains from *setting* RCB in some cases, but it never actually
> *clears* it. The only time we clear RCB is when the _HPX "AND" mask
> has RCB == 0.
>
> My intent below is that we completely ignore the _HPX RCB bits, and we
> set an Endpoint's RCB if and only if the Root Port's RCB is set.
>
> I made an ugly ASCII table to think about the cases:
>
>       Root   EP      _HPX  _HPX   Final Endpoint RCB state
>       Port   (init)  AND   OR     (curr)  (yours)  (mine)
>   0)   0      0       0     0       0       0        0
>   1)   0      0       0     1       1       0        0
>   2)   0      0       1     0       0       0        0
>   3)   0      0       1     1       1       0        0
>   4)   0      1       0     0       0       0        0
>   5)   0      1       0     1       1       0        0
>   6)   0      1       1     0       1       1        0
>   7)   0      1       1     1       1       1        0
>   8)   1      0       0     0       0       0        1
>   9)   1      0       0     1       1       1        1
>   A)   1      0       1     0       0       0        1
>   B)   1      0       1     1       1       1        1
>   C)   1      1       0     0       0       0        1
>   D)   1      1       0     1       1       1        1
>   E)   1      1       1     0       1       1        1
>   F)   1      1       1     1       1       1        1
>
> Cases 0-7 should all result in the Endpoint RCB being zero because the
> Root Port RCB is zero. Case 1 is the bug you're fixing. Cases 3 & 5
> are similar hypothetical bugs your patch also fixes.
>
> Cases 6 & 7, where firmware left the Endpoint RCB set and _HPX didn't
> tell us to clear it, are hypothetical firmware bugs that your patch
> wouldn't fix.
>
> In cases 8, A, and C, we currently leave the Endpoint RCB cleared,
> either because firmware left it clear and _HPX didn't tell us to set
> it (8 and A), or because firmware set it but _HPX told us to clear it
> (C).
>
> One could argue that 8, A, and C should stay as they currently are, as
> a way for _HPX to work around hardware bugs, e.g., a Root Port that
> advertises a 128-byte RCB but doesn't actually support it.
> I didn't bother with that and set the Endpoint's RCB to 128 in all
> cases when the Root Port claims to support it.
>
> It'd be great if you could test this and comment.

I've lost access to the machines, but I'll try to delegate it to someone
who has access.

> If you get a chance, collect the /proc/iomem contents, too. That's
> not for this bug; it's because I'm curious about the
>
>   ERST: Can not request [mem 0xb928b000-0xb928cbff] for ERST
>
> problem in your dmesg log.

I'll ask for this as well.

Byte,
	Johannes
-- 
Johannes Thumshirn                                          Storage
jthumshirn@xxxxxxx                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600  D0D0 0393 969D 2D76 0850
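
For anyone who wants to re-derive the table quoted above, here is a
minimal standalone C sketch of the three policies. It is not kernel code
and touches no config space; it only models each RCB bit as 0 or 1. The
function names (rcb_current, rcb_or_gated, rcb_follow_root,
rcb_print_table) are made up for illustration, and the boolean
expressions are my reading of the "(curr)", "(yours)" and "(mine)"
columns, not text taken from either patch.

```c
#include <assert.h>
#include <stdio.h>

/* Every argument models a single RCB bit, so it must be 0 or 1. */
static void check_bits(int root, int ep, int hpx_and, int hpx_or)
{
	assert((root | ep | hpx_and | hpx_or) >> 1 == 0);
}

/* "(curr)": apply the _HPX AND mask, then the OR mask; Root Port ignored. */
static int rcb_current(int root, int ep, int hpx_and, int hpx_or)
{
	check_bits(root, ep, hpx_and, hpx_or);
	return (ep & hpx_and) | hpx_or;
}

/* "(yours)": as above, but the OR mask may only set RCB if the Root Port has it. */
static int rcb_or_gated(int root, int ep, int hpx_and, int hpx_or)
{
	check_bits(root, ep, hpx_and, hpx_or);
	return (ep & hpx_and) | (hpx_or & root);
}

/* "(mine)": ignore the _HPX RCB bits and simply mirror the Root Port. */
static int rcb_follow_root(int root, int ep, int hpx_and, int hpx_or)
{
	check_bits(root, ep, hpx_and, hpx_or);
	return root;
}

/* Print all 16 cases in the same row order as the table (0..F). */
static void rcb_print_table(void)
{
	int i;

	for (i = 0; i < 16; i++) {
		int root = (i >> 3) & 1;
		int ep = (i >> 2) & 1;
		int hpx_and = (i >> 1) & 1;
		int hpx_or = i & 1;

		printf("%X)  %d %d %d %d   %d %d %d\n", i,
		       root, ep, hpx_and, hpx_or,
		       rcb_current(root, ep, hpx_and, hpx_or),
		       rcb_or_gated(root, ep, hpx_and, hpx_or),
		       rcb_follow_root(root, ep, hpx_and, hpx_or));
	}
}
```

Calling rcb_print_table() from a main() prints one row per case. Under
these assumptions, cases 6 and 7 are where rcb_or_gated() and
rcb_follow_root() disagree while the Root Port RCB is clear, and cases
8, A and C are where they disagree while it is set.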