Re: [PATCH V4 22/28] PCI: tegra: Access endpoint config only if PCIe link is up

On Tue, Jun 18, 2019 at 02:32:59PM +0200, Johannes Berg wrote:
> I got to this thread really late I guess :-)
> 
> On Tue, 2019-06-18 at 12:49 +0200, Thierry Reding wrote:
> 
> > > > > > > > > > 1. WiFi devices provides power-off feature for power saving
> > > > > > > > > > in mobiles.  When WiFi is turned off we shouldn't power on
> > > > > > > > > > the HW back without user turning it back on.
> 
> But why would you disconnect the PCIe device just to power it down?!

It's a side-effect of asserting that W_DISABLE pin that the bus link
basically goes down. We've had a similar case recently, one that we
haven't quite solved either, where an RTL8169 Ethernet controller is
hooked up to a GPIO that controls the ISOLATEB (I think that was the
name) pin. If that pin is asserted, according to the documentation,
the device stops sampling/driving the PCI signals. So for all intents
and purposes it becomes disconnected.

We could kind of deal with this if the ISOLATEB was deasserted at probe
time, because that would mean that the device is at least enumerated on
PCI. Then when we go into some power down mode (for example when the
interface is taken down), the NIC driver could assert the GPIO and on
resuming from the power down mode deassert it again. Logically the
device would stay around, we just couldn't talk to it over PCI until the
driver has deasserted the ISOLATEB GPIO.

The problem is that it's not exactly defined what the status of the pin
would be at probe time. If it is asserted, the NIC will never show up on
the PCI bus and hence no driver would be registered that could deassert
the ISOLATEB signal. Well, unless we somehow created a "placeholder" PCI
device based on a device tree node (containing a reference to the GPIO)
so that the device would be enumerated (and probed) regardless of the
PCI link. There's no infrastructure to do that currently, but perhaps
worth investigating.

I think the W_DISABLE is somewhat similar. From what Manikanta was
saying, the PCI link also goes down when the pin is asserted, so we
lose any means of communicating with it over PCI.

The issue that Manikanta was trying to solve with this particular patch
was that since the PCI device is part of the PCI device hierarchy, some
userspace tools (X server, for example) will see it and try to discover
whether it's a GPU or not. This in turn causes errors from the PCI host
controller because it's trying to access a device behind a link that's
down. That, I assume, could also happen for the ISOLATEB case that I was
describing above, though it hasn't been brought up, I think.

> > > > > > > The problem that Manikanta is trying to solve here occurs in
> > > > > > > this situation (Manikanta, correct me if I've got this wrong):
> > > > > > > on some setups, a WiFi module connected over PCI will toggle a
> > > > > > > power GPIO as part of runtime suspend. This effectively causes
> > > > > > > the module to disappear from the PCI bus (i.e. it can no longer
> > > > > > > be accessed until the power GPIO is toggled again).
> > > > > > 
> > > > > > GPIO is toggled as part of WiFi on/off, can be triggered from
> > > > > > network manager UI.
> 
> That's kinda icky, IMHO.

Isn't that kind of the point of rfkill? I seem to remember having a
notebook where this was done exactly the same way. There was also a
button/switch that you could push which would result in the WiFi device
either going away completely or at the least losing the WiFi link. It
seems like that's exactly what Manikanta is describing.

> > > > > > Correct, rfkill switch should handle the GPIO.
> > > > > > Sequence will be,
> > > > > >  - WiFi ON
> > > > > >    - rfkill switch enables the WiFi GPIO
> > > > > >    - Tegra PCIe receives hot plug event
> > > > > >    - Tegra PCIe hot plug driver rescans PCI bus and enumerates the device
> > > > > >    - PCI client driver is probed, which will create network interface
> > > > > >  - WiFi OFF
> > > > > >    - rfkill switch disables the WiFi GPIO
> > > > > >    - Tegra PCIe receives hot unplug event
> > > > > >    - Tegra PCIe hot plug driver removes PCI devices under the bus
> > > > > >    - PCI client driver remove is executed, which will remove
> > > > > >      network interface
> > > > > > We don't need current patch in this case because PCI device is not
> > > > > > present in the PCI hierarchy, so there cannot be EP config access
> > > > > > with link down.  However Tegra doesn't support hot plug and unplug
> > > > > > events. I am not sure if we have any software based hot plug event
> > > > > > trigger.
> 
> Looks reasonable to me.
> 
> I guess if you absolutely know in software when the device is present or
> not, you don't need "real" PCIe hotplug, just need to tickle the
> software right?

Right.
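
For experimenting, that kind of software "tickle" can already be done by
hand from userspace through the PCI sysfs attributes. A rough sketch,
assuming the usual /sys/bus/pci/rescan and per-device "remove" files; the
helper names and whatever bus address is passed in are made up for
illustration, and none of this is part of the series:

```c
#include <assert.h>
#include <stdio.h>

/* Write "1" to a sysfs attribute. Returns 0 on success, -1 on failure. */
int pci_sysfs_poke(const char *path)
{
	FILE *f;
	int ok;

	assert(path != NULL);

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fputs("1", f) >= 0;
	/* buffered writes may only fail when flushed, so check fclose() too */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}

/* Build the path of the "remove" attribute for a device such as 0000:01:00.0. */
int pci_remove_path(char *buf, size_t len, const char *bdf)
{
	int n = snprintf(buf, len, "/sys/bus/pci/devices/%s/remove", bdf);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Software "hot unplug": drop the device from the PCI hierarchy. */
int pci_soft_unplug(const char *bdf)
{
	char path[128];

	if (pci_remove_path(path, sizeof(path), bdf) < 0)
		return -1;

	return pci_sysfs_poke(path);
}

/* Software "hot plug": ask the PCI core to rescan all buses. */
int pci_soft_plug(void)
{
	return pci_sysfs_poke("/sys/bus/pci/rescan");
}
```

Both calls need root. Something would call pci_soft_unplug() around the
time the GPIO takes the device off the bus and pci_soft_plug() once the
device is powered again.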

> > > > How does rfkill work?  It sounds like it completely removes power from
> > > > the wifi device, putting it in D3cold.  Is there any software
> > > > notification other than the "Slot present pin change" (which looks
> > > > like a Tegra-specific thing)?
> 
> Well, they said above it's a GPIO that controls it, so the software
> already knows and doesn't really need an event?

We still need to communicate from rfkill to the PCI host controller that
something happened, since they are two different entities.

> > > The rfkill subsystem provides a generic interface for disabling any radio
> > > transmitter in the system. WiFi M.2 form factor cards provide W_DISABLE
> > > GPIO to control the radio transmitter
> 
> But it depends on the hardware how this is handled, Intel NICs for
> example just trigger an IRQ to the host and don't turn off much, for
> them the W_DISABLE pin is just a GPIO in input mode, with edge triggered
> interrupt to the driver.

Okay, so does this mean you have some input device connected to the WiFi
device that will be used (without software intervention) to disable the
transmitter and then the WiFi device will signal using the W_DISABLE pin
that the transmitter was indeed disabled?

> > > and I have seen some cards provide
> > > control to turn off complete chip through this GPIO. 
> 
> I never heard of this. Which NICs are we talking about?
> 
> > Perhaps what we need here is some sort of mechanism to make rfkill and
> > the PCI host controller interoperate? I could imagine for example that
> > the PCI host controller would get a new "rfkill" property in device
> > tree that points at the rfkill device via phandle.
> 
> But you don't know which the rfkill device is, do you?
> 
> I mean, fundamentally, you just have a GPIO that turns on and off the
> W_DISABLE pin. NICs will not generally disappear from the bus when
> that's turned on, so you need a NIC driver integration.

I think that's the main problem that we're trying to solve. In our case
it does seem like the device completely disappears from the bus.

> I guess you also have an rfkill-gpio driver assigned to this GPIO, which
> gets assigned there via DT/platform code?

Yes, I think that's correct. Manikanta, please confirm.

> Ah, but then I guess you could have a phandle in the DT or so that ties
> the W_DISABLE-GPIO with the PCIe slot that it controls.

Right, that's what I was thinking.

> > The driver could then get a reference to it using something like:
> > 
> > 	rfkill = rfkill_get(dev);
> > 	if (IS_ERR(rfkill)) {
> > 		...
> > 	}
> > 
> > and register for notification:
> > 
> > 	err = rfkill_subscribe(rfkill, callback);
> > 	if (err < 0) {
> > 		...
> > 	}
> > 
> > rfkill_unsubscribe() and rfkill_put() would then be used upon driver
> > unload to detach from the rfkill.
> 
> This I don't understand.

This was just an example of what I was imagining. The network driver
would get an rfkill (looked up via device tree phandle) and subscribe to
receive events from it, so that it could be notified when the rfkill is
"blocked" and rescan the bus to get the WiFi device unplugged. Once
unblocked it would be notified again and rescan the bus so that the
device would reappear.
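
To make that a bit more concrete, here is a toy userspace model of the
subscribe/notify flow. Everything in it is hypothetical: the mock_*
types and functions only mimic the rfkill_subscribe() idea from above,
they are not an existing rfkill interface, and a counter stands in for
the actual bus rescan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an rfkill switch; not the kernel's struct rfkill. */
struct mock_rfkill {
	bool blocked;
	void (*notify)(bool blocked, void *data);
	void *data;
};

/* Toy stand-in for the subscriber's view of the PCI slot. */
struct mock_pci_slot {
	bool device_present;
	unsigned int rescans;
};

/* Hypothetical: register a single callback for block/unblock events. */
int mock_rfkill_subscribe(struct mock_rfkill *rfkill,
			  void (*notify)(bool blocked, void *data),
			  void *data)
{
	if (!rfkill || !notify)
		return -1;
	if (rfkill->notify)
		return -1;	/* only one subscriber in this toy model */

	rfkill->notify = notify;
	rfkill->data = data;
	return 0;
}

/* Hypothetical: detach the callback again, e.g. on driver unload. */
void mock_rfkill_unsubscribe(struct mock_rfkill *rfkill)
{
	assert(rfkill != NULL);
	rfkill->notify = NULL;
	rfkill->data = NULL;
}

/* Change the state and notify the subscriber on real transitions only. */
void mock_rfkill_set_blocked(struct mock_rfkill *rfkill, bool blocked)
{
	if (rfkill->blocked == blocked)
		return;

	rfkill->blocked = blocked;
	if (rfkill->notify)
		rfkill->notify(blocked, rfkill->data);
}

/* Subscriber callback: "rescan" so the device is unplugged or reappears. */
void mock_slot_rfkill_notify(bool blocked, void *data)
{
	struct mock_pci_slot *slot = data;

	slot->rescans++;
	slot->device_present = !blocked;
}
```

The point is only that the subscriber learns about every block/unblock
transition and can rescan at exactly those moments.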

> > I noticed that there's an rfkill-gpio driver (net/rfkill/rfkill-gpio.c)
> > that already does pretty much everything that we need, except that it
> > doesn't support DT yet, but I suspect that that's pretty easy to add.
> 
> Oh, good point, no DT support here - so how *do* you actually
> instantiate the rfkill today??

I suspect that we've got downstream patches for that. The patch here is
part of a series to upstream support for this. I haven't seen the patch
for rfkill-gpio, but perhaps that's queued for later.

> > Johannes, any thoughts on this. In a nutshell what we're trying to solve
> > here is devices that get removed from/added to PCI based on an rfkill-
> > type of device. The difference to other implementations is that we have
> > no way of detecting when the device has gone away (PCI hotplug does not
> > work). So we'd need some software-triggered mechanism to let the PCI
> > host controller know when the device is presumably going away or being
> > added back, so that the PCI bus can be rescanned and the PCI device
> > removed or added at that point.
> 
> Right.
> 
> So, I'm not even sure we need the *driver* to do anything other than say
> "I know the device will drop off the bus when rfkill is enabled", right?
> 
> 
> But do we actually need rfkill to be involved here?
> 
> I mean, let's say first we make rfkill-gpio DT-aware, rather than just
> ACPI. This should be simple. Then it drives a GPIO (it can actually
> drive two and a clock, not sure I know why).
> 
> Now, next we need something that says that the device should be treated
> as hotplug/unplug. We could make this in the driver somehow like you
> suggested, but that seems like a lot of effort?
> 
> Couldn't we put this into the *GPIO* subsystem instead?
> 
> I mean - conceivably there could be GPIOs that just power down a device
> for example. Not even through something like W_DISABLE, but just having
> a GPIO hooked up to a transistor on the voltage pin of the device. That
> would have very similar semantics?
> 
> So why not just attach the PCIe device/port to the GPIO, and have the
> GPIO implementation here call the detach/attach (or detach/rescan?) when
> they are toggled?
> 
> Not that I'd mind having it in rfkill! But it seems like a special case
> to have it there, when you can do so much more with GPIOs.

Yeah, that's where things become a little muddy. For the ISOLATEB case
there was initially a similar proposal. The problem is that on one hand
we can have different semantics for these pins. On one platform this
could be a kind of "power" GPIO, on others it could be ISOLATE/DISABLE,
and on yet others it would be more like a reset. In order to make the
PCIe port aware of the differences we'd have to expose multiple GPIOs in
DT for context.

The other problem with this is that, in order to avoid the chicken-and-
egg problem, we need to associate these GPIOs with the root ports,
because those are the only ones that exist at probe time. All downstream
devices may not be available because the power/reset/disable pin is not
asserted/deasserted yet. Now, you could potentially have a switch in the
downstream hierarchy, so it becomes completely unclear what exact device
the GPIO is associated with.

Related to that, a GPIO like this is really only useful if you can make
use of it. For example you want to assert/deassert this GPIO in order to
put the WiFi/Ethernet/whatever device into a low-power mode when it is
not used, right? But in order to do so, the driver for that device needs
to be able to handle the GPIO, because it is the only one that knows the
right point in time to toggle it. Conversely, if this was associated
with the root port, the only point in time where the root port driver
could toggle it is on a suspend/resume of the entire bus, which makes it
rather useless.

But then we're back to square one where we basically have to associate
the GPIO with the specific device. I think that's the right thing to do
because, well, that's what reality is. The GPIO is directly routed to a
pin on the chip. It's not something that goes over the PCI connector or
anything. However, we're also back to the chicken-and-egg problem since
without toggling the GPIO the device might not even get enumerated.

rfkill-gpio has the advantage that it decouples this and gets us out of
the chicken-and-egg situation. It also has fairly well-defined semantics
and fits the use-case, so it's a very appealing option.

Thierry
