Re: [PATCH 000/182] Rid struct gpio_chip from container_of() usage

Johan Hovold <johan@xxxxxxxxxx> · Mon, 14 Dec 2015 13:46:26 +0100

On Wed, Dec 09, 2015 at 03:46:00PM +0100, Linus Walleij wrote:
> On Wed, Dec 9, 2015 at 2:44 PM, Russell King - ARM Linux
> <linux@xxxxxxxxxxxxxxxx> wrote:

> > On Wed, Dec 09, 2015 at 02:08:35PM +0100, Linus Walleij wrote:
> >> Because we want to have a proper userspace ABI for GPIO chips,
> >> which involves using a character device that the user opens
> >> and closes. While the character device is open, the underlying
> >> kernel objects must not go away.
> >
> > Okay, so you stop the gpio_chip struct from going away.
> 
> Actually I was going to create a new struct gpio_device, but yes.
> 
> > What
> > about the code which is called via gpio_chip - say, if userspace
> > keeps the chardev open, and someone rmmod's the driver providing
> > the GPIO driver.
> 
> The idea was that when what is today gpiochip_remove() is called by the
> gpiochip driver, the static data pointer to the vtable with all
> callbacks is set to NULL (in some safe way), and the code avoids
> calling these, so the device goes numb.
> 
> I think you give me the right solution to this "safe way" below,
> thanks!
> 
> > I'm not sure that splitting up objects in this way really solves
> > anything at all.  Yes, it divorses the driver's private data from
> > the subsystem data, but is that really an advantage?
> 
> I like the design pattern you describe below, and
> I have no strong opinion on it. If there is a precedent in the kernel
> to do it with a separate alloc_foo() function I can do that.
> 
> (Would like Greg's and/or Johan's opinion on this though.)

I'd prefer separating allocation and registration rather than using a
setup callback.

> > Things get a little more complex with gpio, because there's the
> > issue that some methods are spinlocked while others can take
> > semaphores, but it should be possible to come up with a solution
> > to that - maybe an atomic_t which is incremented whenever we're
> > in some operation provided it's >= 0 (otherwise it fails), and
> > decremented when the operation completes.  We can then control
> > in the unregistration path further GPIO accesses, and also
> > prevent new accesses occuring by setting the atomic_t to -1.
> > This shouldn't require any additional locking in any path.  It
> > does mean that the unregistration path needs careful thought to
> > ensure that when we set it to -1, we wait for it to be dropped
> > by the appropriate amount.
> 
> Yeah we will both in spinlocks/hotpath and
> semaphores/mutexes/slowpath in these ops for sure :/
> 
> The atomic hack should work: I understand that you mean
> we read it and set it to -1 and if it was 2 wait for it to
> become -3, then conclude unregistration with callbacks
> numbed.

Ok, but let's take a step back. So you have all this in place and a
consumer calls gpiod_get_value() that returns an errno because the device
is gone. Note that this wasn't even possible before e20538b82f1f ("gpio:
Propagate errors from chip->get()") that went into *v4.3*, and I assume
most drivers would need to be updated to even handle that that gpio
call, and all future calls, are now suddenly failing.

So this ties into the generic problem of inter-device dependencies. Does
it even make sense to keep the consumer around, now that its gpio
resources have gone away?  

If your current concern is a new gpio chardev interface, perhaps solving
only that use case in the way that Dimitry suggested elsewhere in this
thread is what you should be doing.

> Then there is a particular case that occurs with USB or similar
> pluggable buses, where there is a glitch or powercycle on the
> bus, and the same device goes away and comes back in
> a few milliseconds, and that means it should reattach to the
> character device that is already open.

No, that does not follow. A USB device being disconnected and
reconnected is considered to be a new device. All state is gone, and the
dangling character device will be unusable.

> Making that work is however non-trivial :(

Fortunately, you don't need to worry about that.

Johan