On Wed, Apr 20, 2016 at 08:10:28PM +0200, H. Nikolaus Schaller wrote: > Hi, > > > Am 20.04.2016 um 19:49 schrieb Dmitry Torokhov <dmitry.torokhov@xxxxxxxxx>: > > > > On Wed, Apr 20, 2016 at 11:22:53AM +0200, H. Nikolaus Schaller wrote: > >> > >>> Am 19.04.2016 um 10:08 schrieb H. Nikolaus Schaller <hns@xxxxxxxxxxxxx>: > >>> > >>> > >>>> Am 19.04.2016 um 10:01 schrieb Dmitry Torokhov <dmitry.torokhov@xxxxxxxxx>: > >>>> > >>>> On Tue, Apr 19, 2016 at 09:49:01AM +0200, H. Nikolaus Schaller wrote: > >>>>> > >>>>>> Am 18.04.2016 um 23:20 schrieb Dmitry Torokhov <dmitry.torokhov@xxxxxxxxx>: > >>>>>> > >>>>>> On Mon, Apr 18, 2016 at 09:55:41PM +0200, H. Nikolaus Schaller wrote: > >>>>>>> The mutex does not seem to be needed. > >>>>>> > >>>>>> twl6040_vibra_suspend() and vibra_play_work() may run concurrently, no? > >>>>> > >>>>> Hm. I don't know about the rule that would give an answer to this question... > >>>> > >>>> Sorry, that was actually a statement, not really a question. > >>> > >>> Indeed. In doubt about the answer we should take measures for the worst case. > >>> > >>>> It is > >>>> possible (although very unlikely) that userspace posts play request and > >>>> workqueue will not run until after suspend callback. > >>>> > >>>> Thinking about it some more I wonder if we better do what > >>>> twl6040_vibra_close() does and cancel the work before shutting off the > >>>> device, so that there is no chance of work executing after suspend > >>>> callback and reenabling the device. This way we can indeed remove the > >>>> mutex. > >>> > >>> Ok, I am fine with this. > >>> > >>> Will post an update ASAP. > >> > >> While doing testing I did run again into another locking related issue which I > >> had not yet tried to address with my patch set. It is a: > >> > >> BUG: scheduling while atomic > >> > >> report which sometimes ends in a kernel panic. > >> > >> I have attached such a log. vibra.py is here: > >> > >> http://git.goldelico.com/?p=gta04-kernel.git;a=blob;f=Letux/root/vibra.py;hb=refs/heads/letux-4.6-rc4 > >> > >> Basically it does an ioctl(EVIOCSFF) which triggers vibra_play. > >> > >> Maybe, can you decipher from the log what the reason could be? > >> > >> I only conjecture that it happens when vibra_play tries to read the regmap > >> of the twl6040 to get twl6040_get_vibralr_status (which has no pendant > >> in the twl4030 driver). > >> > >> Do we have to configure the twl6040 regmap differently? > > > > Right, vibra_play is called with interrupts disabled (that's why we have > > workqueue to enable/disable regulators, etc, when we start or stop > > vibration), so twl6040_get_vibralr_status() should not sleep, but > > apparently it does. > > Yes, regmap using i2c communication may sleep. > > > Maybe the check for audio configuration should be > > moved into vibra_play_work(). > > Hm. It is there to disable while in audio routing mode, but > a workqueue can't report error values back to the scheduling thread... Nothing checks result of ->play() anyways, so that -EBUSY was completly useless. > > So we can either silently make vibra not work (or just report > in the kernel log) if "Vibra is configured for audio". We might want to ratelimit that message. > > Or we need to get a different mechanism to know the vibra status. > > Hm. > > Research shows that it regmap was introduced long ago but between 3.11 and > 3.12 a private cache for these control registers was removed. > > http://lxr.free-electrons.com/source/drivers/mfd/twl6040.c?v=3.11#L66 > http://lxr.free-electrons.com/source/drivers/mfd/twl6040.c?v=3.11#L85 > http://lxr.free-electrons.com/source/drivers/mfd/twl6040.c?v=3.11#L462 > > This had been non-blocking and was probably there exactly to protect > against this locking issue! > > After removing the private cache the code must rely on twl6040_get_vibralr_status > not fetching from the chip. > > Ah, this correlates to that I see this issue only once and then everything > works. > > This means we have to fetch the current vibra control registers once > outside of vibra_play(). Probably during probing by a single call to > twl6040_get_vibralr_status() and ignoring the result. > > After it has been fetched once (to know any status from the last reboot) > the regmap should track all changes arriving through the sound subsystem > (audio vibra enable) and the call to twl6040_get_vibralr_status in interrupt > context should no longer block. > > Does this sound reasonable? Yes, as long as you document this so that it does not get removed by mistake later. Thanks. -- Dmitry -- To unsubscribe from this list: send the line "unsubscribe linux-input" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html