On Fri, Jun 09, 2017 at 09:40:47AM +0200, Martin Fuzzey wrote:
> On 09/06/17 03:57, Luis R. Rodriguez wrote:
> > On Thu, Jun 8, 2017 at 6:10 PM, Luis R. Rodriguez <mcgrof@xxxxxxxxxx> wrote:
> > > > Android didn't send the signal, the kernel did (SIGCHLD).
> > > >
> > > > Like this:
> > > >
> > > > 1) Android init (pid=1) fork()s (say pid=42) [this child process is totally
> > > > unrelated to firmware loading]
> > > > 2) Android init (pid=1) does a write() on a (driver custom) sysfs file which
> > > > ends up calling request_firmware() kernel side
> > > > 3) The firmware loading fallback mechanism is used, the request is sent to
> > > > userspace and pid 1 waits in the kernel on wait_*
> > > > 4) before firmware loading completes pid 42 dies (for any reason - in my
> > > > case normal termination)
> > Martin just to be clear, by "normal case termination" do you mean
> > completing successfully ?? Ie the firmware actually did make it onto
> > the device ?
>
> The firmware did *not* make it onto the device since the request_firmware()
> call returned an error
> (the code that would have transferred it to the device is only executed
> following a successful request_firmware)
>
> The process that terminates normally is unrelated to firmware loading as I
> said above.
>
> The only things that matter are:
> - It is a child process of the process that calls request_firmware()
> - It terminates *while* the wait_ is still in progress
>
> Here is a way of reproducing the problem using the test_firmware module
> (which I only just saw) on normal linux with no Android or custom driver
>
> #!/bin/sh
> set -e
>
> # Make sure the system firmware loader doesn't get in the way
> /etc/init.d/udev stop
>
> modprobe test_firmware
>
> DIR=/sys/devices/virtual/misc/test_firmware
>
> echo 10 >/sys/class/firmware/timeout;
> sleep 2 &
> echo -n "/some/non/existing/file.bin" > "$DIR"/trigger_request;
>
> If run with the "sleep 2 &" it terminates after 2 seconds
> If the sleep is commented out it runs for the expected 10 seconds (the firmware
> loading timeout)
>
> Since the sleep process is a child of the script process requesting a
> firmware load its death causes a SIGCHLD causing request_firmware() to abort
> prematurely.

Thanks, this could mean we also *should* trigger a failure if init is issuing
modprobe on a series of drivers and one completes before another while
request_firmware() is called on init or probe on a subsequent driver.

If true I'm surprised this was never reported back when the fallback mechanism
was popular. I suppose it was not an issue given most firmware *was* present in
/lib/firmware/ and the direct filesystem lookup first step always found the
firmware first, so this would only be an issue for folks relying on the
fallback mechanism exclusively.

Will include a test case based on your above script. Thanks!

  Luis
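
A test case along those lines could time the request with and without a
short-lived child, rather than relying on watching the script by hand. The
following is only a rough sketch under stated assumptions: the helper names
(elapsed_seconds, trigger, trigger_with_child) and the NOFILE variable are made
up for illustration and are not taken from the kernel selftests, and "date +%s"
is assumed to be available. The sysfs path is the one from the script above.

```sh
#!/bin/sh
# Sketch only: helper names and NOFILE are illustrative assumptions.
# Running the trigger helpers needs root and the test_firmware module.

DIR=/sys/devices/virtual/misc/test_firmware
NOFILE=/some/non/existing/file.bin

# Print how many whole seconds the given command takes. Its exit status
# is ignored because the firmware request is expected to fail.
# Assumes "date +%s" (seconds since the epoch) is supported.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1 || true
    end=$(date +%s)
    echo $((end - start))
}

# Request a firmware file that does not exist. The shell itself does the
# write(), so it is the shell that waits in the kernel.
trigger() {
    printf '%s' "$NOFILE" > "$DIR"/trigger_request
}

# Same request, but a child of the requesting shell exits after 2 seconds,
# so SIGCHLD is delivered while the request is still pending.
trigger_with_child() {
    sleep 2 &
    trigger
}
```

With udev stopped, test_firmware loaded and the timeout set to 10 as in the
script above, "elapsed_seconds trigger" would be expected to print roughly 10.
If "elapsed_seconds trigger_with_child" prints roughly 2 instead, the request
was aborted early by the child's exit.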