Hi,

On Wed, Apr 3, 2019 at 10:04 AM Brian Norris <briannorris@xxxxxxxxxxxx> wrote:
>
> I know some of this was hashed out last night, but I wasn't reading my
> email then to interject ;)
>
> On Wed, Apr 3, 2019 at 9:05 AM Douglas Anderson <dianders@xxxxxxxxxxxx> wrote:
> > +static int cros_ec_xfer_high_pri(struct cros_ec_device *ec_dev,
> > +				  struct cros_ec_command *ec_msg,
> > +				  cros_ec_xfer_fn_t fn)
> > +{
> > +	struct cros_ec_xfer_work_params params;
> > +
> > +	INIT_WORK(&params.work, cros_ec_xfer_high_pri_work);
> > +	params.ec_dev = ec_dev;
> > +	params.ec_msg = ec_msg;
> > +	params.fn = fn;
> > +	init_completion(&params.completion);
> > +
> > +	/*
> > +	 * This looks a bit ridiculous.  Why do the work on a
> > +	 * different thread if we're just going to block waiting for
> > +	 * the thread to finish?  The key here is that the thread is
> > +	 * running at high priority but the calling context might not
> > +	 * be.  We need to be at high priority to avoid getting
> > +	 * context switched out for too long and the EC giving up on
> > +	 * the transfer.
> > +	 */
> > +	queue_work(system_highpri_wq, &params.work);
>
> Does anybody know what the definition of "too long" is for the phrase
> "Don't queue works which can run for too long" in the documentation?

Using the docs that Matthias pointed at:

https://www.kernel.org/doc/html/v5.0/core-api/workqueue.html

...take a look at the "Example Execution Scenarios".  They show tasks
that burn CPU cycles for 5 ms, and those seem to be illustrating the
kind of work items workqueues are designed for.  That makes me feel
like 5 ms is not "too long".

I don't think we'll be running for that long in our case.  A few Tegra
device trees do have a big delay of 2000 us for
"google,cros-ec-spi-msg-delay", but I've always been of the opinion
that that delay was added to work around other bugs and probably isn't
really needed, so if this somehow regresses performance there we could
try tuning it down.  ...and even in that case we're only delaying for
2000 us = 2 ms.  The "google,cros-ec-spi-pre-delay" on rk3288-veyron
Chromebooks is 30 us.  There are some other delays hardcoded in the
driver, but those are all expressed in nanoseconds.

It looks like in the extreme case, though, our work could still take
200 ms to complete, per the comments on "EC_MSG_DEADLINE_MS".  IIRC
that's a super uncommon case and only happened on a few systems when
the battery was in a particularly wonky state and decided to do clock
stretching.  Also: we shouldn't even be busy-waiting during that time
since most of it will be spent sitting (presumably blocked) waiting for
SPI bytes to come in.

So I think the summary is that I'm back to being convinced that
"system_highpri_wq" should be OK.  Running lots of non-blocking work on
the system high priority workqueue can truly block other work items
there, but the amount of time we're running looks to be within the
expected bounds.  Our work also isn't going to fully block other
workers--they can still run on other CPUs and (if they're already
running) presumably they can still be scheduled back in.

> > +	wait_for_completion(&params.completion);
>
> I think flush_workqueue() was discussed and rejected, but what about
> flush_work()?  Then you don't have to worry about the rest of the
> contents of the workqueue -- just your own work--and I think you could
> avoid the 'completion'.
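That seems cleaner to me.  For clarity, here's roughly what I'm
picturing -- just an untested sketch, not the actual v3 patch, and I'm
assuming a "ret" field in cros_ec_xfer_work_params for the work
function to stash its result in (the snippet above doesn't show how the
return value gets plumbed back):

static int cros_ec_xfer_high_pri(struct cros_ec_device *ec_dev,
				 struct cros_ec_command *ec_msg,
				 cros_ec_xfer_fn_t fn)
{
	struct cros_ec_xfer_work_params params;

	INIT_WORK(&params.work, cros_ec_xfer_high_pri_work);
	params.ec_dev = ec_dev;
	params.ec_msg = ec_msg;
	params.fn = fn;

	/*
	 * Run the transfer from the system high priority workqueue so
	 * we don't get context switched out for long enough that the
	 * EC gives up on the transfer.
	 */
	queue_work(system_highpri_wq, &params.work);

	/*
	 * flush_work() doesn't return until the handler has completely
	 * finished executing, so there's no separate completion to
	 * signal and no window where the work item is still considered
	 * busy after we return.
	 */
	flush_work(&params.work);

	/* Assumes the work function stored its result in params.ret */
	return params.ret;
}

(We should probably also use INIT_WORK_ONSTACK() since the work item
lives on the stack, but I've kept INIT_WORK() above to match the
original snippet.)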
>
> You might also have a tiny race in the current implementation, since
> (a) you can't queue the same work item twice and
> (b) technically, the complete() call is still while the work item is
> running -- you don't really guarantee the work item has finished
> before you continue.
> So the combination of (a) and (b) means that moving from one xfer to
> the next, you might not successfully queue your work at all.  You could
> probably test this by checking the return value of queue_work() under
> a heavy EC workload.  Avoiding the completion would also avoid this
> race.

Thanks, I'll do that in v3 assuming someone doesn't convince me to
switch back to a private workqueue.

-Doug