On 7/5/23 16:30, Johan Hovold wrote: > On Wed, Jul 05, 2023 at 02:53:17PM +0200, Pierre-Louis Bossart wrote: >> On 7/5/23 14:30, Johan Hovold wrote: >>> The soundwire subsystem uses two completion structures that allow >>> drivers to wait for soundwire device to become enumerated on the bus and >>> initialised by their drivers, respectively. >>> >>> The code implementing the signalling is currently broken as it does not >>> signal all current and future waiters and also uses the wrong >>> reinitialisation function, which can potentially lead to memory >>> corruption if there are still waiters on the queue. >> >> That change sounds good, but I am not following the two paragraphs below. >> >>> Not signalling future waiters specifically breaks sound card probe >>> deferrals as codec drivers can not tell that the soundwire device is >>> already attached when being reprobed. >> >> What makes you say that? There is a test in the probe and the codec >> driver will absolutely be notified, see bus_type.c >> >> if (drv->ops && drv->ops->update_status) { >> ret = drv->ops->update_status(slave, slave->status); >> if (ret < 0) >> dev_warn(dev, "%s: update_status failed with status %d\n", __func__, >> ret); >> } > > I'm talking about signalling the codec driver using the soundwire device > via the completion structs. Unless the underling device is detached and > reattached, trying to wait for completion a second time will currently > timeout instead of returning immediately. > > This affects codecs like rt5682, which wait for completion in component > probe (see rt5682_probe()). > >>> Some codec runtime PM >>> implementations suffer from similar problems as waiting for enumeration >>> during resume can also timeout despite the device already having been >>> enumerated. >> >> I am not following this either. Are you saying the wait_for_completion() >> times out because of the init_completion/reinit_completion confusion, or >> something else. > > It times out because the completion counter is not saturated unless you > use complete_all(). > > Drivers that wait unconditionally in resume, will time out the second > time they are runtime resumed unless the underlying device has been > detached and reattached in the meantime (e.g. wsa881x_runtime_resume()). Makes sense. The default on Intel platforms is to reset the bus in all resume cases, that forces the attachment so we never saw the issue. For this patch: Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@xxxxxxxxxxxxxxx>