Re: [PATCH 6/6] soundwire: qcom: do not probe devices before bus/link init

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 5/1/23 07:24, Krzysztof Kozlowski wrote:
> On 20/04/2023 23:37, Pierre-Louis Bossart wrote:
>>
>>
>> On 4/20/23 05:16, Krzysztof Kozlowski wrote:
>>> Soundwire devices are supposed to be kept in reset state (powered off)
>>> till their probe() or component bind() callbacks.  However if they are
>>> already powered on, then they might enumerate before the master
>>> initializes bus in qcom_swrm_init() leading to occasional errors like:
>>
>> The problem statement is really hard to follow.
>>
>> The peripheral can only be enumerated AFTER
>> a) the manager starts the bus clock and transmitting PING frames
>> b) the peripheral detects the sync words for 16 frames in a row.
>> c) the peripheral reports as Attached in the Device0 slot
>>
>> That sequence holds whether the manager does the enumeration manually or
>> relies on hardware-assisted autoenumeration. This is what the spec requires.
>>
>> So why can't the bus clock start be controlled by the manager driver,
>> and started once all required initializations are done?
>>
>> I mean, there's got to be some sort of parent-child hierarchy with
>> manager first, peripheral(s) second, I don't get how these steps could
>> be inverted or race.
>>
>>>   qcom-soundwire 6d30000.soundwire-controller: Qualcomm Soundwire controller v2.0.0 Registered
>>>   wcd938x_codec audio-codec: bound sdw:0:0217:010d:00:4 (ops wcd938x_sdw_component_ops)
>>>   wcd938x_codec audio-codec: bound sdw:0:0217:010d:00:3 (ops wcd938x_sdw_component_ops)
>>>   qcom-soundwire 6ad0000.soundwire-controller: swrm_wait_for_wr_fifo_avail err write overflow
>>>
>>> The problem primarily lies in Qualcomm Soundwire controller probe() sequence:
>>> 1. request_threaded_irq()
>>> 2. sdw_bus_master_add() - which will cause probe() and component bind()
>>>    of Soundwire devices, e.g. WCD938x codec drivers.  Device drivers
>>>    might already start accessing their registers.
>>
>> not if the bus clock hasn't started...
>>
>>> 3. qcom_swrm_init() - which initializes the link/bus and enables
>>>    interrupts.
>>
>> if you can move the clock start in 3) then problem solved. Why can't
>> this be done?
> 
> Responding to all your three responses:
> The clock is enabled in this 3. qcom_swrm_init(), so the old code to my
> knowledge is written exactly how you expect.
> 
> However even with stopped clock, the device enumerates at
> sdw_bus_master_add(), before anything is enabled.

Erm, that's not physically possible...

The peripheral can report as attached and be enumerated by the manager,
i.e. assigned a non-zero "Device Number" after the peripheral
synchronizes on 16 frames with valid static and dynamic syncwords. That
can only happen if there is a clock toggling and PING frames transmitted
on the data line.

There's something else at play here.

> I also checked the reset values of these registers - clock is off after
> reset. Assuming of course I look at correct clock registers... but I
> have only one.
> 
>>
>>> Any access to device registers at (2) above, will fail because link/bus
>>> is not yet initialized.
>>>
>>> However the fix is not as simple as moving qcom_swrm_init() before
>>> sdw_bus_master_add(), because this will cause early interrupt of new
>>> slave attached.  The interrupt handler expects bus master (ctrl->bus.md)
>>> to be allocated, so this would lead to NULL pointer exception.
>>>
>>> Rework the init sequence and change the interrupt handler.  The correct
>>> sequence fixing accessing device registers before link init is now:
>>> 1. qcom_swrm_init()
>>> 2. request_threaded_irq()
>>> 3. sdw_bus_master_add()
>>> which still might cause early interrupts, if Soundwire devices are not
>>> in powered off state before their probe.  This early interrupt issue is
>>
>> You'd need to clarify in which step the bus clock starts. In general,
>> you want to clock started last.
> 
> Clock is enabled in qcom_swrm_init() step, but as I wrote above, it
> looks like it does not matter for enumeration.

without a clock you can't have any enumeration.


>>> +	 * from reset by its probe() or bind() function, as a result of
>>> +	 * sdw_bus_master_add().
>>> +	 * Add a simple check to avoid NULL pointer except on early interrupts.
>>> +	 * Note that if this condition happens, the slave device will not be
>>> +	 * enumerated. Its driver should be fixed.
>>
>> ???
>>
>> The codec driver is NEVER involved in enumeration.
> 
> If the device stays in power down, only the driver bind can bring it on.
> enumeration won't happen when device is powered down, right?

The codec driver can indeed control the codec power with sideband links
- i.e. not with SoundWire but gpios/I2C/SPI, etc. - and it could very
well prevent the codec hardware from showing up on the bus until TBD
platform-specific criteria are met.

But that's not really taking part in the enumeration process, rather
gating the enumeration process.

>> The only thing a codec driver should do is provide a callback to be
>> notified of a status change for additional initialization, but the
>> enumeration can be done even in the absence of a codec driver.
>>
>> The proof in the pudding is that you can 'blacklist' a codec driver and
>> bind it later, after the hardware is enumerated. You can even unbind a
>> codec driver and nothing bad will happen (we hardened that sequence last
>> year).
>>
>> probe != enumeration != initialization for SoundWire codecs.
>>
>> Probe and enumeration can happen in any order
>> Initialization can only happen after both probe and enumeration happened.
> 
> I am speaking here about component_master_ops->bind() callback.

That's on the manager side, I really don't see how this is related to
the codec?

>>> +	 * removing Soundwire bus master.
>>> +	 */
>>> +	if (ctrl->version < SWRM_VERSION_2_0_0)
>>> +		ctrl->reg_write(ctrl, ctrl->reg_layout[SWRM_REG_INTERRUPT_MASK_ADDR],
>>> +				0);
>>> +	if (ctrl->mmio)
>>> +		ctrl->reg_write(ctrl, ctrl->reg_layout[SWRM_REG_INTERRUPT_CPU_EN],
>>> +				0);
>>> +
>>> +	cancel_delayed_work_sync(&ctrl->new_slave_work);
>>>  	sdw_bus_master_delete(&ctrl->bus);
>>>  	clk_disable_unprepare(ctrl->hclk);
>>
>> should the last two be inverted? Keeping a clock running while removing
>> stuff is asking for trouble.

actually it doesn't really matter, if the interrupts are disabled first
and you wait for in-flight work to be done. Ignore this comment.
> 
> It is a reversed probe(), which is usually correct. Do you expect
> probe() like:
> 
> 	clk_enable
> 	sdw_bus_master_add

it's likely the other way around,

probe():
sdw_bus_master_add
clk_enable

assuming that clk_enable() starts the bus - not sure it does based on
the answers above.




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [Linux for Sparc]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux