On 3/14/2024 8:38 AM, Johan Hovold wrote:
On Wed, Mar 13, 2024 at 10:24:08AM -0700, Abhinav Kumar wrote:
On 3/13/2024 1:18 AM, Johan Hovold wrote:
Right, but your proposed fix would not actually fix anything and judging
from the sparse commit message and diff itself it is clearly only meant
to mitigate the case where user space is involved, which is *not* the
case here.
There can be a race condition between the time the DP driver gets the
hpd disconnect event and when the hpd thread processes that event
allowing the commit to sneak in. This is something which has always been
there even without pm_runtime series and remains even today.
In this race condition, the setting of "link_ready" to false can be a
bit delayed if we go through the HPD event processing increasing the
race condition window.
If link_ready is false, atomic_check() fails, thereby failing any
commits and hence not allowing the atomic_disable() / atomic_enable()
cycle and hence avoiding this reset.
The patch is moving the setting of link_ready to false earlier by not
putting it through the HPD event thread and hence trying to reduce the
window of the issue.
Perhaps I'm missing something in the race that you are trying to
describe (and which I've asked you to describe in more detail so that I
don't have to spend more time trying to come up with a reproducer
myself).
The race condition is between the time we get disconnect event and set
link_ready to false, a commit can come in. Because setting link_ready to
false happens in the event thread so it could be slightly delayed.
It will be hard to reproduce this. Only way I can think of is to delay
the EV_NOTIFICATION for sometime and see in dp_bridge_hpd_notify()
else if (dp_display->link_ready && status ==
connector_status_disconnected)
dp_add_event(dp, EV_HPD_UNPLUG_INT, 0, 0);
as dp_add_event() will add the event, then wakeup the event_q.
Before the event thread wakes up and processes this unplug event, the
commit can come in. This is the race condition i was thinking of.
I do understand how your patch works, but my point is that it does
not fix the race that we are hitting on sc8280xp and, unless I'm missing
something, it is not even sufficient to fix the race you are talking
about as user space can still trigger that ioctl() before you clear the
link_ready flag.
That's why I said that it is only papering over the issue by making the
race window smaller (and this should also be highlighted in the commit
message).
Yes, I have already accepted this part. It only reduces the race window
smaller.
For some reason it also made things worse on sc8280xp, but I did not
spend time on tracking down exactly why.
This part I agree. I need to check why sc8280xp again does not like this
patch. You dont have to spend time, I will do it and till then I will
hold this patch off.
Johan