On 01/01/2025 18:23, Rik van Riel wrote:
On Wed, 2025-01-01 at 16:15 +0000, Karim Manaouil wrote:
On Wed, Jan 01, 2025 at 05:20:01PM +0200, Nadav Amit wrote:
On 01/01/2025 6:42, Rik van Riel wrote:
Fixed for the next version.
Thanks Rik,
Admittedly, I don't feel great about my overall last review - it
mostly
focused on style and common BKMs.
I still don't quite get the entire logic. To name one thing that I
don't
understand: why do we need broadcast_asid_list and the complicated
games of
syncing it with broadcast_asid_used. Why wouldn't
broadcast_asid_used
suffice?
If I uderstand correctly from Rik's patch, I think the list is needed
to
save the flush for only when we run out of the ASID space (wrap
around).
Without the list, whenever the ASID bit is cleared, you also have to
flush
the TLBs.
That's exactly it.
The list will only contain processes that are active on
multiple CPUs, and hit a TLB flush "at the right moment"
to be assigned a broadcast ASID, which will be true for
essentially every process that does a lot of TLB flushes
and is long lived.
However, something like a kernel build has lots of
short lived, single threaded processes, for which we
should not be using broadcast TLB flushing, and which
will not need to remove themselves from the list at
exit time.
Thank you Karim and Rik for the patient explanations.
But IIUC, it does seem a bit overly complicated (and when I say
complicated, I mostly refer to traversing the broadcast_asid_list and
its overhead).
It seems to me that basically you want to have two pieces of data that
can easily be kept in bitmaps.
1. (existing) broadcast_asid_used - those that are currently "busy" and
cannot be allocated.
2. (new) broadcast_asid_pending_flush - those that are logically free
but were still not flushed. This would allow removing broadcast_asid_list.
Then in reset_broadcast_asid_space(), you just do something like:
bitmap_andnot(broadcast_asid_used, broadcast_asid_used,
broadcast_asid_pending_flush, MAX_ASID_AVAILABLE);
bitmap_clear(broadcast_asid_pending_flush, MAX_ASID_AVAILABLE);
Seems to me as simpler to understand, faster, and may even allow in the
future to avoid locking in fast-paths (would require different ordering
and some thought). Of course, I might be missing something...