On Sun, Jan 19, 2025 at 09:40:17PM -0500, Rik van Riel wrote:

> +#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH
> +/*
> + * Logic for broadcast TLB invalidation.
> + */
> +static DEFINE_RAW_SPINLOCK(global_asid_lock);
> +static u16 last_global_asid = MAX_ASID_AVAILABLE;
> +static DECLARE_BITMAP(global_asid_used, MAX_ASID_AVAILABLE) = { 0 };
> +static DECLARE_BITMAP(global_asid_freed, MAX_ASID_AVAILABLE) = { 0 };
> +static int global_asid_available = MAX_ASID_AVAILABLE - TLB_NR_DYN_ASIDS - 1;
> +
> +static void reset_global_asid_space(void)
> +{
> +	lockdep_assert_held(&global_asid_lock);
> +
> +	/*
> +	 * A global TLB flush guarantees that any stale entries from
> +	 * previously freed global ASIDs get flushed from the TLB
> +	 * everywhere, making these global ASIDs safe to reuse.
> +	 */
> +	invlpgb_flush_all_nonglobals();
> +
> +	/*
> +	 * Clear all the previously freed global ASIDs from the
> +	 * broadcast_asid_used bitmap, now that the global TLB flush
> +	 * has made them actually available for re-use.
> +	 */
> +	bitmap_andnot(global_asid_used, global_asid_used,
> +			global_asid_freed, MAX_ASID_AVAILABLE);
> +	bitmap_clear(global_asid_freed, 0, MAX_ASID_AVAILABLE);
> +
> +	/*
> +	 * ASIDs 0-TLB_NR_DYN_ASIDS are used for CPU-local ASID
> +	 * assignments, for tasks doing IPI based TLB shootdowns.
> +	 * Restart the search from the start of the global ASID space.
> +	 */
> +	last_global_asid = TLB_NR_DYN_ASIDS;
> +}
> +
> +static u16 get_global_asid(void)
> +{
> +	lockdep_assert_held(&global_asid_lock);
> +
> +	do {
> +		u16 start = last_global_asid;
> +		u16 asid = find_next_zero_bit(global_asid_used, MAX_ASID_AVAILABLE, start);
> +
> +		if (asid >= MAX_ASID_AVAILABLE) {
> +			reset_global_asid_space();
> +			continue;
> +		}
> +
> +		/* Claim this global ASID. */
> +		__set_bit(asid, global_asid_used);
> +		last_global_asid = asid;
> +		global_asid_available--;
> +		return asid;
> +	} while (1);
> +}

Looking at this more...
I'm left wondering, did 'we' look at any other architecture code at
all?

For example, look at arch/arm64/mm/context.c and see how their reset
works. Notably, they are not at all limited to reclaiming freed ASIDs,
but will very aggressively take back all ASIDs except for the ones
currently running.

And IIRC more architectures are like that (at some point in the distant
past I read through the tlb and mmu context crap from every architecture
we had at that point -- but those memories are vague).

If we want to move towards relying on broadcast TLBI, we'll need to go
in that direction.

Also, as argued in the old thread yesterday, we likely want more PCID
bits -- in the interest of competition we can't be having fewer than
ARM64, surely :-)

Anyway, please drop the crazy threshold thing, and if you run into
falling back to IPIs because you don't have enough ASIDs to go around,
we should 'borrow' some of the ARM64 code -- RISC-V seems to have
borrowed very heavily from that as well.
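
To make the arm64 comparison a bit more concrete, below is a rough
userspace model of a generation-based rollover. It is only a sketch of
the general idea (take back every ASID except the ones running on some
CPU, bump a generation counter, let everybody else re-allocate lazily);
it is not the code in arch/arm64/mm/context.c. All names (struct mm,
running[], reserved[], rollover(), new_ctx(), switch_mm()) are made up
for illustration, there is no locking, and the TLB flush is only a
counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ASID_BITS	3	/* tiny on purpose, to force rollovers */
#define NR_ASIDS	(1u << ASID_BITS)
#define ASID_MASK	((uint64_t)NR_ASIDS - 1)
#define NR_CPUS		2

/* ctx = (generation << ASID_BITS) | asid; 0 means "never assigned". */
struct mm { uint64_t ctx; };

static uint64_t generation = 1;
static bool asid_used[NR_ASIDS];	/* ASID 0 is never handed out */
static uint64_t running[NR_CPUS];	/* ctx switched in since last rollover */
static uint64_t reserved[NR_CPUS];	/* ctx that survived the last rollover */
static unsigned int nr_rollovers;

static uint64_t mk_ctx(unsigned int asid)
{
	return (generation << ASID_BITS) | asid;
}

static unsigned int find_free_asid(void)
{
	for (unsigned int asid = 1; asid < NR_ASIDS; asid++)
		if (!asid_used[asid])
			return asid;
	return 0;
}

/* Take back every ASID except the ones (possibly) live on some CPU. */
static void rollover(void)
{
	generation++;
	memset(asid_used, 0, sizeof(asid_used));
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		/* No switch since the last rollover: the old mm may still run. */
		uint64_t ctx = running[cpu] ? running[cpu] : reserved[cpu];

		asid_used[ctx & ASID_MASK] = true;
		reserved[cpu] = ctx;
		running[cpu] = 0;
	}
	nr_rollovers++;	/* a real implementation would flush the TLBs here */
}

static uint64_t new_ctx(struct mm *mm)
{
	unsigned int asid = mm->ctx & ASID_MASK;

	if (mm->ctx) {
		bool hit = false;

		/* Running across a rollover: keep the ASID, new generation. */
		for (int cpu = 0; cpu < NR_CPUS; cpu++) {
			if (reserved[cpu] == mm->ctx) {
				reserved[cpu] = mk_ctx(asid);
				hit = true;
			}
		}
		if (hit)
			return mk_ctx(asid);

		/* Old ASID not yet taken in this generation: reuse it. */
		if (!asid_used[asid]) {
			asid_used[asid] = true;
			return mk_ctx(asid);
		}
	}

	asid = find_free_asid();
	if (!asid) {
		rollover();
		asid = find_free_asid();
		assert(asid);	/* holds as long as NR_CPUS < NR_ASIDS - 1 */
	}
	asid_used[asid] = true;
	return mk_ctx(asid);
}

static void switch_mm(int cpu, struct mm *mm)
{
	if ((mm->ctx >> ASID_BITS) != generation)
		mm->ctx = new_ctx(mm);
	running[cpu] = mm->ctx;
}
```

The point of the model: nothing ever needs to be explicitly freed. When
the space runs out, one flush reclaims everything that is not currently
running, and stale mms notice the generation mismatch on their next
switch and pick up a fresh ASID.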