On Thu, Jun 20, 2019 at 05:33:03PM +0800, Guo Ren wrote:
> On Wed, Jun 19, 2019 at 8:39 PM Will Deacon <will.deacon@xxxxxxx> wrote:
> >
> > On Wed, Jun 19, 2019 at 08:18:04PM +0800, Guo Ren wrote:
> > > On Wed, Jun 19, 2019 at 5:12 PM Will Deacon <will.deacon@xxxxxxx> wrote:
> > > > This is one place where I'd actually prefer not to go down the route of
> > > > making the code generic. Context-switching and low-level TLB management
> > > > is deeply architecture-specific and I worry that by trying to make this
> > > > code common, we run the real risk of introducing subtle bugs on some
> > > > architecture every time it is changed.
> > > "Add generic asid code" and "move arm's into generic" are two things.
> > > We could do
> > > first and let architecture's maintainer to choose.
> >
> > If I understand the proposal being discussed, it involves basing that
> > generic ASID allocation code around the arm64 implementation which I don't
> > necessarily think is a good starting point.
> ...
> >
> > > > Furthermore, the algorithm we use
> > > > on arm64 is designed to scale to large systems using DVM and may well be
> > > > too complex and/or sub-optimal for architectures with different system
> > > > topologies or TLB invalidation mechanisms.
> > > It's just a asid algorithm not very complex and there is a callback
> > > for architecture to define their
> > > own local hart tlb flush. Seems it has nothing with DVM or tlb
> > > broadcast mechanism.
> >
> > I'm pleased that you think the algorithm is not very complex, but I'm also
> > worried that you might not have fully understood some of its finer details.
> I understand your concern about my less understanding of asid
> technology. Here is
> my short-description of arm64 asid allocator: (If you find anything
> wrong, please
> correct me directly, thx :)

The complexity mainly comes from the fact that this thing runs concurrently
with itself without synchronization on the fast-path.
Coupled with the need to use the same ASID for all threads of a task, you end
up in fiddly situations where rollover can occur on one CPU whilst another CPU
is trying to schedule a thread of a task that already has threads running in
userspace. However, it's architecture-specific whether or not you care about
that scenario.

> > The reason I mention DVM and TLB broadcasting is because, depending on
> > the mechanisms in your architecture relating to those, it may be strictly
> > required that all concurrently running threads of a process have the same
> > ASID at any given point in time, or it may be that you really don't care.
> >
> > If you don't care, then the arm64 allocator is over-engineered and likely
> > inefficient for your system. If you do care, then it's worth considering
> > whether a lock is sufficient around the allocator if you don't expect high
> > core counts. Another possibility is that you end up using only one ASID and
> > invalidating the local TLB on every context switch. Yet another design
> > would be to manage per-cpu ASID pools.
> I'll keep my system use the same ASID for SMP + IOMMU :P

You will want a separate allocator for that:

https://lkml.kernel.org/r/20190610184714.6786-2-jean-philippe.brucker@xxxxxxx

> Yes, there are two styles of asid allocator: per-cpu ASID (MIPS) or
> same ASID (ARM).
> If the CPU couldn't support cache/tlb coherency maintian in hardware,
> it should use
> per-cpu ASID style because IPI is expensive and per-cpu ASID style
> need more software
> mechanism to improve performance (eg: delay cache flush). From software view the
> same ASID is clearer and easier to build bigger system with more TLB caches.
>
> I think the same ASID style is a more sensible choice for modern
> processor and let it be
> one of generic is reasonable.

I'm not sure I agree. x86, for example, is better off using a different
algorithm for allocating its PCIDs.
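
To make the contrast between the designs a bit more concrete, here is a
minimal user-space C sketch of the per-cpu ASID pool style mentioned above.
It is not taken from any kernel tree. Every name in it (mm_ctx,
cpu_asid_pool, switch_mm_asid, local_flush_tlb) and the tiny 2-bit ASID
width are assumptions made up for illustration, and the TLB flush is only a
counter. It also assumes ASID 0 is reserved and that TLBs are strictly
per-CPU with no broadcast invalidation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical parameters: a 2-bit ASID so that rollover is easy to trigger. */
#define ASID_BITS     2u
#define ASID_MASK     ((UINT64_C(1) << ASID_BITS) - 1)
#define VERSION_MASK  (~ASID_MASK)
#define FIRST_VERSION (UINT64_C(1) << ASID_BITS)
#define NR_CPUS       2u

/* Per-address-space state: one (version | ASID) value per CPU, 0 = never ran. */
struct mm_ctx {
	uint64_t id[NR_CPUS];
};

/* Per-CPU pool: the last (version | ASID) handed out on this CPU. */
struct cpu_asid_pool {
	uint64_t last;
	unsigned int flushes;	/* stand-in for real local TLB invalidation */
};

static struct cpu_asid_pool pools[NR_CPUS] = {
	{ FIRST_VERSION, 0 },
	{ FIRST_VERSION, 0 },
};

/* Placeholder for an architecture's local (non-broadcast) TLB flush. */
static void local_flush_tlb(struct cpu_asid_pool *pool)
{
	pool->flushes++;
}

/*
 * Return the ASID to program when 'mm' is switched in on 'cpu'.
 * Only this CPU's pool is touched, so no cross-CPU locking is needed,
 * but the same mm may end up with different ASIDs on different CPUs.
 */
static uint64_t switch_mm_asid(struct mm_ctx *mm, unsigned int cpu)
{
	struct cpu_asid_pool *pool;

	assert(cpu < NR_CPUS);
	pool = &pools[cpu];

	/* Fast path: the ASID was allocated in this CPU's current version. */
	if (((mm->id[cpu] ^ pool->last) & VERSION_MASK) == 0)
		return mm->id[cpu] & ASID_MASK;

	/* Slow path: hand out the next ASID from this CPU's pool. */
	pool->last++;
	if ((pool->last & ASID_MASK) == 0) {
		/* Pool exhausted: new version, flush locally, skip ASID 0. */
		local_flush_tlb(pool);
		pool->last++;
	}
	mm->id[cpu] = pool->last;
	return pool->last & ASID_MASK;
}
```

Because each CPU only ever looks at its own pool, rollover on one CPU never
races with scheduling on another, which is where the simplicity comes from.
The price is exactly the property discussed in this thread: threads of one
process running on different CPUs can hold different ASIDs at the same time,
so this style is only an option if the architecture really doesn't care.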
> > So rather than blindly copying the arm64 code, I suggest sitting down and
> > designing something that fits to your architecture instead. You may end up
> > with something that is both simpler and more efficient.
> In fact, riscv folks have discussed a lot about arm's asid allocator
> and I learned
> a lot from the discussion:
> https://lore.kernel.org/linux-riscv/20190327100201.32220-1-anup.patel@xxxxxxx/

If you require all threads of the same process to have the same ASID, then
that patch looks broken to me.

Will
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm