On Wed, May 24 2023 at 21:41, Baoquan He wrote:
> On 05/24/23 at 02:44pm, Thomas Gleixner wrote:
>> On Wed, May 24 2023 at 19:24, Baoquan He wrote:
>> Again: It _CANNOT_ be on the purge list because it has active mappings:
>>
>> 1  X = vb_alloc()
>>    ...
>>    Y = vb_alloc()
>>      vb->free -= order;                // Free space goes to 0
>>      if (!vb->vb_free)
>> 2      list_del(vb->free_list);        // Block is removed from free list
>>    ...
>>    vb_free(Y)
>>      vb->dirty += order;
>> 3    if (vb->dirty == VMAP_BBMAP_BITS) // Condition is _false_
>>                                        // because #1 $X is still mapped
>>                                        // so block is _NOT_ freed and
>>                                        // _NOT_ put on the purge list
>
> So what if $X is unmapped via vb_free($X)? Does the condition satisfied
> and can the vb put into purge list?

Yes, but it is _irrelevant_ for the problem at hand.

> In your above example, $Y's flush is deferred, but not missed?

Yes, but that violates the guarantee of vm_unmap_aliases():

 * The vmap/vmalloc layer lazily flushes kernel virtual mappings primarily
 * to amortize TLB flushing overheads. What this means is that any page you
 * have now, may, in a former life, have been mapped into kernel virtual
 * address by the vmap layer and so there might be some CPUs with TLB entries
 * still referencing that page (additional to the regular 1:1 kernel mapping).
 *
 * vm_unmap_aliases flushes all such lazy mappings. After it returns, we can
 * be sure that none of the pages we have control over will have any aliases
 * from the vmap layer.

>> 4  unmap_aliases()
>>      walk_free_list()           // Does not find it because of #2
>>      walk_purge_list()          // Does not find it because of #3
>>
>> If the resulting flush range is not covering the $Y TLBs then stale TLBs
>> stay around.
>
> OK, your mean the TLB of $Y will stay around after vb_free() until
> the whole vb becomes dirty, and fix that in this patch, you are right.
> vm_unmap_aliases() may need try to flush all unmapped ranges in
> this case but failed on $Y, while the page which is being reused has the
> old alias of $Y.

vm_unmap_aliases() _must_ guarantee that the old TLBs for $Y are gone.

> My thought was attracted to the repeated flush of vmap_block va on purge
> list.
>
> By the way, you don't fix issue that in vm_reset_perms(), the direct map
> range will be accumulated with vb va and purge va and could produce
> flushing range including huge gap, do you still plan to fix that? I
> remember you said you will use array to gather ranges and flush them one
> by one.

One thing at a time. This series is a prerequisite.

Thanks,

        tglx
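
For anyone who wants to replay steps #1 to #4 outside the kernel, the
sequence can be modelled in a few lines of userspace C. This is only a toy
sketch under loud assumptions: struct toy_vb, TOY_BBMAP_BITS and all toy_*
functions are hypothetical names invented for illustration, the block has
just two slots, and the "lists" are reduced to two flags. It is not the code
in mm/vmalloc.c and it does no real TLB work. It only shows that a walk
restricted to the free list and the purge list never sees the dirty range
left behind by vb_free(Y) while $X is still mapped.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: a block with only two slots, one for X and one for Y. */
#define TOY_BBMAP_BITS 2u

/* Toy stand-in for a vmap block. NOT the kernel's struct vmap_block. */
struct toy_vb {
	unsigned int free;	/* slots still available for allocation */
	unsigned int dirty;	/* slots freed but not yet flushed */
	bool on_free_list;	/* models membership in the free list */
	bool on_purge_list;	/* models membership in the purge list */
};

static void toy_vb_init(struct toy_vb *vb)
{
	vb->free = TOY_BBMAP_BITS;
	vb->dirty = 0;
	vb->on_free_list = true;
	vb->on_purge_list = false;
}

/* Steps #1/#2: take one slot; a full block leaves the free list. */
static bool toy_vb_alloc(struct toy_vb *vb)
{
	if (!vb->on_free_list || vb->free == 0)
		return false;
	vb->free--;
	if (vb->free == 0)
		vb->on_free_list = false;
	return true;
}

/* Step #3: release one slot; only a fully dirty block gets purged. */
static void toy_vb_free(struct toy_vb *vb)
{
	vb->dirty++;
	if (vb->dirty == TOY_BBMAP_BITS)
		vb->on_purge_list = true;
}

/*
 * Step #4: a walk that only visits the two lists. Returns how many dirty
 * slots it would include in the flush range.
 */
static unsigned int toy_unmap_aliases_lists_only(const struct toy_vb *vb)
{
	if (vb->on_free_list || vb->on_purge_list)
		return vb->dirty;
	return 0;
}

/* Replays X = alloc, Y = alloc, free(Y) and counts the missed dirty slots. */
static unsigned int toy_missed_after_free_y(void)
{
	struct toy_vb vb;
	bool ok;

	toy_vb_init(&vb);
	ok = toy_vb_alloc(&vb);		/* X */
	assert(ok);
	ok = toy_vb_alloc(&vb);		/* Y: free goes to 0, off the free list */
	assert(ok);
	(void)ok;
	toy_vb_free(&vb);		/* Y: dirty == 1, condition is false */

	return vb.dirty - toy_unmap_aliases_lists_only(&vb);
}
```

Calling toy_missed_after_free_y() from a small main() reports one dirty slot
that the list-only walk did not cover, which is the $Y range in the example.
Freeing $X afterwards makes the block fully dirty and visible via the purge
flag, matching the "Yes, but it is _irrelevant_" answer above.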