On Wed, Jul 12, 2023 at 6:52 PM David Hildenbrand <david@xxxxxxxxxx> wrote: > > On 12.07.23 20:49, Matthew Wilcox wrote: > > On Wed, Jul 12, 2023 at 05:55:47PM +0200, David Hildenbrand wrote: > >> On 12.07.23 17:52, Matthew Wilcox wrote: > >>> On Wed, Jul 12, 2023 at 08:01:18AM -0700, Suren Baghdasaryan wrote: > >>>> Are you suggesting to break remap_pfn_range() into two stages > >>>> (remap_pfn_range_prepare() then remap_pfn_range())? > >>>> If so, there are many places remap_pfn_range() is called and IIUC all > >>>> of them would need to use that 2-stage approach (lots of code churn). > >>>> In addition, this is an exported function, so many more drivers might > >>>> expect the current behavior. > >>> > >>> You do not understand correctly. > >>> > >>> When somebody calls mmap, there are two reasonable implementations. > >>> Here's one: > >>> > >>> .mmap = snd_dma_iram_mmap, > >>> > >>> static int snd_dma_iram_mmap(struct snd_dma_buffer *dmab, > >>> struct vm_area_struct *area) > >>> { > >>> area->vm_page_prot = pgprot_writecombine(area->vm_page_prot); > >>> return remap_pfn_range(area, area->vm_start, > >>> dmab->addr >> PAGE_SHIFT, > >>> area->vm_end - area->vm_start, > >>> area->vm_page_prot); > >>> } > >>> > >>> This is _fine_. It is not called from the fault path, it is called in > >>> process context. Few locks are held (which ones aren't even > >>> documented!) > >>> > >>> The other way is to set vma->vm_ops. The fault handler in vm_ops > >>> should not be calling remap_pfn_range(). It should be calling > >>> set_ptes(). I almost have this driver fixed up, but I have another > >>> meeting to go to now. > >> > >> Just a note that we still have to make sure that the VMA flags will be set > >> properly -- I guess at mmap time is the right time as I suggested above. > > > > It actually does that already: > > > > static int gru_file_mmap(struct file *file, struct vm_area_struct *vma) > > { > > if ((vma->vm_flags & (VM_SHARED | VM_WRITE)) != (VM_SHARED | VM_WRITE)) > > return -EPERM; > > > > if (vma->vm_start & (GRU_GSEG_PAGESIZE - 1) || > > vma->vm_end & (GRU_GSEG_PAGESIZE - 1)) > > return -EINVAL; > > > > vm_flags_set(vma, VM_IO | VM_PFNMAP | VM_LOCKED | > > VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); > > > > > > This compiles, but obviously I don't have a spare HP supercomputer lying > > around for me to test whether it works. Also set_ptes() was only just > > introduced to the mm tree, so doing something that needs backporting > > would take more effort (maybe having a private set_ptes() in the driver > > would be a good backport option that calls set_pte_at() in a loop). vm_flags_set() was introduced in 6.3, so not too many versions to backport to if needed. Private set_ptes() seems reasonable to me. > > > > > > diff --git a/drivers/misc/sgi-gru/grumain.c b/drivers/misc/sgi-gru/grumain.c > > index 4eb4b9455139..c21bcb528f12 100644 > > --- a/drivers/misc/sgi-gru/grumain.c > > +++ b/drivers/misc/sgi-gru/grumain.c > > @@ -951,6 +951,8 @@ vm_fault_t gru_fault(struct vm_fault *vmf) > > } > > > > if (!gts->ts_gru) { > > + pte_t *ptep, pte; > > + > > STAT(load_user_context); > > if (!gru_assign_gru_context(gts)) { > > preempt_enable(); > > @@ -964,9 +966,12 @@ vm_fault_t gru_fault(struct vm_fault *vmf) > > } > > gru_load_context(gts); > > paddr = gseg_physical_address(gts->ts_gru, gts->ts_ctxnum); > > - remap_pfn_range(vma, vaddr & ~(GRU_GSEG_PAGESIZE - 1), > > - paddr >> PAGE_SHIFT, GRU_GSEG_PAGESIZE, > > - vma->vm_page_prot); > > + > > + pte = pfn_pte(paddr / PAGE_SIZE, vma->vm_page_prot); > > + ptep = vmf->pte - (vaddr % GRU_GSEG_PAGESIZE) / PAGE_SIZE; > > + set_ptes(vma->vm_mm, vaddr & ~(GRU_GSEG_PAGESIZE - 1), > > + ptep, pte_mkspecial(pte), > > + GRU_GSEG_PAGESIZE / PAGE_SIZE); > > } > > > > preempt_enable(); > > > > Would we be able to fix it in stable simply by not triggering the > vm_flags_set() in case these flags are already set? I think we can do that. gru_file_mmap() sets all the flags that are set by remap_pfn_range_notrack() (VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP), so we can check if all bits are already present and skip the vm_flags_set() call. > > -- > Cheers, > > David / dhildenb >