Re: [git pull] drm fixes for 6.11-rc6

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 2024-09-02 at 08:13 +1000, Dave Airlie wrote:
> On Fri, 30 Aug 2024 at 12:32, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> > 
> > On Fri, 30 Aug 2024 at 14:08, Dave Airlie <airlied@xxxxxxxxx>
> > wrote:
> > > 
> > > The TTM revert is due to some stuttering graphical apps probably
> > > due
> > > to longer stalls while prefaulting.
> > 
> > Yeah, trying to pre-fault a PMD worth of pages in one go is just
> > crazy talk.
> > 
> > Now, if it was PMD-aligned and you faulted in a single PMD, that
> > would
> > be different. But just doing prn_insert_page() in a loop is insane.
> > 
> > The code doesn't even stop when it hits a page that already
> > existed,
> > and it keeps locking and unlocking the last-level page table over
> > and
> > over again.
> > 
> > Honestly, that code is questionable even for the *small* value,
> > much
> > less the "a PMD size" case.
> > 
> > Now, if you have an array of 'struct page *", you can use
> > vm_insert_pages(), and that's reasonably efficient.
> > 
> > And if you have a *contiguous* are of pfns, you can use
> > remap_pfn_range().
> > 
> > But that "insert one pfn at a time" that the drm layer does is
> > complete garbage. You're not speeding anything up, you're just
> > digging
> > deeper.


> 
> I wonder if there is functionality that could be provided in a common
> helper, by the mm layers, or if there would be too many locking
> interactions to make it sane,
> 
> It seems too fraught with danger for drivers or subsystems to be just
> doing this in the simplest way that isn't actually that smart.

Hmm. I see even the "Don't error on prefaults" check was broken at some
point :/.

There have been numerous ways to try to address this,

The remap_pfn_range was last tried, at least in the context of the i915
driver IIRC by Christoph Hellwig but had to be ripped out since it
requires the mmap_lock in write mode. Here we have it only in read
mode.

Then there's the apply_to_page_range() used by the igfx functionality
of the i915 driver. I don't think we should go that route without
turning it into something like vm_insert_pfns() with proper checking.
This approach populates all entries of a buffer object.

Finally there's the huge fault attempt that had to be ripped out due to
lack of pmd_special and pud_special flags and resulting clashes with
gup_fast.

Perhaps a combination of the two latter if properly implemented would
be the best choice.

/Thomas

> 
> Dave.





[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux