[ANNOUNCE] xf86-video-intel 2.20.18

chris at chris-wilson.co.uk (Chris Wilson) · Wed, 16 Jan 2013 13:02:40 +0000

A bunch of miscellaneous fixes for assertion failures and various
performance regressions when mixing new methods for offloads, along with
a couple of improvements for rendering with gen4.

 * Remove use of packed unnormalized texture coordinates on gen4/5 as
   these GPUs do not support unnormalized coordinates in the sampler.

 * Remove dependency upon x86 asm for cross-building to unsupported
   architectures.
   https://bugs.gentoo.org/show_bug.cgi?id=448570

 * Apply damage around PRIME updates in the correct order.

 * Correctly read the initial backlight level for when the user
   overrides UXA's choice of backlight controller.

 * Throttle UXA to prevent it queuing work much faster than the GPU can
   complete it. The lack of throttling manifested itself in impossible
   performance figures and the entire display freezing for several
   seconds whilst the GPU caught up. It also caused the DDX to consume
   more memory than was required, as it could not recycle buffers
   quickly enough, so in some cases throttling produces a marked
   improvement in performance. Also note that on gen2/3 this requires a
   new libdrm [2.4.41] in order to prevent a bug causing the DDX to fall
   back to swrast.
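
For readers curious about the x86 asm removal noted above, which appears
to correspond to the fls changes in the shortlog ("Rewrite __fls without
dependence upon x86 assembly", "Fix off-by-one in C version of fls"), here
is a minimal sketch of a portable find-last-set. It is not the driver's
code: the names portable_fls and check_portable_fls, and the convention of
returning 0 for a zero input, are assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper, not the driver's code: returns the zero-based
 * index of the most significant set bit using only portable C (no
 * inline asm). By assumption, an input of 0 returns 0. */
static inline int portable_fls(unsigned long x)
{
	int n = 0;

	/* Shift right until nothing is left, counting the shifts. */
	while (x >>= 1)
		n++;

	return n;
}

/* Checks at the boundaries where an off-by-one would show up:
 * exact powers of two and their immediate neighbours. */
static void check_portable_fls(void)
{
	assert(portable_fls(0) == 0);
	assert(portable_fls(1) == 0);
	assert(portable_fls(2) == 1);
	assert(portable_fls(3) == 1);
	assert(portable_fls(4096) == 12);
	assert(portable_fls(4097) == 12);
	assert(portable_fls(ULONG_MAX) ==
	       (int)(sizeof(unsigned long) * CHAR_BIT) - 1);
}
```

A loop like this trades a single instruction for a handful of shifts, but
it builds on any architecture, which is the point when cross-building.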

Chris Wilson (98):
      sna: Explicitly track self-relocation entries
      sna/gen6+: Tweak to only consider active ring on destination
      sna: Do not try and set a 0x0 mode
      sna/gen6+: Tidy up ring preferences
      sna/gen2,3: Remove gen-specific vertex_offset
      sna: Sanity check config->compat_output
      sna: Skip copying fbcon if we are already on the scanout
      sna: Mark kgem_bo_retire() as static
      sna: Only allocate a busy CPU bo for a GPU readback
      sna/gen4+: Tidy emit_spans_solid()
      sna/gen4+: Tidy emit_spans_affine()
      sna/gen4+: Trim an extraneous coordinate from solid span emission
      sna/gen4+: Trim an extraneous coordinate from solid composite emission
      sna/gen4+: Check for a spare exec slot for an outstanding vbo
      sna/gen3: Use inline transform+scale function
      sna: DBG compile fixes
      sna: Allow a flush to occur before batching a flush-bo
      sna: Move the primary color cache into the alpha cache
      sna/gen4+: Try using the BLT before doing a tiled copy
      sna/dri: Gracefully handle failures from pageflip
      sna: Seed the solid color cache with an invalid value to prevent false hits
      uxa: Align surface allocations to even tile rows
      sna/dri: Fix triple buffering to not penalise missed frames
      sna/dri: Use the default choice of backend for copying the region
      sna/gen6+: Hint that we prefer to use the BLT with uncached scanouts
      sna/gen4: Tweak single-thread SF w/a for solids
      sna/gen2: Tidy a pair of vertex emitters
      sna/gen2: Always try to use the BLT pipeline first
      sna: Tidy compat interfaces
      sna: Remove some obsolete Options
      sna: Micro-optimise glyph_valid()
      sna: Fast path inplace addition of solid trapezoids
      sna/gen6+: Remove vestigial CC viewport state
      sna/gen4+: Tidy special handling of 2s2s vertex elements
      sna/gen2+: Precompute the affine transformation scale factors
      sna/gen4+: Specialise linear vertex emission
      sna/gen6+: Fine tune placement of DRI copies
      sna: Fix off-by-one in C version of fls
      sna: Also recognise __i386__ for fls asm
      sna: Add a pair of asserts to validate fls()/cache_bucket()
      sna: Convert allocation request from bytes to num_pages when shrinking
      sna: Flush the batch prior to referencing work from another ring
      sna: Embed the pre-allocation of the static request into the device
      sna: Clear up the caches after handling a request allocation failure
      sna/trapezoids: filter out zero-length runs
      sna/trapezoids: filter out cancelling edges upon insertion to edge-list
      Revert "sna/gen4+: Backport tight vertex packing for simple renderblits"
      sna/gen4+: Trim the redundant float from the fill vertices
      sna/gen4+: Handle solids passed to the general texcoord emitter
      sna: Try to create userptr with the unsync'ed flag set first
      sna: Only disable upon a failed pageflip after at least one pipe flips
      sna/dri: Transfer the DRI2 reference to the new TearFree pixmap
      sna: fixup damage posting to be done correctly around slave pixmap
      sna: Open-code xf86CompatOutput() to avoid invalid pointers
      sna: Make sure all outputs are disabled if no CompatOutput is defined
      sna: With a GPU bo and a shm source, do not fall all the way back
      sna: Ignore the last pixmap cpu setting if overwritting all damage
      sna: Allow CPU bo to copy to GPU bo if the device is idle.
      sna: Prefer to use the GPU for copies from SHM onto tiled destinations
      sna: Use some surplus bits to back our temporary pixman_image_t
      intel: Throttle harder
      sna: Prefer userptr if copying to a tiled bo
      sna: Also prefer to use the GPU for uploads into a tiled bo
      sna: Disable memcpy_to_tiled_x() uploads on 32-bit systems
      sna: Reorder struct kgem_bo to move related data into the same cacheline
      sna/dri: Prefer to preserve the ring of the destination bo
      sna: After a size check, double check the batch before flushing
      sna: Tweak max object sizes to take account of aperture restrictions
      sna: Experiment with a CPU mapping for certain fallbacks
      sna: Correct a few assertions after enabling read-only mappings
      sna: Relax limitation on not mapping GPU bo with shadow pointers
      sna: Allow creation of a CPU map for pixmaps if needed
      sna: Allow large image uploads to utilize temporary mappings
      sna: Use the pixmap size (not drawable) to determine replacement
      sna: Add a compile flag for measuring impact of userptr uploads
      sna: Check size against aperture before attempting to perform the GTT mapping
      sna: Initialize src_bo to detect allocation failure
      sna: Use userptr to accelerate GetImage
      sna: Apply PutImage optimisations to move-to-cpu
      sna: Limit temporary userptr uploads to large busy targets or LLC machines
      sna: Tweak considering of last-cpu placement for inplace regions
      sna: Avoid allocating an active CPU bo unnecessarily
      sna: Mark uploads with async hints when appropriate
      sna: Free the SHM pixmaps after b266ae6f6f
      sna: Pass the async hint for the upload into the GPU
      sna: Hint that a copy from a SHM bo will likely be the last in a batch
      sna: Add DBG to use_shm_bo()
      sna/trapezoids: Avoid the multiply for an opaque source
      sna: Specialise sna_get_image_blt for clears to avoid sync readback
      sna: Assert that we never try to mix INPLACE / ASYNC hints for move-to-cpu
      sna: Avoid serialising on an move-to-cpu for an async operation
      sna: Revert use of a separate CAN_CREATE_SMALL flag
      sna: Add DBG for when we add the inplace hint
      sna: Fix computation of large object sizes to prevent overflow
      sna: Discard the batch if we are discarding the only buffer in it
      sna: Correct DBG to refer to the actual tiling mode forced
      sna: Restrict upload buffers to reduce sampler TLB misses
      2.20.18 release

Dave Airlie (2):
      intel: drop pointless error printf in the slave pixmap sync code.
      intel: fixup damage posting to be done correctly around slave pixmap

Matt Turner (1):
      sna: Rewrite __fls without dependence upon x86 assembly

Mickaël THOMAS (1):
      Set initial value for backlight_active_level

git tag: 2.20.18

http://xorg.freedesktop.org/archive/individual/driver/xf86-video-intel-2.20.18.tar.bz2
MD5:  56457bda125912b803f488f779460640  xf86-video-intel-2.20.18.tar.bz2
SHA1: ca30ed7d84a02b3ccdb9e47ee876809e406822b3  xf86-video-intel-2.20.18.tar.bz2
SHA256: f3daedf9571b04234053507940ba0a221abfcd294c3c350ff49eaf499b8437b5  xf86-video-intel-2.20.18.tar.bz2

http://xorg.freedesktop.org/archive/individual/driver/xf86-video-intel-2.20.18.tar.gz
MD5:  dd1e212394f489624807495b5540924e  xf86-video-intel-2.20.18.tar.gz
SHA1: 4a285a94968af73e972b15972f57b85dfec58bc7  xf86-video-intel-2.20.18.tar.gz
SHA256: d0ae1cd7aadd7ace87f4d702df7e583c6c331e7dc5e39b9f1e70ca21416a5978  xf86-video-intel-2.20.18.tar.gz

-- 
Chris Wilson, Intel Open Source Technology Centre