[no subject]

Thanks,
Thomas


> 
> We cannot 'always migrate to mappable in the fault handler'. Or at
> least, this is not as trivial as it is to write in a sentence due to
> the need to spill out other active objects, and all the usual
> challenges with context synchronization etc. It is possible, perhaps
> with a lot of care, but it is challenging to guarantee, easy to
> break, and not needed for 99.9% of software. We are trying to
> simplify our driver stack.
> 
> If we need a special mechanism for debug, we should devise a special
> mechanism, not throw out the general LMEM+SMEM requirement. Are there
> any identified first-class clients that require such access, or is it
> only debugging tools?
> 
> If only debug, then why can't the tool use a copy engine submission
> to access the data in place? Or perhaps a bespoke ioctl to access
> this via the KMD (and kmd submitted copy-engine BB)?
> 
> Thanks,
> 
> Jon
> 
> > -----Original Message-----
> > > From: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>
> > Sent: Thursday, March 17, 2022 2:35 AM
> > > To: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>;
> > > Bloomfield, Jon <jon.bloomfield@xxxxxxxxx>;
> > > Intel Graphics Development <intel-gfx@xxxxxxxxxxxxxxxxxxxxx>;
> > > Auld, Matthew <matthew.auld@xxxxxxxxx>;
> > > C, Ramalingam <ramalingam.c@xxxxxxxxx>
> > Subject: Re: Small bar recovery vs compressed content on DG2
> > 
> > On Thu, 2022-03-17 at 10:43 +0200, Joonas Lahtinen wrote:
> > > > Quoting Thomas Hellström (2022-03-16 09:25:16)
> > > > Hi!
> > > > 
> > > > Do we somehow need to clarify in the headers the semantics for this?
> > > > 
> > > > From my understanding when discussing the CCS migration series with
> > > > Ram, the kernel will never do any resolving (compressing /
> > > > decompressing) migrations or evictions, which basically implies the
> > > > following:
> > > > 
> > > > *) Compressed data must have LMEM-only placement, otherwise the GPU
> > > > would read garbage if accessing from SMEM.
> > > 
> > > This has always been the case, so it should be documented in the uAPI
> > > headers and kerneldocs.
> > > 
> > > > *) Compressed data can't be assumed to be mappable by the CPU, because
> > > > in order to ensure that on small BAR, the placement needs to be
> > > > LMEM+SMEM.
> > > 
> > > Not strictly true, as we could always migrate to the mappable region in
> > > the CPU fault handler. Will need the same set of tricks as with limited
> > > mappable GGTT in the past.
> > 
> > In addition to Matt's reply:
> > 
> > Yes, if there is sufficient space. I'm not sure we want to complicate
> > this to migrate only part of the buffer to mappable on a fault basis?
> > Otherwise this is likely to fail.
> > 
> > One option is to allow cpu-mapping from SYSTEM like TTM is doing for
> > evicted buffers, even if SYSTEM is not in the placement list, and then
> > migrate back to LMEM for gpu access.
> > 
> > But can user-space even interpret the compressed data when CPU-mapping,
> > without access to the CCS metadata?
> > 
> > > 
> > > > *) Neither can compressed data be part of a CAPTURE buffer, because
> > > > that requires the data to be CPU-mappable.
> > > 
> > > Especially this will be too big of a limitation which we can't really
> > > afford when it comes to debugging.
> > 
> > Same here WRT user-space interpretation.
> > 
> > This will become especially tricky on small BAR, because either we need
> > to fit all compressed buffers in the mappable portion, or be able to
> > blit the contents of the capture buffers from within the fence
> > signalling critical section, which will require a lot of work I guess.
> > 
> > /Thomas
> > 
> > 
> > > 
> > > Regards, Joonas
> > > 
> > > > Are we (and user-mode drivers) OK with these restrictions, or do we
> > > > need to rethink?
> > > > 
> > > > Thanks,
> > > > 
> > > > Thomas
> > > > 
> > > > 
> > 
>