Thanks,
Thomas

>
> We cannot 'always migrate to mappable in the fault handler'. Or at
> least, this is not as trivial as it is to write in a sentence due to
> the need to spill out other active objects, and all the usual
> challenges with context synchronization etc. It is possible, perhaps
> with a lot of care, but it is challenging to guarantee, easy to
> break, and not needed for 99.9% of software. We are trying to
> simplify our driver stack.
>
> If we need a special mechanism for debug, we should devise a special
> mechanism, not throw out the general LMEM+SMEM requirement. Are there
> any identified first-class clients that require such access, or is it
> only debugging tools?
>
> If only debug, then why can't the tool use a copy engine submission
> to access the data in place? Or perhaps a bespoke ioctl to access
> this via the KMD (and kmd submitted copy-engine BB)?
>
> Thanks,
>
> Jon
>
> > -----Original Message-----
> > From: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>
> > Sent: Thursday, March 17, 2022 2:35 AM
> > To: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>; Bloomfield, Jon
> > <jon.bloomfield@xxxxxxxxx>; Intel Graphics Development
> > <intel-gfx@xxxxxxxxxxxxxxxxxxxxx>; Auld, Matthew
> > <matthew.auld@xxxxxxxxx>; C, Ramalingam <ramalingam.c@xxxxxxxxx>
> > Subject: Re: Small bar recovery vs compressed content on DG2
> >
> > On Thu, 2022-03-17 at 10:43 +0200, Joonas Lahtinen wrote:
> > > Quoting Thomas Hellström (2022-03-16 09:25:16)
> > > > Hi!
> > > >
> > > > Do we somehow need to clarify in the headers the semantics for
> > > > this?
> > > >
> > > > From my understanding when discussing the CCS migration series
> > > > with Ram, the kernel will never do any resolving (compressing /
> > > > decompressing) migrations or evictions which basically implies
> > > > the following:
> > > >
> > > > *) Compressed data must have LMEM only placement, otherwise the
> > > > GPU would read garbage if accessing from SMEM.
> > >
> > > This has always been the case, so it should be documented in the
> > > uAPI headers and kerneldocs.
> > >
> > > > *) Compressed data can't be assumed to be mappable by the CPU,
> > > > because in order to ensure that on small BAR, the placement
> > > > needs to be LMEM+SMEM.
> > >
> > > Not strictly true, as we could always migrate to the mappable
> > > region in the CPU fault handler. Will need the same set of tricks
> > > as with limited mappable GGTT in past.
> >
> > In addition to Matt's reply:
> >
> > Yes, if there is sufficient space. I'm not sure we want to
> > complicate this to migrate only part of the buffer to mappable on a
> > fault basis?
> > Otherwise this is likely to fail.
> >
> > One option is to allow cpu-mapping from SYSTEM like TTM is doing for
> > evicted buffers, even if SYSTEM is not in the placement list, and
> > then migrate back to LMEM for gpu access.
> >
> > But can user-space even interpret the compressed data when
> > CPU-mapping?
> > without access to the CCS metadata?
> >
> > >
> > > > *) Neither can compressed data be part of a CAPTURE buffer,
> > > > because that requires the data to be CPU-mappable.
> > >
> > > Especially this will be too big of a limitation which we can't
> > > really afford when it comes to debugging.
> >
> > Same here WRT user-space interpretation.
> >
> > This will become especially tricky on small BAR, because either we
> > need to fit all compressed buffers in the mappable portion, or be
> > able to blit the contents of the capture buffers from within the
> > fence signalling critical section, which will require a lot of work
> > I guess.
> >
> > /Thomas
> >
> > >
> > > Regards, Joonas
> > >
> > > > Are we (and user-mode drivers) OK with these restrictions, or do
> > > > we need to rethink?
> > > >
> > > > Thanks,
> > > >
> > > > Thomas