On Sat, Apr 07, 2018 at 11:34:57AM +0200, Hans de Goede wrote: > Hi, > > On 06-04-18 18:06, Ville Syrjälä wrote: > > On Thu, Apr 05, 2018 at 01:37:31PM +0200, Hans de Goede wrote: > > > Hi, > > > > > > On 04-04-18 22:49, Ville Syrjälä wrote: > > > > On Wed, Apr 04, 2018 at 10:06:29PM +0200, Hans de Goede wrote: > > > > > Hi, > > > > > > > > > > On 04-04-18 17:50, Ville Syrjälä wrote: > > > > > > On Fri, Mar 30, 2018 at 04:26:53PM +0200, Hans de Goede wrote: > > > > > > > Hi, > > > > > > > > > > > > > > On 30-03-18 15:25, Hans de Goede wrote: > > > > > > > > Hi, > > > > > > > > > > > > > > > > On 30-03-18 14:44, Chris Wilson wrote: > > > > > > > > > Quoting Hans de Goede (2018-03-30 13:37:40) > > > > > > > > > > Hi, > > > > > > > > > > > > > > > > > > > > On 30-03-18 14:30, Chris Wilson wrote: > > > > > > > > > > > Quoting Hans de Goede (2018-03-30 13:27:15) > > > > > > > > > > > > Before this commit the WaSkipStolenMemoryFirstPage workaround code was > > > > > > > > > > > > skipping the first 4k by passing 4096 as start of the address range passed > > > > > > > > > > > > to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and > > > > > > > > > > > > reserve the firmware framebuffer so that we can inherit it would always > > > > > > > > > > > > fail, as the firmware framebuffer starts at address 0. > > > > > > > > > > > > > > > > > > > > > > > > Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on > > > > > > > > > > > > everything >= gen8") says in its commit message: "This is confirmed to fix > > > > > > > > > > > > Skylake screen flickering issues (probably caused by the fact that we > > > > > > > > > > > > initialized a ring in the first page of stolen, but I didn't 100% confirm > > > > > > > > > > > > this theory)." > > > > > > > > > > > > > > > > > > > > > > > > Which suggests that it is safe to use the first page for a linear > > > > > > > > > > > > framebuffer as the firmware is doing. > > > > > > > > > > > > > > > > > > > > > > > > This commit always passes 0 as start to drm_mm_init() and works around > > > > > > > > > > > > WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range() > > > > > > > > > > > > by insuring the start address passed by to drm_mm_insert_node_in_range() > > > > > > > > > > > > is always 4k or more. All entry points to i915_gem_stolen.c go through > > > > > > > > > > > > i915_gem_stolen_insert_node_in_range(), so that any newly allocated > > > > > > > > > > > > objects such as ring-buffers will not be allocated in the first 4k. > > > > > > > > > > > > > > > > > > > > > > > > The one exception is i915_gem_object_create_stolen_for_preallocated() > > > > > > > > > > > > which directly calls drm_mm_reserve_node() which now will be able to > > > > > > > > > > > > use the first 4k. > > > > > > > > > > > > > > > > > > > > > > > > This fixes the i915 driver no longer being able to inherit the firmware > > > > > > > > > > > > framebuffer on gen8+, which fixes the video output changing from the > > > > > > > > > > > > vendor logo to a black screen as soon as the i915 driver is loaded > > > > > > > > > > > > (on systems without fbcon). > > > > > > > > > > > > > > > > > > > > > > We've been told by the HW guys not to use the first page. (That's my > > > > > > > > > > > understanding from every time this gets questioned.) > > > > > > > > > > > > > > > > > > > > Yet the GOP is happily using the first page. I think we may need to make > > > > > > > > > > a difference here between the GPU not using the first page and the > > > > > > > > > > display engine/pipeline not using the first page. Note that my patch > > > > > > > > > > only influences the inheriting of the initial framebuffer as allocated > > > > > > > > > > by the GOP. It does not influence any other allocations from the > > > > > > > > > > reserved range, those will still all avoid the first page. > > > > > > > > > > > > > > > > > > > > Without this patch fastboot / flickerfree support is essentially broken > > > > > > > > > > on any gen8+ hardware given that one of the goals of atomic is to be > > > > > > > > > > able to do flickerfree transitions I think that this warrants a closer > > > > > > > > > > look then just simply saying never use the first page. > > > > > > > > > > > > > > > > > > The concern is what else (i.e. nothing that we allocated ourselves) that > > > > > > > > > may be in the first page... > > > > > > > > > > > > > > > > Given that the GOP has put its framebuffer there at least at boot there > > > > > > > > is nothing there, otherwise it would show up on the display. > > > > > > > > > > > > > > > > We have a whole bunch of code to inherit the BIOS fb in intel_display.c > > > > > > > > and AFAIK that code is there because this inheriting the BIOS fb is > > > > > > > > deemed an important feature. So I'm not happy at all with the handwavy > > > > > > > > best to not touch it answer I'm getting to this patch. > > > > > > > > > > > > > > > > Unless there are some clear answer from the hardware folks which specifically > > > > > > > > say we cannot put a framebuffer there (and then why is the GOP doing it?) > > > > > > > > then I believe that the best way forward here is to get various people to > > > > > > > > test with this patch and the best way to do that is probably to put it > > > > > > > > in next. Note I deliberately did not add a Cc stable. > > > > > > > > > > > > > > To elaborate on this, the excluding of the first 4k of the stolen memory > > > > > > > region causes intel_alloc_initial_plane_obj() from intel_display.c to fail, > > > > > > > which in turn causes intel_find_initial_plane_obj() to call > > > > > > > intel_plane_disable_noatomic(intel_crtc, intel_plane); which temporarily > > > > > > > completely turns off the display which leads to a very ugly flickering > > > > > > > of the display at boot (as well as replacing the vendor logo with a > > > > > > > black screen). > > > > > > > > > > > > > > I think we can all agree that this behavior is undesirable and even a > > > > > > > regression in behavior caused by the fix to exclude the first 4k. > > > > > > > > > > > > > > Chris, if my patch is not an acceptable way to fix this, then how do you > > > > > > > suggest that we fix this? > > > > > > > > > > > > > > Digging a bit deeper I found this: > > > > > > > > > > > > > > https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-kbl-vol16-workarounds.pdf > > > > > > > > > > > > > > Which says: > > > > > > > > > > > > > > "WaSkipStolenMemoryFirstPage: > > > > > > > > > > > > > > WA to skip the first page of stolen > > > > > > > memory due to sporadic HW write on *CS Idle" > > > > > > > > > > > > > > And also about FBC: > > > > > > > > > > > > > > "First line of FBC getting corrupted when FBC > > > > > > > compressed frame buffer offset is programmed to > > > > > > > zero. Command streamers are doing flush writes to > > > > > > > base of stolen. > > > > > > > WA: New restriction to program FBC compressed > > > > > > > frame buffer offset to at least 4KB." > > > > > > > > > > > > > > So using the first 4kB for the *framebuffer* as done by the GOP will > > > > > > > not cause any major problems (freezes, hangs, etc.), and commit > > > > > > > d43537610470 ("drm/i915: skip the first 4k of stolen memory on > > > > > > > everything >= gen8") was correct in deducing that the problem was > > > > > > > likely that some *vital* information was being stored i the first 4k > > > > > > > and that go overwritten. > > > > > > > > > > > > > > But the contents of the (first lines of) the framebuffer may become > > > > > > > corrupted once we actually start using the command-streamers, which > > > > > > > is still very much not wanted. > > > > > > > > > > > > > > In practice Xorg or Wayland will likely have setup another framebuffer > > > > > > > by the time the command-streamers will start to get used. > > > > > > > > > > > > > > Alternatively we could start with inheriting the BIOS framebuffer > > > > > > > (as my patch allows) so that we don't get the flicker and then soon > > > > > > > afterwards atomically transit to a new framebuffer (which should > > > > > > > contain a copy of the BIOS fb contents) at a different location. > > > > > > > > > > > > What I suggested long ago was to copy just the first page and adjust the > > > > > > sg list. But I'm not sure if our stolen gem code would be happy with an > > > > > > sg list with two entries instead of one. > > > > > > > > > > But that would still require an atomic-modeset to install the new sg-list, > > > > > right? > > > > > > > > Perhaps not. Not sure if the pte update would be atomic enough to just > > > > change it underneath the display engine without ill effects, and then > > > > do the equivalent of a page flip to invalidate the TLBs. > > > > > > > > > Then we might just as well simply alloc a new fb and copy the > > > > > contents over, or are you worried that with say a 4k fb that takes too > > > > > much time? FWIW I can see how the single memcpy this involves will take > > > > > some time, but I don't take it will take so long as to be a problem. > > > > > > > > Mainly just a question of keeping it in stolen. > > > > > > Ah I see. > > > > > > > Assuming we want to keep > > > > things in stolen, which is a matter of some debate as FBC needs stolen > > > > and people might not be happy if it's all taken up by fbdev. > > > > > > > > > > > > > > Anyways I could use some help with implementing either solution as I'm > > > > > not familiar with the involved parts of the code. I will happily test > > > > > a patch for this. Keep in mind that for this to work my original patch > > > > > will also be necessary so that the initial takeover of the firmware > > > > > fb will work. > > > > > > > > I guess the trickiest part would be getting both the old and new > > > > location of the page mapped in the ggtt at the same time. Sadly you're > > > > not allowed to access stolen directly. So I suppose this part would > > > > involve some fairly low level frobbing of the ggtt ptes and a > > > > manual ioremap() of the matching ranges of the aperture. > > > > > > Hmm, you're talking about what needs to be done to copy the contents here, > > > right? > > > > Yeah. > > So thinking more about this, for the old / BIOS framebuffer there > already is a mapping setup for efifb use and for the new one we > should also already set up a mapping when we create it, so I think > we can really just do a memcpy here after creating a new framebuffer? > > Anyways I will run the tests Daniel asked me to run first. Yeah let's not invent solutions for problems we might not even have :-) -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx