On Thu, Apr 5, 2018 at 3:31 PM, Hans de Goede <hdegoede@xxxxxxxxxx> wrote: > Hi, > > > On 05-04-18 15:26, Daniel Vetter wrote: >> >> On Thu, Apr 5, 2018 at 1:47 PM, Hans de Goede <hdegoede@xxxxxxxxxx> wrote: >>> >>> Hi, >>> >>> >>> On 05-04-18 09:14, Daniel Vetter wrote: >>>> >>>> >>>> On Fri, Mar 30, 2018 at 02:27:15PM +0200, Hans de Goede wrote: >>>>> >>>>> >>>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was >>>>> skipping the first 4k by passing 4096 as start of the address range >>>>> passed >>>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try >>>>> and >>>>> reserve the firmware framebuffer so that we can inherit it would always >>>>> fail, as the firmware framebuffer starts at address 0. >>>>> >>>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on >>>>> everything >= gen8") says in its commit message: "This is confirmed to >>>>> fix >>>>> Skylake screen flickering issues (probably caused by the fact that we >>>>> initialized a ring in the first page of stolen, but I didn't 100% >>>>> confirm >>>>> this theory)." >>>>> >>>>> Which suggests that it is safe to use the first page for a linear >>>>> framebuffer as the firmware is doing. >>>>> >>>>> This commit always passes 0 as start to drm_mm_init() and works around >>>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range() >>>>> by insuring the start address passed by to >>>>> drm_mm_insert_node_in_range() >>>>> is always 4k or more. All entry points to i915_gem_stolen.c go through >>>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated >>>>> objects such as ring-buffers will not be allocated in the first 4k. >>>>> >>>>> The one exception is i915_gem_object_create_stolen_for_preallocated() >>>>> which directly calls drm_mm_reserve_node() which now will be able to >>>>> use the first 4k. >>>>> >>>>> This fixes the i915 driver no longer being able to inherit the firmware >>>>> framebuffer on gen8+, which fixes the video output changing from the >>>>> vendor logo to a black screen as soon as the i915 driver is loaded >>>>> (on systems without fbcon). >>>>> >>>>> Signed-off-by: Hans de Goede <hdegoede@xxxxxxxxxx> >>>> >>>> >>>> >>>> I think this is worth a shot. The only explanation I can think of why >>>> the >>>> GOP could get away with this and still follow the w/a is if it doesn't >>>> have a 1:1 mapping between GGTT and stolen. >>> >>> >>> >>> My guess is that the GOP does not care about the w/a because the w/a >>> states that the command-streamers sometimes do a spurious flush (write) >>> to the first page, and the command-streamers are not used until much >>> later, so the GOP is probably ignoring the w/a all together. >>> >>> At least that is my theory. >> >> >> Atm we do not know whether the GOP actually implements the wa or not. >> That it doesn't care is just a conjecture, but we have no proof. On >> previous platforms the 1:1 mapping did hold, but there's no >> fundamental reason why it sitll does. >> >>>> Atm we hardcode that >>>> assumption in intel_alloc_initial_plane_obj by passing the base_aligned >>>> as >>>> both the stolen_offset and the gtt_offset (but it's only the gtt_offset >>>> really). And since we're not re-writing the ptes it's not noticeable. >>>> >>>> I think to decide whether this is the right approach we should >>>> double-check whether that 1:1 assumption really holds true: Either read >>>> back the ggtt ptes and check their addresses (but iirc on some platforms >>>> their write-only, readback doesn't work), or we also rewrite the ptes >>>> again for preallocated stuff, like when binding a normal object into the >>>> gtt. If either of these approaches confirms that those affected gen8+ >>>> machines still use the 1:1 mapping, then I'm happy to put my r-b on this >>>> patch. If not, well then we at least know what to fix: We need to read >>>> out >>>> the real stolen_offset, instead of making assumptions. >>> >>> >>> >>> I'm happy to give this a try on one or more affected machines, I can e.g. >>> try this on both a skylake desktop and a cherry-trail based tablet. >>> >>> But I'm clueless about the whole PTE stuff, so I'm going to need someone >>> to provide me with a patch following either approach. If readback of the >>> PTEs works on skylake / cherrytrail I guess that would be the best way. >>> >>> Note to test this I'm currently booting the kernel with the machine's >>> UEFI vendor logo still being displayed when efifb kicks in. I've added: >>> "fbcon=vc:2-62" to the kernel commandline to make fbcon not bind to >>> VC0 / VC1, so that the framebuffer contents stays intact, combined with >>> the patch we are discussing now, this makes the vendor logo stay on >>> the screen all the way till GDM loads, which looks rather nice actually >>> :) >>> >>> And on shutdown we fall back to the original framebuffer and the vendor >>> logo is back again. I cannot see any corruption in the display at either >>> boot or shutdown time. >> >> >> It shouldn't matter whether efifb takes over or not, it's still the >> GOP's framebuffer we're taking over. Different content for sure, logo >> is gone, but we only care about which pages we're using. > > > Right, I only mentioned this to explain that I'm not seeing any > (visible) corruption. > >> Wrt the code: >> - (Re)binding happens by calling i915_vma_bind, if you put a call to >> that at the end of the stolen_preallocated functions you should get >> that. Ofc needs the logo still there so you can see it jump/get >> mangled. >> - GGTT PTEs for gen8+ are written through e.g. gen8_ggttt_insert_page >> or gen8_ggtt_insert_entries (which takes the vma). Since we only care >> about the first entry a readq(dev_priv->ggtt->gsm) is all we really >> need. If it's all 0 then it's not working (since at least >> _PAGE_PRESENT should be set, plus the physical address of a page in >> stolen memory). > > > Ok, I will try to give this a shot, probably not before coming > Monday though. For the 2nd expirement we ofc also need the base physical address of stolen. /proc/iomem should dump that already, but if it's not there pls add a printk for that too. It's dev_priv->dsm.start, see i915_pages_create_for_stolen. >> I can do some patches for each of those, but a bit swamped. Also no >> gen8 handy for testing I think to make sure it works, so would take a >> few weeks probably. > > > I'll give this a shot myself first and attach the patch when I reply > with the testing results, so that you can verify that the patch works > as expected. Thanks a lot. I'll be travelling a bit later on next week, but I'll try to take a look. Cheers, Daniel > > Regards, > > Hans > > > >> >> Cheers, Daniel >>> >>> >>> >>>> >>>> Cheers, Daniel >>>>> >>>>> >>>>> --- >>>>> drivers/gpu/drm/i915/i915_gem_stolen.c | 15 ++++++--------- >>>>> 1 file changed, 6 insertions(+), 9 deletions(-) >>>>> >>>>> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c >>>>> b/drivers/gpu/drm/i915/i915_gem_stolen.c >>>>> index af915d041281..ad949cc30928 100644 >>>>> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c >>>>> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c >>>>> @@ -51,6 +51,10 @@ int i915_gem_stolen_insert_node_in_range(struct >>>>> drm_i915_private *dev_priv, >>>>> if (!drm_mm_initialized(&dev_priv->mm.stolen)) >>>>> return -ENODEV; >>>>> + /* WaSkipStolenMemoryFirstPage:bdw+ */ >>>>> + if (INTEL_GEN(dev_priv) >= 8 && start < 4096) >>>>> + start = 4096; >>>>> + >>>>> mutex_lock(&dev_priv->mm.stolen_lock); >>>>> ret = drm_mm_insert_node_in_range(&dev_priv->mm.stolen, node, >>>>> size, alignment, 0, >>>>> @@ -343,7 +347,6 @@ int i915_gem_init_stolen(struct drm_i915_private >>>>> *dev_priv) >>>>> { >>>>> resource_size_t reserved_base, stolen_top; >>>>> resource_size_t reserved_total, reserved_size; >>>>> - resource_size_t stolen_usable_start; >>>>> mutex_init(&dev_priv->mm.stolen_lock); >>>>> @@ -435,17 +438,11 @@ int i915_gem_init_stolen(struct >>>>> drm_i915_private >>>>> *dev_priv) >>>>> (u64)resource_size(&dev_priv->dsm) >> 10, >>>>> ((u64)resource_size(&dev_priv->dsm) - >>>>> reserved_total) >> 10); >>>>> - stolen_usable_start = 0; >>>>> - /* WaSkipStolenMemoryFirstPage:bdw+ */ >>>>> - if (INTEL_GEN(dev_priv) >= 8) >>>>> - stolen_usable_start = 4096; >>>>> - >>>>> dev_priv->stolen_usable_size = >>>>> - resource_size(&dev_priv->dsm) - reserved_total - >>>>> stolen_usable_start; >>>>> + resource_size(&dev_priv->dsm) - reserved_total; >>>>> /* Basic memrange allocator for stolen space. */ >>>>> - drm_mm_init(&dev_priv->mm.stolen, stolen_usable_start, >>>>> - dev_priv->stolen_usable_size); >>>>> + drm_mm_init(&dev_priv->mm.stolen, 0, >>>>> dev_priv->stolen_usable_size); >>>>> return 0; >>>>> } >>>>> -- >>>>> 2.17.0.rc2 >>>>> >>>>> _______________________________________________ >>>>> Intel-gfx mailing list >>>>> Intel-gfx@xxxxxxxxxxxxxxxxxxxxx >>>>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx >>>> >>>> >>>> >>> >> >> >> > -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx