Re: [PATCH] RFC: omapdrm DRM/KMS driver for TI OMAP platforms

Rob Clark <rob.clark@xxxxxxxxxx> · Wed, 14 Sep 2011 14:42:29 -0500

On Wed, Sep 14, 2011 at 12:44 AM, Inki Dae <inki.dae@xxxxxxxxxxx> wrote:
> Hello Rob.
> Sorry for being late. here was a national holiday.
>
>
>> -----Original Message-----
>> From: robdclark@xxxxxxxxx [mailto:robdclark@xxxxxxxxx] On Behalf Of Rob
>> Clark
>> Sent: Thursday, September 08, 2011 3:44 AM
>> To: Inki Dae
>> Cc: linaro-dev@xxxxxxxxxxxxxxxx; dri-devel@xxxxxxxxxxxxxxxxxxxxx
>> Subject: Re: [PATCH] RFC: omapdrm DRM/KMS driver for TI OMAP platforms
>>
>> On Wed, Sep 7, 2011 at 1:00 AM, Inki Dae <inki.dae@xxxxxxxxxxx> wrote:
>> > Hello, Rob.
>> >
>> [snip]
>> >> >> +static void page_flip_cb(void *arg)
>> >> >> +{
>> >> >> +     struct drm_crtc *crtc = arg;
>> >> >> +     struct drm_device *dev = crtc->dev;
>> >> >> +     struct omap_crtc *omap_crtc = to_omap_crtc(crtc);
>> >> >> +     struct drm_pending_vblank_event *event = omap_crtc->event;
>> >> >> +     struct timeval now;
>> >> >> +     unsigned long flags;
>> >> >> +
>> >> >> +     WARN_ON(!event);
>> >> >> +
>> >> >> +     omap_crtc->event = NULL;
>> >> >> +
>> >> >> +     update_scanout(crtc);
>> >> >> +     commit(crtc);
>> >> >> +
>> >> >> +     /* wakeup userspace */
>> >> >> +     // TODO: this should happen *after* flip.. somehow..
>> >> >> +     if (event) {
>> >> >> +             spin_lock_irqsave(&dev->event_lock, flags);
>> >> >> +             event->event.sequence =
>> >> >> +                             drm_vblank_count_and_time(dev,
>> >> > omap_crtc->id,
>> >> >> &now);
>> >> >> +             event->event.tv_sec = now.tv_sec;
>> >> >> +             event->event.tv_usec = now.tv_usec;
>> >> >> +             list_add_tail(&event->base.link,
>> >> >> +                             &event->base.file_priv->event_list);
>> >> >> +
>> > wake_up_interruptible(&event->base.file_priv->event_wait);
>> >> >> +             spin_unlock_irqrestore(&dev->event_lock, flags);
>> >> >> +     }
>> >> >
>> >> > How about moving codes above into interrupt handler for vblank?
>> >> >  maybe there
>> >>
>> >> I should mention a couple of things:
>> >>
>> >> 1) drm vblank stuff isn't really used currently.. it is actually DSS2
>> >> layer that is registering for the display related interrupt(s).  I'm
>> >> not really sure how to handle this best.  I suppose the DRM driver
>> >> could *also* register for these interrupts, but that seems a bit odd..
>> >>
>> >
>> > DRM Framework supports only one interrupt handler. this issue should be
>> > resolved. and currently I made our driver to use its own request_irq,
>> not
>> > DRM Framework side. we only sets drm_device->irq_enabled to 1 and
>> interrupt
>> > handler is registered at display controller or hdmi driver respectively.
>> but
>> > I'm not sure that this way is best so I will look over more. Anyway
>> current
>> > drm framework should be fixed to be considered with multiple irq.
>>
>> Or perhaps even callbacks (some other driver handling the irq's directly)?
>>
>
> Yes.
>
>> >> Also, I guess it is also worth mentioning.. when it comes to vblank,
>> >> there are two fundamentally different sorts of displays we deal with.
>> >> Something like DVI/HDMI/etc which need constant refreshing.  In these
>> >> cases we constantly scan-out the buffer until the next page
>> >> flip+vsync.  And smart panels, where they get scanned out once and
>> >> then DSS is no longer reading the scanout buffer until we manually
>> >> trigger an update.
>> >>
>> >
>> > Is the Smart panel CPU interface based lcd panel that has its own
>> > framebuffer internally.?
>>
>> yes
>>
>
> CPU Interface based lcd panel needs manual updating once all contents of
> internal framebuffer are transferred to panel. thus, the Smart Panel(cpu
> interface based lcd panel) has advantage of power consumption with
> user-requested update. this means application should trigger display
> controller to be transferred to lcd panel at framedone signal of lcd panel
> because only the application is aware of this moment. this way would be
> getting confused, especially on general os, such as linux system, based
> platform so in case of using Smart Panel, I think we should make the panel
> driver to be worked like RGB interface.
>
> - framedone interrupt handler of the panel should be placed in the driver.
> - at booting time, display controller driver triggers just one time
> - the panel driver triggers every time framedone interrupt handler is
> called.
>
> and the order above would be repeated like RGB interface. thus I think we
> could deal with the Smart Panel like RGB way.

fwiw, I saw this:

http://www.engadget.com/2011/09/14/idts-power-saving-panel-self-refresh-tech-coming-to-laptops-ul/

seems like similar concept coming to PC's..

>> [snip]
>> >>
>> >> The main reason for the page-flip cb is actually not vsync
>> >> synchronization, but rather synchronizing with other hw blocks that
>> >> might be rendering to the buffer..  the page flip can be submitted
>> >> from userspace while some other hw block (3d, 2d, etc) is still
>> >> DMA'ing to the buffer.  The sync-obj is intended to give a way to
>> >> defer the (in this case) page flip until other hw blocks are done
>> >> writing to the buffer.
>> >
>> > I thought page-flip is used to change buffer register value of display
>> > controller into another one like the Pan Display feature of linux
>> > framebuffer. actually page flip interface of libdrm library,
>> > page_flip_handler, is called with new framebuffer id that has its own
>> > buffer. and then the controller using current crtc would be set to
>> buffer
>> > address of new framebuffer. and I understood that for other blocks such
>> as
>> > 2d/3d accelerators, rendering synchronization you already mentioned
>> above,
>> > is not depend on page flip action. It’s a sequence with my understanding
>> > below.
>> >
>> > 1. allocate a new buffer through drm interface such as DUMB_* or
>> SPECIFIC
>> > IOCTLs.
>> > 2. render something on this buffer through DRM interfaces that handle
>> 2D/3D
>> > Accelerators and also it would use their own interfaces for
>> synchronization.
>> > 3. allocate a new framebuffer and attach the buffer to this framebuffer.
>> > 4. call page flip to change current framebuffer to the framebuffer above
>> at
>> > vblank moment. at this time, buffer address of the framebuffer would be
>> set
>> > to a controller.
>> >
>> > finally, we can see some image rendered on diplay. thus, I think page
>> flip
>> > and rendering synchronization have no any relationship. if I missed any
>> > points, please give me your comments.
>>
>> Well, I guess it depends on whether synchronization of the 3d/2d
>> render is done in userspace, or whether you just submit all the render
>> commands and then immediately submit the page-flip.
>>
>> The actual page-flip should be a fairly light operation (writing a few
>> registers) and could be done directly from some sort of
>> render-complete interrupt from 2d/3d core.
>>
>
> In case of 3d rendering, I understood like below.
> - 3d application requests 3d rendering through OpenGL APIs
> - the requests would be packed as command in a command buffer kernel driver
> is aware of.
> - kernel driver gets the command buffer and unpacks it.
> - kernel driver renders geometries using the commands unpacked on the
> specific memory address(destination buffer).
> - while rendered, this application would be blocked.
> - if render-complete, the application requests swap buffer to change current
> buffer to rendered buffer.

well, really not kernel driver rendering or dealing with the command
stream.  But instead the 3d core / gpu.. which you could think of as a
specialized cpu running asynchronously..

Part of the command-stream that the gpu is chewing thru is a command
to flip (or perhaps generate an interrupt back to cpu to flip..
depending on how tightly integrated your display and gpu are).

So when glSwapBuffers() returns and application proceeds to start
working on frame n+1, the gpu might still be rendering frame n and the
display still showing frame n-1.  So the whole process is pipelined.
(Or at least can be.)  You'd normally not want to block application
unnecessarily, because this is a sort of pipeline stall.

> If vsync feature was enabled, it would wait for vsync at last step above
> before swapping buffers(page flip). thus I thought page-flip should be
> requested by application. but in your case, it appears that page flip could
> be requested at render context of 3d driver(kernel side) also. I'm not sure
> that this is a reasonable way. if there is any missing points, please give
> me your comments.

For "flipping" on a manual update display, you only need to wait if
you haven't yet got the frame-done from previous flip.  Frame-done irq
for manual update display is also not exactly like a vsync irq..
because the gpu can start rendering to that buffer again without the
user seeing tearing.. you can think of the framebuffer in the display
as an extra buffer for your flipchain.  Maybe that is just an
implementation detail.. not sure if it matters from a framework
perspective.

Other thing to mention.. any sort of time calculations based on the
assumption that vsync comes at a certain fixed rate (for a given set
of monitor timings) is bogus.

BR,
-R

>> [snip]
>> >> >> +int omap_framebuffer_get_buffer(struct drm_framebuffer *fb, int x,
>> int
>> >> y,
>> >> >> +             void **vaddr, unsigned long *paddr, int *screen_width)
>> >> >> +{
>> >> >> +     struct omap_framebuffer *omap_fb = to_omap_framebuffer(fb);
>> >> >> +     int bpp = fb->depth / 8;
>> >> >> +     unsigned long offset;
>> >> >> +
>> >> >> +     offset = (x * bpp) + (y * fb->pitch);
>> >> >> +
>> >> >> +     if (vaddr) {
>> >> >> +             if (!omap_fb->vaddr) {
>> >> >> +                     omap_fb->vaddr = ioremap_wc(omap_fb->paddr,
>> >> omap_fb-
>> >> >> >size);
>> >> >> +             }
>> >> >> +             *vaddr = omap_fb->vaddr + offset;
>> >> >> +     }
>> >> >
>> >> > Did you use ioremap_wc() to map physical memory(reserved memory) that
>> >> kernel
>> >> > doesn't aware of to kernel space.? if not so, the memory region
>> mapped
>> >> to
>> >> > kernel space as 1:1,  this way would be faced with duplicated cache
>> >> > attribute issue mentioned by Russel King before. 1:1 mapping region
>> is
>> >> > mapped to kernel space with cachable attribute.
>> >>
>> >> Today the carveout memory does not have a kernel virtual mapping.  So
>> >> we are ok.  And I think this should still be the case w/ CMA.
>> >>
>
> "the carveout memory does not have a kernel virtual mapping." -> this means
> carveout memory has no any page mapping such as 1 level page or 2 level page
> mapping.?
>
>> >
>> > I wonder what is the carveout and sacnout memory. carvout is physically
>> non
>> > continuous memory and scanout is physically continuous memory?
>>
>> today, scanout buffers are required to be contiguous.  There is the
>> possibility to remap non-contiguous buffers in TILER/DMM (which you
>> can think of as a sort of IOMMU).  Although adding support for this is
>> still on my TODO list.
>>
>> So today, carveout is used to allocate scanout buffers to ensure
>> physically contiguous.
>>
>> [snip]
>> >> >
>> >> > Remove device
>  specific functions from linux/include/linux/omap_drm.h
>> and
>> >> > move them to driver folder. I was told that include/linux/*.h file
>> >> should
>> >> > contain only interfaces for user process from Dave.
>> >>
>> >>
>> >> fwiw, the functions in this header are already only ones used by other
>> >> kernel drivers.. everything else internal to omapdrm is already in
>> >> omap_drv.h.  I'll remove these for now (see discussion about plugin
>> >> API), although when it is re-introduced they need to be in a header
>> >> accessible from other drivers.  Although should maybe still be split
>> >> from what is needed by userspace?  (Something like omapdrm_plugin.h?)
>> >>
>> >>
>> >
>> > I'm afraid I don't understand what you mean, "Although should maybe
>> still be
>> > split from what is needed by userspace?  (Something like
>> > omapdrm_plugin.h?)", could you please clarify that again?
>>
>> I mean one header file for userspace facing API, and a separate one
>> for kernel facing API used by other drivers/modules..
>>
>
> Yes, At least, I think the header file in include/drm/ should include only
> userspace APIs and structures.
>
>> (In the first pass, that second header would not exist because I'll
>> remove the plugin API until we have one w/ open userspace)
>>
>> BR,
>> -R
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@xxxxxxxxxxxxxxxxxxxxx
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel