On Thu, Feb 08, 2018 at 11:53:30AM +0200, Oleksandr Andrushchenko wrote: > Hello, Ville! > > On 02/06/2018 06:04 PM, Ville Syrjälä wrote: > > On Tue, Feb 06, 2018 at 11:59:37AM +0200, Oleksandr Andrushchenko wrote: > >> Hello, Ville! > >> > >> Thank you very much for such a comprehensive answer. > >> > >> Please see inline > >> > >> > >> On 02/05/2018 06:47 PM, Ville Syrjälä wrote: > >>> On Mon, Feb 05, 2018 at 06:03:25PM +0200, Oleksandr Andrushchenko wrote: > >>>> Hello, > >>>> > >>>> > >>>> I have a DRM driver which implements display protocol for Xen [1] > >>>> and this protocol has a dedicated XENDISPL_OP_PG_FLIP request which > >>>> tells the other end that some display buffer needs to be displayed, > >>>> e.g. it is issued by the frontend DRM driver to the corresponding > >>>> backend when the former wants to display a buffer. > >>>> Two notes: > >>>> 1. Communication between two remote parties can obviously fail. > >>>> 2. At the moment only primary plane is supported, so we can think of > >>>> the display buffer as of CRTC's primary fb. > >>>> > >>>> There are two APIs available for user-space to initiate a page-flip > >>>> (please correct me if I am wrong here): legacy via DRM_IOCTL_MODE_PAGE_FLIP > >>>> and more recent via DRM_IOCTL_MODE_ATOMIC. My current implementation > >>>> (which is 4.9 based) uses drm_crtc_funcs.page_flip callback to send > >>>> XENDISPL_OP_PG_FLIP request to the backend, so if communication fails > >>>> I can return an error code, so the DRM core knows that page flip > >>>> cannot be done. > >>>> > >>>> But now I am about to enable page flipping via DRM_IOCTL_MODE_ATOMIC for > >>>> which this happens without DRM_IOCTL_MODE_PAGE_FLIP, but via .atomic_check + > >>>> .atomic_flush callbacks. > >>>> > >>>> The solution I have in mind is that I move the XENDISPL_OP_PG_FLIP request > >>>> to the .atomic_flush callback which seems to be the right place to serve > >>>> both DRM_IOCTL_MODE_PAGE_FLIP and DRM_IOCTL_MODE_ATOMIC (is this?). > >>>> > >>>> The questions I have with this are: > >>>> > >>>> 1. What is the framebuffer I can send to the backend? > >>>> I assume I can use crtc->primary->fb from > >>>> drm_crtc_helper_funcs.atomic_flush, > >>>> is this assumption correct? > >>> Planes are explicit in the atomic world, so you should just deal with > >>> the plane if and when it's part of the drm_atomic_state. Look at eg. > >>> drm_atomic_helper_commit_planes() how to figure out what's in the > >>> state. > >> Yes, this is clear. The question was about the frame buffer > >> I get in drm_crtc_helper_funcs.atomic_flush which only has > >> old_crtc_state, so I assumed I can use crtc->primary->fb there. > >> Or you mean I have to dig into old_crtc_state->state to find > >> out if there is a plane and if it has a frambuffer and if so > >> use it instead of crtc->primary->fb? > > You will get the plane explicitly as part of the drm_atomic_state. > > Normally you shouldn't need to go find it via other means. > > > > Oh, and never use the plane->fb etc. pointers in an atomic driver. > > Those are for legacy purposes only. Atomic drivers should only ever > > deal with proper state objects. > I am using fb from corresponding state now, thank you > > > >>> And I guess you're already planning on using that helper since > >>> you mention .atomic_flush(). > >> Not directly, but via drm_mode_config_funcs.atomic_commit > > .atomic_commit() is a hook the driver has to provide. Most drivers use > > the helper for it, which in turn requires the driver to provide other > > hooks such as .atomic_flush(). That is what you appear to be doing. > you are correct > >>> One should really think of the drm_atomic_state as more of a transaction > >>> rather than as a state (since not all objects need be part of it). > >> Thank you > >>>> 2. Is the check (either to send or not) for crtc->primary->fb != NULL > >>>> enough? > >>>> I assume there are other cases when .atomic_flush gets called, > >>>> so is there a way I can filter out unneeded cases? E.g. those which > >>>> are not for page flipping? > >>> Each object taking part in the transaction will have an associated > >>> new state and old state. > >> As I replied above I only have old state in .atomic_flush > > Oh right. That way of passing the old state only dates back to earlier > > days of atomic. To find the new state what you should do these days > > is something like: > > > > foo(struct drm_crtc *crtc, struct drm_crtc_state *old_state) > > { > > struct drm_atomic_state *state = old_state->state; > > struct drm_crtc_state *new_state = > > drm_atomic_get_new_crtc_state(state, crtc); > > ... > > > > The old way was to use the crtc->state pointer as either the old > > or new state depending on where you are in the commit sequence. > > But that wasn't very good so Maarten changed things so that we > > now have explicit old and new states for each object tracked in > > the drm_atomic_state. > > > > I think what we should really do is change most of the hooks to > > pass crtc+drm_atomic_state instead, and let each function grab > > the old and/or new crtc state from the drm_atomic_state as needed. > > I believe that would avoid confusing people with just the old or > > new state. > That would be great > >>> You can compare the two to figure out what > >>> changed, and in addition there may already some flags in the base > >>> class for the state that indicate what may have changed (eg. > >>> drm_crtc_state::mode_changed etc.). Such flags may be set by the > >>> core to help figure out what needs doing. > >> Yes, thank you > >>>> 3. If communication with the backend fails there is no way for me > >>>> to tell the DRM core about this as .atomic_flush is called after > >>>> "the point of no return". Do you think I can remember the error > >>>> code and report it next time .atomic_check is called? So, other page > >>>> flips will not run on broken connection, for example. > >>> Do you know when it might fail? > >> Not really, this is a software protocol to talk from > >> the frontend para-virtual DRM driver to its backend > >> in other Xen domain > > I see. Well, without evidence to the contrary I'd probably just assume > > it'll never fail. > I am not sure I can guarantee this > > That avoids complicating the code with potentially > > useless logic. > Unfortunately, I'll have to put something in > > Hammering it with something like igt for a while might > > serve as a good way to test that assumption. Not sure how many tests in > > igt currently run on non-i915 though. > The first test suite I tried was IGT indeed, but unfortunately > it is coupled with i915 too much, so only very basic tests > could be run. Then I found [1] which can be easily extended > without much efforts, so I stick to fork of it (hope to contribute > there later, for example, by adding ION tests) > > > >>> If so you should implement the appropriate > >>> checks in .atomic_check() etc. to try and make sure it never happens > >>> (barring a total hardware failure for example). > >>> > >>> Generally what we (i915) do is try to check everything up front, but if > >>> an unexpected error does happen later we just muddle through and log an > >>> error. > >>> > >>> For us I think the most common late failure is DP link training failure. > >>> That we can't fully check up front since it depends on external factors: > >>> sink, dongles, cabling, etc. For those we added a new connector property > >>> to signal to userspace that the link is having issues, allowing > >>> userspace to reconfigure things such that we lower the link bandwidth > >>> requirements. > >> I cannot do that this way, because the driver has to run > >> seamlessly for user-space, no specific utilities are > >> expected to run. > > There must be something running or you never get anything on the screen. > > > Yes, sorry for not being precise: I meant that nothing > driver specific, e.g. which knows internals of the driver. > >>> The link_status mechanism could perhaps be used to to work around other > >>> "late errors". But in general if you want to somehow fix that error you > >>> have to know what caused it, no? > >> That is correct, so I will print an error message on > >> page flip error so user has at list a clue on what's > >> wrong > >>> So if you just get a random error for > >>> seemingly no reason there's very little you can do apart from blindly > >>> retrying and logging an error. For the DP case the fallback mechanism is > >>> pretty clear: reduce link rate and/or number of lanes. > >>> > >>> As for signalling the error in the next ioctl call, that could be done > >>> I suppose. > >> I tried that approach and it seems to work. > >>> But again the question is how would userspace know what > >>> (if anything) it can do to fix the problem? > >> Well, this would be seen as just an error to user-space > >> and unfortunately if it is not prepared to deal with then it will > >> close. Not sure I can do anything smart about it > >>>> 4. As per my understanding I cannot put XENDISPL_OP_PG_FLIP request > >>>> into .atomic_check as the check doesn't guarantee that .atomic_flush > >>>> will follow. Is this correct or there is a more neat solution exists? > >>> Thou shalt not touch the hardware until .atomic_commit(). Note that > >>> .atomic_commit() itself is still allowed to fail, but only if you > >>> can fail atomically. > >>> > >>> The .atomic_flush() & co. helper stuff is designed in a way that > >>> expects no failures at that late stage of the commit. If that > >>> doesn't suit your hardware design then you may opt to not use the > >>> helper. > >>> > >>> Also note that non-blocking operations can make this even more difficult > >>> because in that case even .atomic_commit() doesn't generally touch > >>> the hardware and instead that task is delegated to eg. a work queue. > >>> So by the time the hardware indicates an error the ioctl has long since > >>> returned to userspace. > >>> > >> This is my case as I send a page flip request to the backend > >> and later in time receive the corresponding response. > >> > >> Do you mind looking at my current WIP implementation of > >> atomic commit [1], [2]? This work is done as a preparation for > >> upstreaming the driver and is still in progress though. > > Usually one would handle page flips from the plane's .atomic_update() > > hook instead of the crtc's .atomic_flush(). But if you only have the one > > plane then this could work. But you should get at the plane fb via > > drm_atomic_get_new_plane_state() etc. > > > > Generally looks like you need some more work on the plane .atomic_check() > > function. Using drm_atomic_helper_check_plane_state() should get you > > most of the way there I'd imagine. Not sure if disabling the plane > > independently of the crtc makes any sense, but if not you should > > look at drm_simple_kms_helper.c for an example. Hmm. Not sure if you > > couldn't just use drm_simple_kms_helper.c yourself actually. > Great! As I am moving away from legacy stuff this is a > perfect fit now. So, I will base on drm_simple_kms_helper. > > BTW, it has a small issue which can be easily solved in the > DRM core I believe: if your driver needs to work with vblanks, > then you have to provide drm_driver.enable_vblank/disable_vblank > because drm_simple_kms_helper.drm_crtc_funcs do not provide any. > The problem here is that drm_driver.enable_vblank/disable_vblank > are most probably dummy functions and what is more they are marked > as deprecated. Hmm. Oh, I think someone just wanted people to move to using the corresponding hooks in the crtc funcs. I'm not sure how people do things when they don't have vblank interrupts, but I think in general if you don't initialize vblank support those hooks shouldn't get called. -- Ville Syrjälä Intel OTC _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel